[M3devel] A rant about llvm and debug information
Rodney M. Bates
rodney_bates at lcwb.coop
Sun Mar 13 17:29:12 CET 2016
I want a better m3gdb, i.e., an interactive debugger for an executing program.
Even with excellent type safety, there are plenty of algorithmic bugs that
are not type violations. And with large bodies of source code and large sets
of data, a debugger is just so much faster than anything else.
As an example, I very recently fixed a long-standing compiler bug that Peter
reported. The compiler crashed after first correctly reporting a static
error on the code being compiled. Figuring out what was wrong and how
to fix all the affected cases without breaking any others required looking
at several spots in the compiler, all unfamiliar to me. Such as it is,
the current m3gdb helped immensely. For example, in many places the data
structures showed only the compiler's internal integer-mapped representation
of an identifier, but I could quickly see the actual identifier with:
m3gdb> print M3ID.ToText(436)
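
For readers unfamiliar with integer-mapped identifiers, here is a minimal sketch of the idea in Python. It is not the compiler's actual M3ID code; the class IdTable and its methods add and to_text are hypothetical names chosen only for illustration. Each distinct identifier gets a small integer, and the data structures store only that integer, which is why a reverse lookup is so handy in a debugger.

```python
class IdTable:
    """Toy identifier table (hypothetical, not the real M3ID module)."""

    def __init__(self):
        self._ids = {}    # identifier text -> integer ID
        self._names = []  # integer ID -> identifier text

    def add(self, name: str) -> int:
        """Return the ID for name, assigning the next free one if it is new."""
        if name not in self._ids:
            self._ids[name] = len(self._names)
            self._names.append(name)
        return self._ids[name]

    def to_text(self, ident: int) -> str:
        """Map an integer ID back to the identifier text it stands for."""
        return self._names[ident]


table = IdTable()
count_id = table.add("count")
index_id = table.add("index")

# Data structures hold only the integers; to_text recovers the name.
print(count_id, index_id, table.to_text(index_id))
```

Calling a lookup procedure of this kind from the debugger prompt is what the print command above does against the live compiler.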
But I have many pages of todo lists of fixes and improvements to make to
m3gdb. I had been thinking this would all be so much easier with Dwarf
debug info.
On 03/12/2016 03:02 PM, Darko Volaric wrote:
> Rodney, can you tell me what your motivation for this sort of debugging support is? Is it for post-mortem debugging, multi-language, external tool support or something else? M3 is so safe I've always envisaged an integrated (compiled-in) debugging tool which is in effect a call logger with some extras, I guess because that's my style of debugging. I'm wondering if there's another angle on what you want to achieve.
>
> - Darko
>
>
> On Fri, Mar 11, 2016 at 9:57 PM, Rodney M. Bates <rodney_bates at lcwb.coop> wrote:
>
> I have grown very disillusioned and discouraged about llvm. It does
> not seem to have lived up to a couple of its claims that were very
> important to what I am trying to do.
>
> The latest frustration is a recent discovery about its treatment of
> debug information and Dwarf. They say its internal information is
> loosely based on Dwarf, and the anecdotal things I had looked at in
> the past suggested it was isomorphic to Dwarf, with a different
> low-level data structure. But I now find that llvm handles only a
> very restricted subset of Dwarf.
>
> The decisive example is the subrange node. Dwarf3 defines 18
> attributes for its DIE for a subrange type. Llvm will only handle
> two, the lower and upper bounds. In particular, it will not even handle
> a base type. This will make a Modula-3 debugger completely useless
> for anything having subrange type.
>
> I can't imagine what would have led to such a decision. Is there any
> language that has subranges, but they all implicitly have the same
> base type? It certainly won't support any language's subranges that I
> know of. Llvm evidently doesn't have the commitment to multiple
> languages that Dwarf has.
>
> My main motive for wanting an llvm back end for Modula-3 has always
> been a better Modula-3 debugger. Dwarf is a debug information format
> vastly superior to the highly non-standard stabs we are now
> using. I had imagined using llvm would be the way to get it.
>
> At this point, it seems predictable that there is a lot more of
> Dwarf's very extensive multi-language support, yet to be discovered,
> that we need but that llvm will not pipe through. There is no
> question that we would have to modify llvm to get any decent
> debugging, even parity with current m3cc/m3gdb. I had believed we
> could avoid forking and modifying llvm, but apparently not so.
>
> Which leads to my second disillusionment. Llvm is constantly
> undergoing very rapid change, and if you need or want to track the
> changes, the claims about well-documented formats and interfaces are,
> well, at least exaggerated.
>
> Recently, in the llvm mailing list, a couple of others who maintain
> things outside the official llvm tree have been seconding this, saying
> that important APIs constantly undergo extensive revisions, with no
> explanation other than just revised header files one must diff. E.g.,
> there are no suggestions as to what should replace removed or altered
> functions and parameters. At least these posts give some confirmation that I am
> not just paranoid.
>
> Even if we could avoid actual forking by persuading llvm to
> incorporate changes for us, we would then have to constantly track the
> development head to get them. This in itself would entail so much
> time spent adapting that I, for one, could hardly find time for any
> functional progress.
>
> I have been through one round of updating bindings for DIBuilder, and
> that was a nightmare. Whether looking at diffs in llvm headers and
> adapting our existing bindings, or starting over from scratch with the
> revised llvm headers, it is extremely tedious and error-prone.
> Moreover, since there is no inter-language type checking, many picky
> little errors will only show up as runtime assertion failures,
> segfaults, hard-to-explain behavior, etc. And this all has to happen
> many times over before we could get a single debugger that would
> handle both languages, making diagnosis all the harder.
>
> I have put a lot of work into this, and Peter obviously has put in a
> lot more. But at this point, it looks to be far more productive to
> abandon llvm/Dwarf debugging and put the energy into improving m3gdb,
> using/further extending the existing stabs.
>
> Or possibly modifying m3cc to produce Dwarf, but that raises a
> different issue.
>
> --
> Rodney Bates
> rodney.m.bates at acm.org
--
Rodney Bates
rodney.m.bates at acm.org