[M3devel] LLVM backend?

Jay K jay.krell at cornell.edu
Fri Mar 29 17:09:09 CET 2013


This doesn't completely make sense to me.
But it is kind of what I expected and desire.


The debug format isn't the point.
parse.c should have no knowledge of dwarf vs. stabs vs. coff vs. xcoff
vs. codeview vs. vms vs. anything else.
It just happens that the authors found a private channel in stabs.
Ideally any debug format used by any C compiler can fully describe C,
and Modula-3.


The way it (any gcc frontend) is supposed to work is you describe
the types using a compiler-internal form, and that gets translated
into whatever the user requests that is supported.
The primary limit as to what gcc can do is probably the object file format.


Any backend without a private channel to the debugger, i.e. the C backend
as well as an LLVM backend, will have degraded debugging.


Perhaps a private channel can be found either way though.
Like, gcc does have this "tfile" thing where after compilation,
a separate tool goes over the .S file and makes changes.
Ideally we don't have intermediate .S files though.


Now, on Windows, for cdb/ntsd/windbg/kd, you can write debugger plugins.
.dlls loaded by the debugger that have full access to symbols.
They are often very helpful, and would surely help here.

I do wonder if it is worth changing TEXT for debuggability.
Like, say, to always be flat, no trees, and always nul terminated:
(actually I'd always write two zero bytes, in case of viewing the ascii as unicode)


#define TEXT_FLAGS_UNICODE (0x00000001)
#define TEXT_FLAGS_CONSTANT (0x00000002)


typedef struct {
    size_t flags;
    size_t length;
    union {
        char* ascii;
        wchar_t* unicode;
    } data;
} *TEXT;

 ? 

If we had merely that -- a more debugger/C-idiomatic representation
for TEXT, would m3gdb's advantages decline significantly?
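
As a rough sketch of the flat layout above, the following shows how such a TEXT might be built from a C string, with the two trailing zero bytes mentioned earlier. The helper names text_from_chars and text_length are hypothetical, the meaning of "length" (characters, excluding the terminators) is an assumption, and malloc stands in for the garbage-collected allocation a real runtime would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TEXT_FLAGS_UNICODE  (0x00000001)
#define TEXT_FLAGS_CONSTANT (0x00000002)

typedef struct {
    size_t flags;
    size_t length;          /* assumed: character count, excluding terminators */
    union {
        char* ascii;
        wchar_t* unicode;
    } data;
} *TEXT;

/* Hypothetical helper: build a flat, non-constant 8-bit TEXT from a C string.
   malloc is a stand-in for the collector's allocator. */
static TEXT text_from_chars(const char* s)
{
    size_t n;
    TEXT t;

    assert(s != NULL);
    n = strlen(s);

    t = (TEXT)malloc(sizeof(*t));
    if (t == NULL)
        return NULL;

    /* Two zero bytes, so the data still looks terminated if a debugger
       displays it as 16-bit characters. */
    t->data.ascii = (char*)malloc(n + 2);
    if (t->data.ascii == NULL) {
        free(t);
        return NULL;
    }
    memcpy(t->data.ascii, s, n);
    t->data.ascii[n] = 0;
    t->data.ascii[n + 1] = 0;

    t->flags = 0;           /* neither UNICODE nor CONSTANT */
    t->length = n;
    return t;
}

/* Length is a field read, with no tree walk. */
static size_t text_length(TEXT t)
{
    return t->length;
}
```

With this shape, a stock debugger can print t->data.ascii directly as a C string, which is the point of the proposal.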


I'd also be open to just plain typedef char* TEXT;
UTF-8 encoded. Text.Length would then be slow, and Text.Concat slower as well.
But ideally we'd store the length. The length could also be stored before the string.
On Windows, "BSTR"s work this way, i.e. it is viable and practical and in wide use.
Also, MFC/ATL CStringW/CStringA do this:
typedef struct {
  wchar_t* data;
} CStringW;


typedef struct {
  char* data;
} CStringA;

and then have an entire small struct before the data.
It holds, as I recall, at least the length and a reference count.
The strings are reference counted and copy-on-write.
It is a nice implementation, but it isn't quite relevant here, since
our TEXTs are immutable and garbage collected.
The point is though, putting a struct in front of a char*/wchar_t* is viable.
As long as the garbage collector doesn't get confused.
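
To illustrate the "struct in front of the characters" idea, here is a minimal sketch in which TEXT is a plain char* and a small header holding the length sits immediately before the character data. The names text_header, text_new, text_length and text_free are hypothetical; this is not the actual BSTR or CString layout, and malloc/free stand in for the garbage collector, which would have to understand that a TEXT points into the middle of an allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef char* TEXT;         /* points at the first byte of UTF-8 data */

/* Hypothetical header stored directly before the character data. */
typedef struct {
    size_t length;          /* byte count, excluding the terminating nul */
} text_header;

/* Allocate header and characters as one block; return a pointer
   to the characters, not to the header. */
static TEXT text_new(const char* s)
{
    size_t n;
    text_header* h;

    assert(s != NULL);
    n = strlen(s);

    h = (text_header*)malloc(sizeof(*h) + n + 1);
    if (h == NULL)
        return NULL;

    h->length = n;
    memcpy((char*)(h + 1), s, n + 1);   /* copies the nul as well */
    return (char*)(h + 1);
}

/* Constant-time length: step back from the data to the header. */
static size_t text_length(TEXT t)
{
    return ((const text_header*)t - 1)->length;
}

/* Free the whole block, starting from the header. */
static void text_free(TEXT t)
{
    free((text_header*)t - 1);
}
```

A debugger sees an ordinary nul-terminated char*, while the runtime still gets the length without scanning. Note that the stored length here is in bytes; a character count for UTF-8 would need a separate field or a scan.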



Anyway, I think I'll start augmenting M3C.m3 to start writing .ll files also.
We'll see how that goes. I realize it isn't the ideal path.


Wrt nested variables/functions, I make a transform in the C backend.
The LLVM backend will need to make a similar transform.
I think the frontend should be willing/able to do this transform.
It would likely help.
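
For readers unfamiliar with this kind of transform, here is a sketch of one common approach: variables that a nested procedure reads or writes are moved into a frame struct, and the nested procedure becomes a top-level function that takes a pointer to that frame (a static link). This is an assumption about the general technique, not a description of what M3C.m3 actually emits; the Modula-3 source in the comment and all the C names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical Modula-3 source being modeled:

     PROCEDURE Outer(n: INTEGER): INTEGER =
       VAR sum := 0;
       PROCEDURE Add(i: INTEGER) =
         BEGIN sum := sum + i END Add;
       BEGIN
         FOR i := 1 TO n DO Add(i) END;
         RETURN sum
       END Outer;
*/

/* Locals of Outer that the nested procedure touches live in a frame. */
typedef struct {
    int sum;
} Outer_frame;

/* The nested procedure, hoisted to top level. It reaches the
   enclosing procedure's variables through the static link. */
static void Outer_Add(Outer_frame* static_link, int i)
{
    assert(static_link != NULL);
    static_link->sum += i;
}

static int Outer(int n)
{
    Outer_frame frame;
    int i;

    frame.sum = 0;
    for (i = 1; i <= n; ++i)
        Outer_Add(&frame, i);   /* pass our own frame as the static link */
    return frame.sum;
}
```

If the frontend performed this rewrite itself, each backend would only ever see flat procedures plus explicit frame pointers.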


 - Jay




> Date: Thu, 28 Mar 2013 10:23:17 -0500
> From: rodney_bates at lcwb.coop
> To: m3devel at elegosoft.com
> Subject: Re: [M3devel] LLVM backend?
> 
> 
> 
> On 03/28/2013 12:04 AM, Jay K wrote:
> > I was thinking of looking into LLVM more. For selfish reasons -- resume growth, but no matter.
> >
> > 1) Tony, are you making progress? Should I wait? Collaborate? Go it alone?
> 
> (Jay, I reversed the order of your paragraphs here, because my responses make more sense in that order.)
> 
>  > Actually I was just skimming "parse.c"..I don't remember exactly how I decided it is all a stabs-specific hack.
>  > Though, I do realize 1) typeids are encoded in identifier names, which certainly hurts debugging
>  > w/o m3gdb 2) types aren't being described as they ought to be.
>  >
> 
> Yes, it is very much a hack.  For one thing, stabs itself is something of a hack.  For another,
> a lot of the fields of stabs info are really treated as just containers for some extremely
> Modula-3-specific info, that, to be fair, isn't provided for in any reasonable way by
> true stabs.  Meanwhile, some true stabs stuff is produced too, but not used in m3gdb, because
> it isn't quite right or helpful.  parse.c and m3gdb are highly coupled by all this.  Stock gdb's
> code to read stabs is augmented by tons of stuff to further decode the M3-specific info
> inserted by parse.c, and tons more to interpret it.
> 
> Moreover, debug info and code to be translated effectively become diverging streams in
> parse.c, notwithstanding the fact that they are often interspersed.  This makes it very
> difficult for debug info to reflect anything gcc does to the code.  I believe the stabs
> info is mostly or entirely untouched after parse.c, though I haven't ever thoroughly
> vetted this.
> 
> One specific place where this particularly hurts right now is with nonlocal references,
> static links, etc.  With the advent of tree-nested.c, in later gcc versions, the runtime
> storage model is drastically reworked in gcc, after the stabs has already been produced.
> That seriously broke a lot of stuff I had working in m3gdb.  I have dabbled with Mickey-
> Mouse schemes to emit purely additional stabs info in tree-nested.c, without changing what
> was already there, then have m3gdb use it to selectively override the earlier-produced
> stabs.  I never finished this, and have only limited confidence it is even feasible.
> 
> > 2) An LLVM backend would I suspect have the same lack of m3gdb support as a C backend.
> > Does that bother people?
> 
> LLVM has a lot of already provided support for dwarf debug info, keeping it together with code,
> and helping to transform it in parallel with code, when optimizations, etc., are done.
> Meanwhile, dwarf itself is vastly more complete and appears to me, from superficial study,
> to be capable of representing all, or certainly most, of what is needed for good Modula-3 debugging.
> For these two reasons, I think LLVM plus dwarf present by far the best method to support
> a nice language-specific debugging experience, while leaving massive kludges behind.
> 
> This is one big reason why I support an LLVM back end.  It would indeed require significant
> work to get debug support.  But it would be so much easier and far more pleasant than
> the alternatives.  That includes even the alternative of further fixing the existing m3gdb/gcc,
> which still needs perhaps as much additional work as has already gone into it.  I am
> attached to it only because it now provides a lot more function than anything else
> we have (and which I use a lot).  Mucking around in M3ified stabs is, for me, strictly a
> destination, not a journey.
> 
> So, the short answer is, it bothers me less than any other option.
> 
> 
> 
> 
 		 	   		  