[M3devel] LLVM backend?

Fri Mar 29 17:58:54 CET 2013

For me, the best debugger architecture still remains the one
designed by D.R.Hanson:

http://research.microsoft.com/pubs/69690/tr-99-04.pdf

It is platform-independent and easy to implement.

From: Jay K 
Sent: Friday, March 29, 2013 5:09 PM
To: Rodney M. Bates ; m3devel 
Subject: Re: [M3devel] LLVM backend?

This doesn't completely make sense to me.
But it is kind of what I expected and desire.

The debug format isn't the point.
parse.c should have no knowledge of dwarf vs. stabs vs. coff vs. xcoff
vs. codeview vs. vms vs. anything else.
It just happens that the authors found a private channel in stabs.
Ideally any debug format used by any C compiler can fully describe C,
and Modula-3.

The way it (any gcc frontend) is supposed to work is you describe
the types using a compiler-internal form, and that gets translated
into whatever the user requests that is supported.
The primary limit as to what gcc can do is probably the object file format.

Any backend without a private channel to the debugger, i.e. the C backend
as well as an LLVM backend, will have degraded debugging.

Perhaps a private channel can be found either way though.
Like, gcc does have this "tfile" thing where after compilation,
a separate tool goes over the .S file and makes changes.
Ideally we don't have intermediate .S files though.

Now, on Windows, for cdb/ntsd/windbg/kd, you can write debugger plugins:
.dlls loaded by the debugger that have full access to symbols.
They are often very helpful, and would surely help here.

I do wonder if it is worth changing TEXT for debuggability.
Like, say, to always be flat, no trees, and always nul terminated:
(actually I'd always write two zero bytes, in case of viewing the ascii as unicode)

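#include <stddef.h> /* size_t, wchar_t */

#define TEXT_FLAGS_UNICODE (0x00000001)
#define TEXT_FLAGS_CONSTANT (0x00000002)

typedef struct {
    size_t flags;   /* TEXT_FLAGS_* bits */
    size_t length;
    union {         /* TEXT_FLAGS_UNICODE selects which member is valid */
        char* ascii;
        wchar_t* unicode;
    } data;
} *TEXT;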

 ? 

If we had merely that -- a more debugger/C-idiomatic representation
for TEXT, would m3gdb's advantages decline significantly?
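
To make the flat layout concrete, here is a minimal sketch of building such a TEXT from a C string, with the two trailing zero bytes mentioned above. The names TextRec, text_from_ascii and text_length are hypothetical, length is assumed to count characters excluding the terminator, and malloc stands in for the garbage-collected heap the real runtime would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

#define TEXT_FLAGS_UNICODE  (0x00000001)
#define TEXT_FLAGS_CONSTANT (0x00000002)

typedef struct {
    size_t flags;
    size_t length;          /* assumption: characters, excluding the terminator */
    union {
        char*    ascii;
        wchar_t* unicode;
    } data;
} TextRec, *TEXT;

/* Hypothetical helper: build a flat, non-Unicode TEXT from a C string.
   malloc is used only for illustration; a real runtime would allocate
   from the garbage-collected heap. Returns NULL if allocation fails. */
TEXT text_from_ascii(const char* s)
{
    size_t n;
    TEXT t;

    assert(s != NULL);
    n = strlen(s);

    t = (TEXT)malloc(sizeof(*t));
    if (t == NULL)
        return NULL;

    /* Reserve room for two zero bytes after the characters. */
    t->data.ascii = (char*)malloc(n + 2);
    if (t->data.ascii == NULL) {
        free(t);
        return NULL;
    }

    memcpy(t->data.ascii, s, n);
    t->data.ascii[n] = 0;
    t->data.ascii[n + 1] = 0;
    t->flags = 0;           /* neither Unicode nor constant */
    t->length = n;
    return t;
}

/* Length is a field read, with no tree walk and no scan. */
size_t text_length(TEXT t)
{
    return t->length;
}
```

With this shape a debugger that only understands C can print t->data.ascii directly and read t->length as an ordinary field.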

I'd also be open to just plain typedef char* TEXT;
UTF-8 encoded. Text.Length would then be slow, since it has to scan for the terminator, and Text.Concat would be slower too.
But ideally we'd store the length. The length could also be stored before the string.
On Windows, "BSTR"s work this way. i.e. it is viable and practical and in wide use.
Also, MFC/ATL CStringW/CStringA do this:
typedef struct {
  wchar_t* data;
} CStringW;

typedef struct {
  char* data;
} CStringA;

and then have an entire small struct before the data.
It holds, as I recall, at least the length and a reference count.
The strings are reference counted and copy-on-write.
It is a nice implementation, but it isn't quite relevant here, since
our TEXTs are immutable and garbage collected.
The point is though, putting a struct in front of a char*/wchar_t* is viable.
As long as the garbage collector doesn't get confused.
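
The "struct in front of the characters" idea can be sketched as follows. This is only an illustration under stated assumptions: TextHeader, TEXT8 and the text8_* functions are hypothetical names, the header holds just a length, and malloc/free stand in for the garbage-collected heap. It is not the BSTR or CString layout, only the same general technique.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header stored immediately before the characters. */
typedef struct {
    size_t length;          /* bytes, excluding the terminator */
} TextHeader;

/* To C code and to a debugger this is just a nul-terminated char*. */
typedef char* TEXT8;

/* Allocate header and characters in one block and return a pointer
   to the characters. Returns NULL if allocation fails. */
TEXT8 text8_new(const char* s)
{
    size_t n;
    TextHeader* h;
    char* p;

    assert(s != NULL);
    n = strlen(s);

    h = (TextHeader*)malloc(sizeof(TextHeader) + n + 1);
    if (h == NULL)
        return NULL;

    h->length = n;
    p = (char*)(h + 1);     /* characters start right after the header */
    memcpy(p, s, n + 1);    /* copies the terminator too */
    return p;
}

/* Constant-time length: step back from the characters to the header. */
size_t text8_length(TEXT8 t)
{
    return ((const TextHeader*)t - 1)->length;
}

/* Release the whole block, starting from the header. */
void text8_free(TEXT8 t)
{
    if (t != NULL)
        free((TextHeader*)t - 1);
}
```

Note that the pointer handed out is an interior pointer into the allocation, which is exactly the case the garbage collector would have to handle.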

Anyway, I think I'll start augmenting M3C.m3 to start writing .ll files also.
We'll see how that goes. I realize it isn't the ideal path.

Wrt nested variables/functions, I make a transform in the C backend.
The LLVM backend will need to make a similar transform.
I think the frontend should be willing/able to do this transform.
It would likely help.
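
For readers unfamiliar with this kind of transform, here is a minimal sketch of one common shape it can take: the outer procedure's variables that the nested procedure uses are moved into a frame struct, and the nested procedure becomes a top-level function taking a pointer to that frame. The names Outer, Outer_Add and Outer_frame are hypothetical, and this is not necessarily what M3C.m3 emits.

```c
#include <assert.h>
#include <stddef.h>

/* Modula-3 source being modeled (illustrative only):
 *
 *   PROCEDURE Outer(n: INTEGER): INTEGER =
 *     VAR sum := 0;
 *     PROCEDURE Add(i: INTEGER) = BEGIN sum := sum + i END Add;
 *     BEGIN
 *       FOR i := 1 TO n DO Add(i) END;
 *       RETURN sum
 *     END Outer;
 */

/* Variables of Outer that the nested procedure touches. */
typedef struct {
    int sum;
} Outer_frame;

/* The nested procedure, lifted to top level. It reaches the
   enclosing procedure's variables through an explicit frame pointer. */
void Outer_Add(Outer_frame* frame, int i)
{
    assert(frame != NULL);
    frame->sum += i;
}

/* The outer procedure keeps its shared locals in the frame and
   passes the frame's address on every call to the lifted procedure. */
int Outer(int n)
{
    Outer_frame frame;
    int i;

    frame.sum = 0;
    for (i = 1; i <= n; i++)
        Outer_Add(&frame, i);
    return frame.sum;
}
```

Neither C nor LLVM IR has nested functions as a core construct, so some lowering of this general kind is needed in either backend, and doing it once in the frontend would let both share it.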

 - Jay
