[M3devel] higher level m3cg?

Tue Aug 21 13:18:48 CEST 2012

*** A warning ***
Norman Ramsey's opinion (in stackoverflow) on possible compiler backends:

Code generation is my business :-)

Comments on a few options:

  * CLR:
      - Pro: industrial support
      - Con: you have to buy into their type system pretty much completely; depending on what you want to do with types, this may not matter
      - Con: Only Windows platform is really prime-time quality

  * LLVM:
      - Pro: enthusiastic user community with charismatic leader
      - Pro: serious backing from Apple
      - Pro: many interesting performance improvements
      - Con: somewhat complex interface
      - Con: history of holes in the engineering; as LLVM matures expect the holes in the engineering to be plugged by adding to the complexity of the interface

  * C--
      - Pro: target is an actual written language, not an API; you can easily inspect, debug, and edit your C-- code
      - Pro: design is reasonably mature and reasonably clean
      - Pro: supports accurate garbage collection
      - Pro: most users report it is very easy to use
      - Con: very small development team
      - Con: as of early 2009, supports only three hardware platforms (x86, PPC, ARM)
      - Con: does not ship with a garbage collector
      - Con: project has no future

  * C as target language
      - Pro: looks easy
      - Con: nearly impossible to get decent performance
      - Con: will drive you nuts in the long run; ask the long line of people who have tried to compile Haskell, ML, Modula-3, Scheme and more using this technique. At some point every one of these people gave up and built their own native code generator.

Summary: anything except C is a reasonable choice. For the best combination of flexibility, quality, and expected longevity, I'd probably recommend LLVM.

Full disclosure: I am affiliated with the C-- project.

From: Jay K 
Sent: Thursday, August 16, 2012 4:21 PM
To: m3devel 
Subject: [M3devel] higher level m3cg?

Should m3cg provide enough information for a backend to generate idiomatic C?
(What is idiomatic C? e.g. I'm ignoring loop constructs and exception handling.)

Should we make it so?

Or be pragmatic and see if anyone gets to that point?

But, look at this another way.
Let's say we are keeping the gcc backend.

Isn't it reasonable to have a better experience with stock gdb?

What should m3cg look like then?

Matching up m3front to gcc turns out to be "weird".
As does having a backend generate "C".

In particular, "weird" because there is a "level mismatch".

m3cg presents a fairly low level view of the program.
  It does layout. Global variables are stuffed into what you might call a "struct", with
no assigned field names. Field references are done by adding to addresses and casting.

Too low level to provide a "good" gcc tree representation or to generate "normal" C.
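
To make the "level mismatch" concrete, here is a minimal C sketch of the two styles. The struct, its field names, and both function names are made up for illustration; nothing here is actual m3cg or backend output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of one module's globals. The field names exist only
   in this sketch; per the message, m3cg does not assign any. */
struct module_globals {
    int32_t counter;
    double  scale;
};

/* Conventional C: a named field reference. */
double read_scale_named(const struct module_globals *g) {
    assert(g != NULL);
    return g->scale;
}

/* Low-level style: add a byte offset to the base address, then cast. */
double read_scale_by_offset(const void *base, size_t offset) {
    assert(base != NULL);
    return *(const double *)((const char *)base + offset);
}
```

Both functions read the same storage when the second is given offsetof(struct module_globals, scale), but only the first gives a debugger or a human reader a field name to work with.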

One might be able to, by somewhat extraordinary means, make do.
That is, specifically one could deduce field references from
offsets/sizes. But maybe it is reasonable for load/store
to include fields? Maybe in addition to what it provides?
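
As a rough sketch of what deducing a field reference from an offset and size could look like, assuming the backend has somehow built a table of fields for the block being accessed (the field_info type and find_field are hypothetical, not part of m3cg):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one field in a flat block of storage. */
struct field_info {
    const char *name;
    size_t offset;
    size_t size;
};

/* Return the index of the field that exactly matches the access, or -1. */
int find_field(const struct field_info *fields, size_t count,
               size_t offset, size_t size) {
    assert(fields != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (fields[i].offset == offset && fields[i].size == size) {
            return (int)i;
        }
    }
    return -1;
}
```

An exact match on offset and size is the simplest rule. It cannot tell apart a nested record and its first field when both start at the same offset with the same size, which is one reason passing the field along with load/store would be more reliable than inferring it.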

Also, it appears to me that

given TYPE Enum = {One, Two, Three};

the m3cg is like:

declare enum typeidblah
declare enum_elt One
declare enum_elt Two
declare enum_elt Three
declare_typename typeidblah Enum

One would instead want something more like:

declare enum typeidblah Enum
declare enum_elt One => rename it Enum_One
declare enum_elt Two ""
declare enum_elt Three ""

However I understand that {One, Two, Three} exists
as anonymous type independent of the name "Enum".

One could just as well have:
given TYPE Enum1 = {One, Two, Three};
given TYPE Enum2 = {One, Two, Three};

Enum1 and Enum2 probably have the same typeid, and are just
two typenames for the same type.

likewise:
given TYPE Enum1 = {One, Two, Three};
given TYPE Enum2 = Enum1;

but, pragmatically, in the interest of generating better C,
can we pass a name along with declare_enum?

I ask somewhat rhetorically. I realize there is the answer:
  enum Mtypeid { Mtypeid_One, Mtypeid_Two, Mtypeid_Three };
  typedef enum Mtypeid Enum1;
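
Spelled out a little further, as a sketch only: "Mtypeid" stands in for whatever name a backend would derive from the type id, and next_value is a made-up function. It shows that two typenames for the same type id can share one C enum.

```c
#include <assert.h>

/* One C enum per Modula-3 type id. */
enum Mtypeid { Mtypeid_One, Mtypeid_Two, Mtypeid_Three };

/* Each Modula-3 typename becomes a typedef of that single enum. */
typedef enum Mtypeid Enum1;
typedef enum Mtypeid Enum2;

/* Enum1 and Enum2 are the same C type, so values pass between them freely. */
Enum2 next_value(Enum1 e) {
    assert(e >= Mtypeid_One && e <= Mtypeid_Three);
    return e == Mtypeid_Three ? Mtypeid_One : (Enum2)(e + 1);
}
```

The cost is that the element names carry the type id rather than a source-level name, so a debugger would show Mtypeid_One where the programmer wrote One.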

Also, enum variables I believe end up as just UINT8, 16, or 32.
Loads of enum values I believe end up as just loads of integers.
Can we pass along optional enum names with declare_local/declare_param?
And optional enum names with load_int?
Or add a separate load_enum call?
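
For comparison, a sketch of the C that each level of information would allow. It assumes an enum variable is seen only as an 8-bit integer today, as I believe above; the function names and the Enum_ prefix are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* With only an integer type available, generated C compares raw ordinals. */
int is_last_untyped(uint8_t e) {
    assert(e <= 2);   /* ordinals of One, Two, Three */
    return e == 2;
}

/* With an optional enum name passed along, the same test can use the enum. */
enum Enum { Enum_One, Enum_Two, Enum_Three };

int is_last_typed(enum Enum e) {
    return e == Enum_Three;
}
```

The two functions compute the same thing; the difference is only in what the generated source, and stock gdb, can show.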

Really, I understand that the current interface can be pressed to do
pretty adequate things. I can infer field references. The way enums work
isn't too bad.

 - Jay 