[M3devel] why this loop/multipass stuff

Hendrik Boom hendrik at topoi.pooq.com
Wed Aug 22 20:11:06 CEST 2012


On Wed, Aug 22, 2012 at 02:24:42PM +0000, Jay K wrote:
> why this loop/multipass stuff
> 
> 
> So people understand where I'm going here -- and Tony asked. I had already written this up..
> 
> 
> 
> 1) in the past working on m3cc/parse.c, I found that "declare"s
> weren't always "topologically sorted".
> 
> 
> That is, like, I might see:
> 
> given TYPE T1 = INTEGER;
> TYPE T2 = RECORD = a:T1; END;
> 
> 
> I might have seen type T1 declared before T2, or not.
> I'm not sure if it was record/fields or other relationships.
> I just remember that building up a map of typeid to data, and
> looking up it when needed, didn't always work, with one pass.
> 
> 
> So what I did is that I was willing to "skip" declares
> that referred to typeids I hadn't seen yet, and then loop
> again, hoping for progress at each loop (fewer skips).

Does that mean you have a table of typeid's and their meanings in RAM 
during your multiple passes? If so, you might as well read them all in 
at once, and then link them together in RAM afterward.

Or are you doing something sneakier?

Or are you so short on RAM that thie is impractical?

And for recorcds that are used by having pointers to them, you can just 
emit 'struct foo;' near the beginning of the generated C file.  The 
topological sorting can be restricted to types that actually contain 
others, rather then pointing to them.  This would reduce the amount of 
topological sorting you need to do.

In fact, if you have all your types in RAM, you can do the topological 
sort *while* you are writing the types out.  Just look through a type 
before emitting it and recurively emit any component types first.  Mark 
each type when you emit it so you don't emit it twice.

> 
> 
> 2) It looks like declare_local/temp don't all come in
> "at the start" of a procedure. I'm not sure if I'm generating
> C or C++, but if I'm generating C, it helps to have declare_local/temp
> right after begin_procedure/block. Multiple passes let me make that
> so, without changing m3front.
> (Upon further looking..it seems this is mostly done right, but
> not for "try" blocks; maybe reasonable to change m3front here.).

How are you intending to make all these available for garbage 
collection?

You might be interested in looking at Gambit/C, a Scheme compiler that 
compiles Schene to C and seems to do a fairly good job of it.

> 
> 
> Now, generating C++ is an option, and I'm ignoring all declares
> currently anyway, so these don't force the matter, but:
> 
> 
> 3) nested functions.
> If a function contains any nested functions, I want to put all of
> its locals in a local struct, and use the address of that struct
> as the "static link". Or at least the locals that are referenced
> by nested functions. As well, m3front contains options as to
> the order of outputing functions. It uses one order for m3back and
> a different order for m3cc. I wouldn't mind if M3C.m3 worked either
> way.

I wanted to do this in my Algol 68 compiler when I was still considering 
using the LLVM API, but placing temporaries and the like into the struct 
too so as to make it easier to form a struct descriptor for it for the 
garbage collector to ues.  But that meant I'd be adding fields to it 
while I was gernerating the code that used it, and LLVM objected to it.  
Ecen though its whole API is oriented to putting pieces of the parse 
tree in piecemeal, jumping all arounf the thing if necessary, it still 
demanded that types be completely defined before you could use them for 
anything much.  In particular, they had to be completely defined before 
you could use the API to generate parse tree that used the field names 
of the structure.

This pretty well made the LLVM API significantly less attractive for 
this project.

> 
> 
> There are possibly other benefits that could be derived from multiple passes.
> Nothing offhand though.
> 
> 
>  - Jay
>  		 	   		  



More information about the M3devel mailing list