[M3devel] why this loop/multipass stuff
Hendrik Boom
hendrik at topoi.pooq.com
Wed Aug 22 20:11:06 CEST 2012
On Wed, Aug 22, 2012 at 02:24:42PM +0000, Jay K wrote:
> why this loop/multipass stuff
>
>
> So people understand where I'm going here -- and Tony asked. I had already written this up..
>
>
>
> 1) in the past working on m3cc/parse.c, I found that "declare"s
> weren't always "topologically sorted".
>
>
> That is, like, I might see:
>
> given TYPE T1 = INTEGER;
> TYPE T2 = RECORD = a:T1; END;
>
>
> I might have seen type T1 declared before T2, or not.
> I'm not sure if it was record/fields or other relationships.
> I just remember that building up a map of typeid to data, and
> looking up it when needed, didn't always work, with one pass.
>
>
> So what I did is that I was willing to "skip" declares
> that referred to typeids I hadn't seen yet, and then loop
> again, hoping for progress at each loop (fewer skips).
Does that mean you have a table of typeid's and their meanings in RAM
during your multiple passes? If so, you might as well read them all in
at once, and then link them together in RAM afterward.
Or are you doing something sneakier?
Or are you so short on RAM that thie is impractical?
And for recorcds that are used by having pointers to them, you can just
emit 'struct foo;' near the beginning of the generated C file. The
topological sorting can be restricted to types that actually contain
others, rather then pointing to them. This would reduce the amount of
topological sorting you need to do.
In fact, if you have all your types in RAM, you can do the topological
sort *while* you are writing the types out. Just look through a type
before emitting it and recurively emit any component types first. Mark
each type when you emit it so you don't emit it twice.
>
>
> 2) It looks like declare_local/temp don't all come in
> "at the start" of a procedure. I'm not sure if I'm generating
> C or C++, but if I'm generating C, it helps to have declare_local/temp
> right after begin_procedure/block. Multiple passes let me make that
> so, without changing m3front.
> (Upon further looking..it seems this is mostly done right, but
> not for "try" blocks; maybe reasonable to change m3front here.).
How are you intending to make all these available for garbage
collection?
You might be interested in looking at Gambit/C, a Scheme compiler that
compiles Schene to C and seems to do a fairly good job of it.
>
>
> Now, generating C++ is an option, and I'm ignoring all declares
> currently anyway, so these don't force the matter, but:
>
>
> 3) nested functions.
> If a function contains any nested functions, I want to put all of
> its locals in a local struct, and use the address of that struct
> as the "static link". Or at least the locals that are referenced
> by nested functions. As well, m3front contains options as to
> the order of outputing functions. It uses one order for m3back and
> a different order for m3cc. I wouldn't mind if M3C.m3 worked either
> way.
I wanted to do this in my Algol 68 compiler when I was still considering
using the LLVM API, but placing temporaries and the like into the struct
too so as to make it easier to form a struct descriptor for it for the
garbage collector to ues. But that meant I'd be adding fields to it
while I was gernerating the code that used it, and LLVM objected to it.
Ecen though its whole API is oriented to putting pieces of the parse
tree in piecemeal, jumping all arounf the thing if necessary, it still
demanded that types be completely defined before you could use them for
anything much. In particular, they had to be completely defined before
you could use the API to generate parse tree that used the field names
of the structure.
This pretty well made the LLVM API significantly less attractive for
this project.
>
>
> There are possibly other benefits that could be derived from multiple passes.
> Nothing offhand though.
>
>
> - Jay
>
More information about the M3devel
mailing list