[M3devel] changing m3middle/m3back to have multiple passes?

Jay K jay.krell at cornell.edu
Thu Apr 1 17:10:04 CEST 2010


Simple and fast does not entirely preclude multiple passes, and one pass does not equate to simple, though yes probably fast.

The front end already makes multiple passes I believe, and is still fast.

 

 

Sometimes decomposition into more smaller pieces (passes, layers, etc.) make things simpler.

  Albeit slower.

 

 

A more historical factor here is that multiple passes imply keeping more data in memory longer.

 Implies slower.

 However memory capacity has increased tremendously.

 

 

Whereas I'm only thinking about multiple passes, while still compiling one file at a time,

other compilation systems have capitalized on larger memories and now compile entire

programs at once, including multiple passes.

  This goes by multiple names -- "whole program optimization", "link time codegen (LTCG)".

   And enables even more aggressive but still "fairly simple" optimization, such as cross

   module inlining, custom calling conventions. I say "fairly simple", because these

   are just the same optimizations one would perform within a module, merely applied

   across module boundaries.

 

 

(Sometimes what people do is #include all .c files into one master .c file, to help

get the compiler to see everything at once. sqllite does this, for example.)

 

 

Surely we can afford multiple passes on a per-file basis.

?

 

 

The code is already not that easy to understand imho.

  It took me quite some time to get comfortable and add the 64 bit integer support, years.

  And there's a few things I still don't understand (you can tell from my comments

  in the code, like where do I need to call "newdest")?

 

 

Embrace it and make it incrementally harder to understand (by virtue of simply adding to it),

or decide it is stuck asis?

 

 

I'm not entirely sure.

 

 

While it is an interesting compromise in its current form, I'm fairly certain

there are other viable compromises that achieve better code quality, are

still plenty fast, and aren't much more difficult to understand.

 

 

Heck, I found it confusing at first that the -debug output is wrong in some places.

That is a fallout from it being single pass "plus patching".

The debug output doesn't include the "patches".

 

 

I have some very simple small optimizations in mind.

 

 

And then, in stepping through code, and writing code, inlining constantly comes to mind.

In fact, I frequently see "manual partial inlining" in the Modula-3 code.

It'd be nifty if the Modula-3 code could have less duplication, relying on the compiler more.

 

 

I don't see inlining as entirely simple, but I've given it a little thought and it might not

 be so hard. I think it can be done, roughly, by taking the body of a procedure 

as expressed as cg.i3 calls, omitting the begin/procedure, and just "playing them back"

at the call site. You'd only do this for "small" functions. You'd have to make some

adjustments like merging local variables and exception handling, though really

the inlined function would just be a "block".

 

 

Probably the way to go here, if inlining is really to be considered, is first

implement it as "one pass plus some memory", and only inline when

the definition comes before the call. Once that is achieved, no small feat,

further considering can be made to allow for inlining when the call precedes the definition.

 

eh, sorry, I'm tried, rambling.

 

 

 - Jay
 
> Date: Thu, 1 Apr 2010 09:28:53 -0500
> From: rodney_bates at lcwb.coop
> CC: m3devel at elegosoft.com
> Subject: Re: [M3devel] changing m3middle/m3back to have multiple passes?
> 
> 
> 
> Jay K wrote:
> > I'm thinking m3back could use multiple passes.
> 
> This would be a big reversal of the fundamental design philosophy of this
> back end. Its primary design criterion was to be simple and fast. That
> means single-pass. There have been several compilers (often whole compilers,
> not just code generators) written to this principle. For some purposes,
> it is a good thing.
> 
> This, of course, inevitably means the generated code is not as small/fast
> as it could be. It's part of the trade off. We have two points on this
> trade off scale now. I think we want to keep the one-pass code generator.
> If you change it, at least make a branch and keep the original design
> easily accessible.
> 
> 
> > 
> > 
> > There a few reasons, things that would be easier/possible if it had them.
> > 
> > 
> > I know adding passes will slow it down, but I suspect it will still be
> > about as pleasantly fast as it is now.
> > 
> > 
> > The optimizations I have in mind:
> > inlining, including where function definitions come after the call
> > I think inlining where the "small" functions come first is doable in 
> > pass,
> > but not if uses come before calls.
> > 
> > 
> > not saving unused registers
> > This only a small optimization and could probably be done in one
> > pass if m3objfile got a "memmove" operation. Well, I already
> > have the optimization in somewhat, but it involves nop-ing out
> > code instead of removing it completely.
> > 
> > 
> > dead store removal, and then removing the resulting dead code to 
> > compute the value
> > 
> > 
> > possibly better register allocation -- e.g. based on noticing which
> > variables are used often, or based on how they are used, for example
> > if a variable (or expression/temporary) ends serving a shift count,
> > we should favor putting it in ecx up front, instead of moving it to
> > ecx later, but we don't generally know this till too late
> > 
> > 
> > As I understand, m3middle already has the notion of "recording"
> > and "playing back" calls to the M3CG interface.
> > That seems like a good starting point?
> > 
> > 
> > Some optimizations can be made via "CG to CG" transforms.
> > 
> > 
> > However I think in general I think the required design would include:
> > - ability to add in "private operations"; maybe
> > - ability to add "private data" to existing operations
> > A very simple one is annotate functions with required frame size.
> > This isn't currently known until the end of the function.
> > - clearly, ability to remove operations
> > 
> > 
> > Can/should we somehow augment m3middle in support of this line of thinking?
> > 
> > 
> > Or is that just not the way to go?
> > Each backend should have its own internal representation and loop over it?
> > 
> > - Jay
> > 
 		 	   		  
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://m3lists.elegosoft.com/pipermail/m3devel/attachments/20100401/97e910da/attachment-0002.html>


More information about the M3devel mailing list