[M3devel] backend interface vs. types vs. forward references?

Mon Oct 4 22:41:35 CEST 2010

I would avoid a new m3cg.  Just walk the tree as you say.
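
A minimal sketch of that "walk it again" approach, under loud assumptions: the opcode names and tuple layout below are invented for illustration and are not the real m3cg opcodes or the gcc tree API. The first walk only records type declarations, so a pointer may name a target that has not been declared yet. The second walk can then look up any type id, because by that point every declaration has been seen.

```python
# Hypothetical opcode stream: each entry is (opcode_name, *operands).
# These names are illustrative only; they are not real m3cg opcodes.

def collect_types(ops):
    """Pass 1: record every type declaration and ignore all other opcodes."""
    types = {}
    for op in ops:
        if op[0] == "declare_int":
            types[op[1]] = ("int", ())
        elif op[0] == "declare_pointer":
            # The target id may be declared later; store the id, not the type.
            types[op[1]] = ("pointer", (op[2],))
        elif op[0] == "declare_record":
            types[op[1]] = ("record", tuple(op[2]))
    return types


def missing_references(types):
    """Return type ids that are referenced somewhere but never declared."""
    return sorted({ref for _, refs in types.values()
                   for ref in refs if ref not in types})


def resolve_variables(ops, types):
    """Pass 2: walk the same stream again with all types known."""
    resolved = []
    for op in ops:
        if op[0] == "declare_var":
            kind, _ = types[op[2]]
            resolved.append((op[1], kind))
    return resolved


OPS = [
    ("declare_pointer", 1, 2),      # forward reference: type 2 not seen yet
    ("declare_var", "head", 1),
    ("declare_int", 3),
    ("declare_record", 2, [3, 1]),  # int field plus a pointer back to itself
]

types = collect_types(OPS)
variables = resolve_variables(OPS, types)
```

With this input, the forward reference from type 1 to type 2 is harmless, because nothing is looked up until the second walk.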

On 4 Oct 2010, at 14:05, Jay K wrote:

> 
> I think I understand and I think so.
> Rather than lowering to gimple though, it could just be before all that, right?
> I mean, walking the gcc trees and converting from "pre-gimple" to "pre-gimple", right?
> i.e. not lowering, just transform within same "space"?
> 
> What I was talking about though was walking the m3cg opcode IR.
> At least the initial type declaration part.
> 
> A new m3cg call to just forward declare types might be good too, instead.
> 
>  - Jay
> 
> ________________________________
>> From: hosking at cs.purdue.edu
>> Date: Mon, 4 Oct 2010 13:49:51 -0400
>> To: jay.krell at cornell.edu
>> CC: m3devel at elegosoft.com
>> Subject: Re: [M3devel] backend interface vs. types vs. forward references?
>> 
>> I see no problem walking the IR trees again to fill in the type
>> information. Can't you make it part of the lowering pass to GIMPLE?
>> 
>> 
>> Antony Hosking | Associate Professor | Computer Science | Purdue University
>> 305 N. University Street | West Lafayette | IN 47907 | USA
>> Office +1 765 494 6001 | Mobile +1 765 427 5484
>> 
>> 
>> 
>> 
>> On 4 Oct 2010, at 13:23, Jay K wrote:
>> 
>> 
>> ps: gcc has a very large number of passes over its trees, at least when
>> optimizing.
>> Like tens or 100+.
>> The Modula-3 frontend also makes a few passes over everything, just a few.
>> I don't know where the cost is, but I don't expect to add much. We'll see.
>> I can try to limit it to not even walk the non-type data.
>> I should see if the frontend reliably front-loads the type data. It seems to.
>> We could also put in an end-types opcode to make it easier to notice.
>> I think we could also address it in the frontend, by introducing
>> a type forward declaration call.
>> 
>> 
>> How big are the intermediate files for all our own sources?
>> 
>> 
>> A few months ago I took a quick survey.
>> This is when I grew the buffer from whatever it was to 64K.
>> I couldn't justify larger because so many would fit in 64K.
>> 
>> I lied somewhat about working set.
>> If you use a small buffer and iterate in place, your working
>> set can only grow by the size of the buffer.
>> If you read the entire thing into memory and walk it linearly,
>> well, the operating system doesn't necessarily know you won't
>> walk backwards so it'll let your working set grow, only to throw
>> out the memory later as needed. This is my rough understanding
>> based on OS principles. As well, using more address space
>> has a little extra cost, vs. looping over a small buffer multiple times.
>> 
>> We might might might be able to make some optimizations though,
>> such as having strings be direct pointers into the buffer instead
>> of copying them out. There is the matter of the terminal nuls though.
>> And checking that they fit in the buffer.
>> 
>> Can you write out some statistics?
>> 
>> 
>> Yeah..
>> I routinely use cm3 -keep, then just -l target/*c
>> 
>> 
>> - Jay
>> 
>> ----------------------------------------
>> From: jay.krell at cornell.edu
>> To: wagner at elegosoft.com;
>> m3devel at elegosoft.com
>> Date: Mon, 4 Oct 2010 17:03:08 +0000
>> Subject: Re: [M3devel] backend interface vs. types vs. forward references?
>> 
>> 
>> The passes I'm talking about I think will be fast.
>> True the backend is very slow but I don't think this will matter.
>> The earlier passes will ignore most of the data.
>> The cost will only be in the extra but ignored serialization.
>> And even then, it might be better -- if the ordering is a certain way
>> and guaranteed, once it hits certain opcodes, it will know the types
>> are all done and start over, without walking each opcode one at a time.
>> 
>> I tried building m3cc on virtual machines with only 256MB and it failed.
>> I had to go up to 384MB, if I recall correctly.
>> 
>> Granted, we don't always build m3cc.
>> 
>> Remember that optimized builds would often use "unit at once"
>> compilation, so the entire gcc tree would be in memory.
>> Now, currently, we never do that for Modula-3, because of a bug
>> where it throws out functions that need to be kept.
>> But for C/C++ it is not unusual (again, including compiling m3cc).
>> The tree representation is presumably not much different/smaller than
>> the m3cg representation. For actual C/C++ there might be a bigger difference,
>> what with comments/whitespace removed.
>> But from the gcc point of view, Modula-3 source is already in an
>> encoded binary form.
>> Granted, the strings are duplicated.
>> 
>> Still, the access pattern remains linear.
>> So it doesn't increase working set. Just virtual address space requirements.
>> 
>> This is something I learned recently working with large data -- linear access
>> patterns are what is good and keeps working set down, vs. random access.
>> 
>> Plus, the file gets closed, which does free a few resources, though
>> probably fewer than are additionally consumed.
>> 
>> - Jay
>> 
>> ----------------------------------------
>> Date: Mon, 4 Oct 2010 16:45:46 +0200
>> From: wagner at elegosoft.com
>> To: m3devel at elegosoft.com
>> Subject: Re: [M3devel] backend interface vs. types vs. forward references?
>> 
>> Quoting Jay K :
>> 
>> I think I'll just solve this in the backend by making a few passes.
>> Maybe something with specific passes where early passes only pay
>> attention to certain opcodes, that declare types.
>> 
>> I'm not really happy with multiple passes within the backend just to
>> make gcc happy. The performance of the gcc backend is already poor
>> compared to an integrated backend and to what M3 should be able to
>> achieve. How much will it cost wrt. performance?
>> 
>> The new/current "replay" stuff will maybe go away.
>> 
>> Hm, I must have missed that.
>> 
>> The new/current keeping of the entire file in memory will stay
>> unless someone has strong evidence/argument that it shouldn't.
>> 
>> Keeping the whole (intermediate code) file in memory should be fine,
>> unless we get problems for large generated files on small machines
>> somewhere.
>> 
>> How big are the intermediate files for all our own sources?
>> Can you write out some statistics?
>> 
>> Olaf
>> --
>> Olaf Wagner -- elego Software Solutions GmbH
>> Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
>> phone: +49 30 23 45 86 96 mobile: +49 177 2345 869 fax: +49 30 23 45 86 95
>> http://www.elegosoft.com | Geschäftsführer: Olaf Wagner | Sitz: Berlin
>> Handelregister: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194
>> 
>> 
>> 
>> 
>
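
The small-buffer iteration and the "end-types" marker discussed in the thread could be sketched roughly as follows. Everything about the record format here is an assumption made for illustration: records are newline-terminated text, type records start with `declare_`, and the marker is the literal `end_types`. The real m3cg intermediate format is binary and is not modelled. The sketch shows that an early pass can hold only one buffer (plus one partial record) at a time and can stop reading as soon as the marker appears.

```python
import io

BUFFER_SIZE = 64 * 1024  # the 64K buffer size mentioned in the thread


def iter_records(stream, buffer_size=BUFFER_SIZE):
    """Yield newline-terminated records, reading one buffer at a time."""
    carry = b""  # partial record left over from the previous buffer
    while True:
        chunk = stream.read(buffer_size)
        if not chunk:
            break
        pieces = (carry + chunk).split(b"\n")
        carry = pieces.pop()  # the last piece may be an incomplete record
        for piece in pieces:
            yield piece
    if carry:
        yield carry  # final record without a trailing newline


def scan_type_records(stream, buffer_size=BUFFER_SIZE):
    """Early pass: keep type records, stop at the hypothetical end marker."""
    found = []
    for record in iter_records(stream, buffer_size):
        if record == b"end_types":
            break  # everything after this point is ignored by this pass
        if record.startswith(b"declare_"):
            found.append(record)
    return found


DATA = (b"declare_int 3\n"
        b"declare_pointer 1 2\n"
        b"end_types\n"
        b"load 1\n"
        b"declare_late 9\n")

early = scan_type_records(io.BytesIO(DATA), buffer_size=7)
```

The deliberately tiny `buffer_size=7` forces records to straddle buffer boundaries, which is the "checking that they fit in the buffer" concern raised in the thread. Here the carry-over handles it by copying, not by pointing into the buffer.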