[M3devel] backend interface vs. types vs. forward references?

Mon Oct 4 19:49:51 CEST 2010

I see no problem walking the IR trees again to fill in the type information.  Can't you make it part of the lowering pass to GIMPLE?
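
A minimal sketch of the idea, in Python for brevity rather than the backend's C. Everything here is hypothetical: the opcode names, the tuple layout, and the `end_types` marker are made up for illustration and are not real m3cg opcodes. The first walk only looks at type declarations, so a forward reference (T1 pointing at the not-yet-declared T2) is harmless; the second walk runs with the complete type table.

```python
# Hypothetical opcode stream; names and layout are illustrative only.
OPS = [
    ("declare_pointer", "T1", "T2"),    # T1 = pointer to T2 (forward reference)
    ("declare_record", "T2", ["T1"]),   # T2 = record with one field of type T1
    ("end_types",),                     # assumed marker: no type opcodes follow
    ("declare_var", "x", "T1"),
]


def collect_types(ops):
    """Pass 1: record type declarations only, stopping at end_types."""
    types = {}
    for op in ops:
        if op[0] == "end_types":
            break  # types are front-loaded, so the rest can be skipped
        if op[0] == "declare_pointer":
            types[op[1]] = {"kind": "pointer", "refs": [op[2]]}
        elif op[0] == "declare_record":
            types[op[1]] = {"kind": "record", "refs": list(op[2])}
    return types


def check_references(types):
    """Every referenced type id must have been declared somewhere."""
    for type_id, info in types.items():
        for ref in info["refs"]:
            if ref not in types:
                raise KeyError(f"{type_id} refers to undeclared type {ref}")


def lower(ops, types):
    """Pass 2: walk the stream again, now with full type information."""
    lowered = []
    for op in ops:
        if op[0] == "declare_var":
            lowered.append((op[1], types[op[2]]["kind"]))
    return lowered


types = collect_types(OPS)
check_references(types)
print(lower(OPS, types))
```

With a marker like `end_types`, the first walk never touches the non-type data at all, which is the cheap case discussed in the quoted text below.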

Antony Hosking | Associate Professor | Computer Science | Purdue University
305 N. University Street | West Lafayette | IN 47907 | USA
Office +1 765 494 6001 | Mobile +1 765 427 5484

On 4 Oct 2010, at 13:23, Jay K wrote:

> 
> ps: gcc has a very large number of passes over its trees, at least when optimizing.
> Like tens or 100+.
> The Modula-3 frontend also makes a few passes over everything, just a few.
> I don't know where the cost is, but I don't expect to add much. We'll see.
> I can try to limit it to not even walk the non-type data.
> I should see if the frontend reliably front-loads the type data. It seems to.
> We could also put in an end-types opcode to make it easier to notice.
> I think we could also address it in the frontend, by introducing
> a type forward declaration call.
> 
> 
>>> How big are the intermediate files for all our own sources?
> 
> 
> A few months ago I took a quick survey.
> This is when I grew the buffer from whatever it was to 64K.
> I couldn't justify larger because so many would fit in 64K.
> 
> I lied somewhat about working set.
> If you use a small buffer and iterate in place, your working
> set can only grow by the size of the buffer.
> If you read the entire thing into memory and walk it linearly,
> well, the operating system doesn't necessarily know you won't
> walk backwards so it'll let your working set grow, only to throw
> out the memory later as needed. This is my rough understanding
> based on OS principles. As well, using more address space
> has a little extra cost, vs. looping over a small buffer multiple times.
> 
> We might might might be able to make some optimizations though,
> such as having strings be direct pointers into the buffer instead
> of copying them out. There is the matter of the terminal nuls though.
> And checking that they fit in the buffer.
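
A rough sketch of the "point into the buffer instead of copying" idea, in Python, where `memoryview` slices stand in for C pointers. The function name and the assumption that strings are stored back to back with NUL terminators are hypothetical, not the actual m3cg file layout. The check for a missing terminator corresponds to the "do they fit in the buffer" concern.

```python
def string_views(buf: bytes):
    """Return zero-copy views of the NUL-terminated strings in buf.

    Assumes (hypothetically) that strings are stored back to back,
    each followed by a single NUL byte.
    """
    view = memoryview(buf)
    views = []
    start = 0
    while start < len(buf):
        end = buf.find(b"\0", start)
        if end == -1:
            # The terminator is missing: the string runs past the buffer.
            raise ValueError("string does not fit in the buffer")
        views.append(view[start:end])  # slicing a memoryview copies nothing
        start = end + 1
    return views


views = string_views(b"foo\0bar\0")
print([bytes(v) for v in views])
```

The views exclude the terminator, so code that needs a C-style string would still have to rely on the NUL staying in place in the underlying buffer.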
> 
>>> Can you write out some statistics?
> 
> 
> Yeah..
> I routinely use cm3 -keep, then just -l target/*c
> 
> 
>  - Jay
> 
> ----------------------------------------
>> From: jay.krell at cornell.edu
>> To: wagner at elegosoft.com; m3devel at elegosoft.com
>> Date: Mon, 4 Oct 2010 17:03:08 +0000
>> Subject: Re: [M3devel] backend interface vs. types vs. forward references?
>> 
>> 
>> The passes I'm talking about I think will be fast.
>> True the backend is very slow but I don't think this will matter.
>> The earlier passes will ignore most of the data.
>> The cost will only be in the extra but ignored serialization.
>> And even then, it might be better -- if the ordering is a certain way
>> and guaranteed, once it hits certain opcodes, it will know the types
>> are all done and start over, without walking each opcode one at a time.
>> 
>> I tried building m3cc on virtual machines with only 256MB and it failed.
>> I had to go up to 384MB, if I recall correctly.
>> 
>> Granted, we don't always build m3cc.
>> 
>> Remember that optimized builds would often use "unit at once"
>> compilation, so the entire gcc tree would be in memory.
>> Now, currently, we never do that for Modula-3, because of a bug
>> where it throws out functions that need to be kept.
>> But for C/C++ it is not unusual (again, including compiling m3cc).
>> The tree representation is presumably not much different/smaller than
>> the m3cg representation. For actual C/C++ there might be a bigger difference,
>> what with comments/whitespace removed.
>> But from the gcc point of view, Modula-3 source is already in an encoded binary form.
>> Granted, the strings are duplicated.
>> 
>> Still, the access pattern remains linear.
>> So it doesn't increase working set. Just virtual address space requirements.
>> 
>> This is something I learned recently working with large data: linear access
>> patterns are what is good and keep working set down, vs. random access.
>> 
>> Plus, the file gets closed, which does free a few resources, though
>> probably fewer than are additionally being consumed.
>> 
>> - Jay
>> 
>> ----------------------------------------
>>> Date: Mon, 4 Oct 2010 16:45:46 +0200
>>> From: wagner at elegosoft.com
>>> To: m3devel at elegosoft.com
>>> Subject: Re: [M3devel] backend interface vs. types vs. forward references?
>>> 
>>> Quoting Jay K :
>>> 
>>>> I think I'll just solve this in the backend by making a few passes.
>>>> Maybe something with specific passes where early passes only pay
>>>> attention to certain opcodes, that declare types.
>>> 
>>> I'm not really happy with multiple passes within the backend just to
>>> make gcc happy. The performance of the gcc backend is already poor
>>> compared to an integrated backend and to what M3 should be able to
>>> achieve. How much will it cost wrt. performance?
>>> 
>>>> The new/current "replay" stuff will maybe go away.
>>> 
>>> Hm, I must have missed that.
>>> 
>>>> The new/current keeping of the entire file in memory will stay
>>>> unless someone has strong evidence/argument that it shouldn't.
>>> 
>>> Keeping the whole (intermediate code) file in memory should be fine,
>>> unless we get problems for large generated files on small machines
>>> somewhere.
>>> 
>>> How big are the intermediate files for all our own sources?
>>> Can you write out some statistics?
>>> 
>>> Olaf
>>> --
>>> Olaf Wagner -- elego Software Solutions GmbH
>>> Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
>>> phone: +49 30 23 45 86 96 mobile: +49 177 2345 869 fax: +49 30 23 45 86 95
>>> http://www.elegosoft.com | Geschäftsführer: Olaf Wagner | Sitz: Berlin
>>> Handelregister: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194
>>> 
>> 