[M3devel] backend interface vs. types vs. forward references?
Tony Hosking
hosking at cs.purdue.edu
Mon Oct 4 19:49:51 CEST 2010
I see no problem walking the IR trees again to fill in the type information. Can't you make it part of the lowering pass to GIMPLE?
Antony Hosking | Associate Professor | Computer Science | Purdue University
305 N. University Street | West Lafayette | IN 47907 | USA
Office +1 765 494 6001 | Mobile +1 765 427 5484
On 4 Oct 2010, at 13:23, Jay K wrote:
>
> ps: gcc has a very large number of passes over its trees, at least when optimizing.
> Like tens or 100+.
> The Modula-3 frontend also makes a few passes over everything, just a few.
> I don't know where the cost is, but I don't expect to add much. We'll see.
> I can try to limit it to not even walk the non-type data.
> I should see if the frontend reliably front-loads the type data. It seems to.
> We could also put in an end-types opcode to make it easier to notice.
> I think we could also address it in the frontend, by introducing
> a type forward declaration call.
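A rough sketch of the "types first, then everything else" idea, assuming the frontend really does front-load the type declarations and emits a marker when they end. The `end_types` opcode, the `Op` record and the argument layout are hypothetical and only illustrate the shape of the two passes; they are not the current m3cg interface.

```python
from typing import NamedTuple

class Op(NamedTuple):
    name: str    # opcode name (illustrative)
    args: tuple  # operands; args[0] is assumed to be the type id for type ops

END_TYPES = "end_types"  # hypothetical marker opcode, not in the current interface
TYPE_OPS = {"declare_record", "declare_pointer"}  # illustrative subset

def collect_types(ops):
    """Pass 1: record type declarations, stopping at the end-of-types marker."""
    types = {}
    for op in ops:
        if op.name == END_TYPES:
            break  # types are all done; no need to walk the remaining opcodes
        if op.name in TYPE_OPS:
            types[op.args[0]] = op
    return types

def unresolved_targets(ops, types):
    """Pass 2: walk again; pointer targets can now be looked up in `types`."""
    missing = []
    for op in ops:
        if op.name == "declare_pointer" and op.args[1] not in types:
            missing.append(op.args[1])
    return missing

ops = [
    Op("declare_pointer", (1, 2)),  # type 1 points to type 2, declared later
    Op("declare_record", (2,)),
    Op(END_TYPES, ()),
    Op("begin_procedure", ("main",)),
]
types = collect_types(ops)
missing = unresolved_targets(ops, types)
```

With the marker in place the first pass never touches the non-type data, and the forward reference from type 1 to type 2 resolves on the second walk.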
>
>
>>> How big are the intermediate files for all our own sources?
>
>
> A few months ago I took a quick survey.
> This is when I grew the buffer from whatever it was to 64K.
> I couldn't justify larger because so many would fit in 64K.
>
> I lied somewhat about working set.
> If you use a small buffer and iterate in place, your working
> set can only grow by the size of the buffer.
> If you read the entire thing into memory and walk it linearly,
> well, the operating system doesn't necessarily know you won't
> walk backwards so it'll let your working set grow, only to throw
> out the memory later as needed. This is my rough understanding
> based on OS principles. As well, using more address space
> has a little extra cost, vs. looping over a small buffer multiple times.
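To make the two access patterns concrete, here is a minimal sketch that contrasts refilling one fixed 64K buffer with reading the whole stream at once. The function names are made up for illustration, and the code only shows the shape of each loop; it does not measure working set.

```python
import io

BUFFER_SIZE = 64 * 1024  # 64K, the buffer size mentioned above

def walk_in_place(stream, buffer_size=BUFFER_SIZE):
    """Iterate over the stream by refilling one fixed-size buffer."""
    buf = bytearray(buffer_size)
    view = memoryview(buf)
    chunks = 0
    total = 0
    while True:
        n = stream.readinto(buf)  # overwrites the same buffer each time
        if not n:
            break
        chunk = view[:n]          # the only data resident for this step
        total += len(chunk)
        chunks += 1
    return chunks, total

def walk_whole(stream):
    """Read everything up front, then walk it linearly."""
    data = stream.read()          # address space grows with the file size
    return len(data)

data = bytes(150 * 1024)          # stand-in for an intermediate file
chunks, total = walk_in_place(io.BytesIO(data))
whole = walk_whole(io.BytesIO(data))
```

Both walks see the same bytes; the difference is only in how much memory is held at once.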
>
> We might might might be able to make some optimizations though,
> such as having strings be direct pointers into the buffer instead
> of copying them out. There is the matter of the terminal nuls though.
> And checking that they fit in the buffer.
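A minimal sketch of the "point into the buffer instead of copying" idea, assuming strings are stored as a one-byte length followed by the characters. That layout is an assumption for illustration, not the actual intermediate file encoding. Carrying the length with the view sidesteps the missing terminal nul, and the bounds check covers "checking that they fit in the buffer".

```python
def string_view(buf, offset):
    """Return a zero-copy view of the length-prefixed string at `offset`."""
    view = memoryview(buf)
    if not 0 <= offset < len(view):
        raise ValueError("offset outside the buffer")
    length = view[offset]  # hypothetical one-byte length prefix
    start = offset + 1
    end = start + length
    if end > len(view):
        raise ValueError("string runs past the end of the buffer")
    return view[start:end]  # slicing a memoryview does not copy

# Two complete strings followed by one whose length prefix overruns the buffer.
buf = bytes([4]) + b"Main" + bytes([10]) + b"Test__Proc" + bytes([9]) + b"cut"

first = string_view(buf, 0)
second = string_view(buf, 5)

try:
    string_view(buf, 16)
    overrun_detected = False
except ValueError:
    overrun_detected = True
```

The views stay valid only as long as the buffer is not refilled, which is one reason this fits better with keeping the whole file in memory than with a small reused buffer.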
>
>>> Can you write out some statistics?
>
>
> Yeah..
> I routinely use cm3 -keep, then just ls -l target/*c
>
>
> - Jay
>
> ----------------------------------------
>> From: jay.krell at cornell.edu
>> To: wagner at elegosoft.com; m3devel at elegosoft.com
>> Date: Mon, 4 Oct 2010 17:03:08 +0000
>> Subject: Re: [M3devel] backend interface vs. types vs. forward references?
>>
>>
>> The passes I'm talking about I think will be fast.
>> True the backend is very slow but I don't think this will matter.
>> The earlier passes will ignore most of the data.
>> The cost will only be in the extra but ignored serialization.
>> And even then, it might be better -- if the ordering is a certain way
>> and guaranteed, once it hits certain opcodes, it will know the types
>> are all done and start over, without walking each opcode one at a time.
>>
>> I tried building m3cc on virtual machines with only 256MB and it failed.
>> I had to go up to 384MB, if I recall correctly.
>>
>> Granted, we don't always build m3cc.
>>
>> Remember that optimized builds would often use "unit at once"
>> compilation, so the entire gcc tree would be in memory.
>> Now, currently, we never do that for Modula-3, because of a bug
>> where it throws out functions that are needed to be kept.
>> But for C/C++ it is not unusual (again, including compiling m3cc).
>> The tree representation is presumably not much different/smaller than
>> the m3cg representation. For actual C/C++ there might be a bigger difference,
>> what with comments/whitespace removed.
>> But from the gcc point of view, Modula-3 source is already in an encoded binary form.
>> Granted, the strings are duplicated.
>>
>> Still, the access pattern remains linear.
>> So it doesn't increase working set. Just virtual address space requirements.
>>
>> This is something I learned recently working with large data -- linear access
>> patterns are what is good and keeps working set down, vs. random access.
>>
>> Plus, the file gets closed, which does free a few resources, though
>> probably fewer than are being additionally consumed.
>>
>> - Jay
>>
>> ----------------------------------------
>>> Date: Mon, 4 Oct 2010 16:45:46 +0200
>>> From: wagner at elegosoft.com
>>> To: m3devel at elegosoft.com
>>> Subject: Re: [M3devel] backend interface vs. types vs. forward references?
>>>
>>> Quoting Jay K :
>>>
>>>> I think I'll just solve this in the backend by making a few passes.
>>>> Maybe something with specific passes where early passes only pay
>>>> attention to certain opcodes, that declare types.
>>>
>>> I'm not really happy with multiple passes within the backend just to
>>> make gcc happy. The performance of the gcc backend is already poor
>>> compared to an integrated backend and to what M3 should be able to
>>> achieve. How much will it cost wrt. performance?
>>>
>>>> The new/current "replay" stuff will maybe go away.
>>>
>>> Hm, I must have missed that.
>>>
>>>> The new/current keeping of the entire file in memory will stay
>>>> unless someone has strong evidence/argument that it shouldn't.
>>>
>>> Keeping the whole (intermediate code) file in memory should be fine,
>>> unless we get problems for large generated files on small machines
>>> somewhere.
>>>
>>> How big are the intermediate files for all our own sources?
>>> Can you write out some statistics?
>>>
>>> Olaf
>>> --
>>> Olaf Wagner -- elego Software Solutions GmbH
>>> Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
>>> phone: +49 30 23 45 86 96 mobile: +49 177 2345 869 fax: +49 30 23 45 86 95
>>> http://www.elegosoft.com | Geschäftsführer: Olaf Wagner | Sitz: Berlin
>>> Handelregister: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194
>>>
>>
>