[M3devel] gc vs. large spans of memory

Jay jay.krell at cornell.edu
Mon Nov 17 13:48:59 CET 2008


understood, ok.
 
I don't know the algorithm for allocating pages.
 
mmap takes a "hint" address parameter, but it isn't exactly clear how it is handled.
Passing the address of the block we already have plus its size is probably reasonable.
I can try that out at some point, though using sbrk seems ok.
 
 - Jay



CC: m3devel at elegosoft.com
From: hosking at cs.purdue.edu
To: jay.krell at cornell.edu
Subject: Re: gc vs. large spans of memory
Date: Mon, 17 Nov 2008 06:21:26 -0600


On 17 Nov 2008, at 03:35, Jay wrote:


1meg pages weren't large enough to avoid running out of memory.

Is this still the sidespan array problem?

Any thoughts on using the boehm-gc?

Hell no.

I don't know if it is compacting.

Indeed, Boehm's collector does not do compaction.  Also, Boehm's is fully conservative ("ambiguous-roots + ambiguous heap"), though it is possible to configure it to use precise heap information (which CM3 does provide).

Or even generational.

It is generational, and has a parallel collection mode, but it is not concurrent like ours (we have a better chance of scaling on SMPs).  Our collector will soon also be "on-the-fly", which means we won't need to have a stop-the-world phase where all the threads are stopped at the same time to initiate GC.  Instead, we will simply signal threads one at a time to prepare for GC.

Or anything about it really, just that it is much ported and presumably well maintained, since it is part of "gcc", part of the gcc Java support.

Our collector is much ported and well-maintained too! ;-)  Seriously, I am quite sure we have no collector bugs currently (it has been running reliably for many years now).  The problems you are encountering are porting issues and not bugs in the collector.  Also, the current collector is written in Modula-3, and nicely integrated with the Modula-3 object model and run-time.   There is a *huge* amount to be said for eating your own dog-food by writing the collector in Modula-3.  There is *no* good reason to step outside Modula-3 to C as would be needed for Boehm GC.

Hm... I assume the right data structure here is a multi-level array that you index by picking off progressively less significant bits of an address (or page number), just like the hardware page tables for virtual memory (but without the perf benefits of a cache and using physical addresses after the first level). When an entire level is in the same state, due to skipping a large span, you just have one shared entry for all of them, a major saving.

Right.  I have this partially implemented already.
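
As an illustration of the multi-level array described above, here is a minimal sketch in Python. It is not the CM3 implementation (which is in Modula-3 and which the thread only describes as partially done). The names PageTable, EMPTY_LEAF and EMPTY_MID, the 12-bit fan-out, the three levels, and the two-state encoding are all assumptions made for the example. Untouched spans point at one shared, read-only node per level, and a private node is created only on the first write into a span.

```python
# Hypothetical sketch of a three-level page-state table with shared entries.
# BITS, the number of levels, and the state encoding are illustrative
# assumptions, not values taken from the CM3 collector.

BITS = 12                      # assumed: 4096 entries per node
FANOUT = 1 << BITS
MASK = FANOUT - 1              # three levels cover 36-bit page numbers

UNALLOCATED, ALLOCATED = 0, 1

# One shared, immutable node per level. Every span that has never been
# written points here, so a huge skipped range costs no extra memory.
EMPTY_LEAF = (UNALLOCATED,) * FANOUT
EMPTY_MID = (EMPTY_LEAF,) * FANOUT


class PageTable:
    def __init__(self):
        self.root = [EMPTY_MID] * FANOUT
        self.nodes = 1         # privately owned nodes, counting the root

    def get(self, page):
        # Pick off progressively less significant bits of the page number.
        mid = self.root[(page >> (2 * BITS)) & MASK]
        leaf = mid[(page >> BITS) & MASK]
        return leaf[page & MASK]

    def set(self, page, state):
        if self.get(page) == state:
            return             # nothing to change, keep sharing
        i = (page >> (2 * BITS)) & MASK
        j = (page >> BITS) & MASK
        mid = self.root[i]
        if mid is EMPTY_MID:   # first write under this top-level entry
            mid = self.root[i] = list(EMPTY_MID)
            self.nodes += 1
        leaf = mid[j]
        if leaf is EMPTY_LEAF: # first write into this leaf
            leaf = mid[j] = list(EMPTY_LEAF)
            self.nodes += 1
        leaf[page & MASK] = state
```

Lookup cost is fixed at three index operations regardless of how far apart the mapped regions are. Marking one page at each of two widely separated addresses allocates only the root plus one mid-level node and one leaf per region.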

Either that, or a binary-searched array where you only store entries for state changes. You'd use something like the STL's upper_bound/lower_bound/equal_range functions, that is, really the same thing as bsearch, except that when an entry isn't found, you return where it would be inserted, which is the last place you looked before giving up.

I'm concerned about the lookup costs for this.
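
For comparison, the binary-searched alternative can be sketched as follows, using Python's bisect module in the role of upper_bound/lower_bound. The class name RunTable, the method names, and the state encoding are assumptions for illustration only. The table stores one entry per state change, so its size depends on the number of runs rather than on the address span, at the price of a logarithmic search on every lookup, which is the cost concern raised above.

```python
import bisect

UNALLOCATED, ALLOCATED = 0, 1


class RunTable:
    """Page states stored as sorted change points (hypothetical sketch).

    starts[k] is the first page of run k and states[k] is its state;
    run k extends up to starts[k + 1] (the last run is unbounded).
    """

    def __init__(self):
        self.starts = [0]
        self.states = [UNALLOCATED]

    def get(self, page):
        # bisect_right plays the role of std::upper_bound: it returns the
        # index of the first change point strictly greater than page.
        i = bisect.bisect_right(self.starts, page) - 1
        return self.states[i]

    def set_range(self, lo, hi, state):
        """Set every page p with lo <= p < hi to state."""
        assert 0 <= lo < hi
        after = self.get(hi)   # state that must resume at hi
        i = bisect.bisect_left(self.starts, lo)
        j = bisect.bisect_right(self.starts, hi)
        # Replace all change points inside [lo, hi] with just two.
        self.starts[i:j] = [lo, hi]
        self.states[i:j] = [state, after]
        # Coalesce with neighbours that already have the same state.
        if after == state:
            del self.starts[i + 1], self.states[i + 1]
        if i > 0 and self.states[i - 1] == state:
            del self.starts[i], self.states[i]
```

With two regions mapped far apart, the table holds five change points in total, however large the gap between them is.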

It might be worth confirming "that I'm not crazy" -- that mmap behaves this way for other people, or that skimming its code suggests it is not surprising. Could be related to how malloc works also.

Are there hints that can be passed to mmap on that platform to cause less scattered mappings?  It is odd that sbrk is less scattered, since it must also bottom out at mmap, but perhaps it is trying to maintain a reasonably compact allocation below the "brk".

  - Jay

From: hosking at cs.purdue.edu
To: jay.krell at cornell.edu
Date: Sun, 16 Nov 2008 20:14:55 -0600
CC: m3devel at elegosoft.com
Subject: Re: [M3devel] duplicated code?


On 16 Nov 2008, at 10:18, Jay wrote:


Er, the large allocation does come from the gc itself.

What's being allocated?  The side array should not be too huge.

Tony, is there an assumption that the heap is contiguous? That calls to RTOS.GetMemory return adjacent addresses?

No assumption.

This code allocates a large array:

     IF desc = NIL OR newSideSpan # NUMBER(desc^) THEN
      WITH newDesc = NEW(UNTRACED REF ARRAY OF Desc, newSideSpan) DO

I'll know more shortly. I guess... it looks like the heap can be discontiguous, but
we do record keeping for what it all spans.

Correct.

The comments say:

(* The array desc and the global variables p0, and p1 describe the pages
   that are part of the traced heap.  Either p0 and p1 are equal to Nil and
   no pages are allocated; or both are valid pages and page p is allocated
   iff
|          p0 <= p < p1
|      AND desc[p - p0] != Unallocated

Hm..

Grow (0x44000) => 0x2b1e45256000   total: 1.5M
GetUntracedOpenArray(0x1a80)   span: 6.6M   density: 24%
stubgen: Processing RemoteView.T
GetUntracedOpenArray(0x3f0)
t1:0xc
t2:0xa
t3:0x1
Grow (0x52000) => 0x2aaaaaaab000   total: 1.8M
GetUntracedOpenArray(0x1ce69fc8)

I have GetUntracedOpenArray printing how many bytes it is asked for.
t1,t2,t3 are just the lengths of the strings being concatenated.
Grow(x)=>y means Grow allocated x bytes at address y.

So now, these two addresses 0x2aaaaaaab000 and 0x2b1e45256000 are very far apart, like 400gig.
And it seems the heap allocator wants to allocate an array to describe the pages.

Ah, yes, that is most unfortunate.
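
The size of that last request can be roughly reproduced from the two Grow addresses in the log. The sketch below is only a back-of-the-envelope check: the addresses and sizes come from the log, while PAGE_SIZE and DESC_BYTES are assumptions (the thread does not state either value, and any pair with the same ratio gives the same result).

```python
# Addresses and the Grow size are copied from the log above.
LOW_BLOCK = 0x2aaaaaaab000     # Grow (0x52000) => 0x2aaaaaaab000
HIGH_BLOCK = 0x2b1e45256000    # Grow (0x44000) => 0x2b1e45256000
HIGH_BLOCK_SIZE = 0x44000
REQUESTED = 0x1ce69fc8         # GetUntracedOpenArray(0x1ce69fc8)

PAGE_SIZE = 4096               # assumption, not stated in the thread
DESC_BYTES = 4                 # assumption, not stated in the thread


def desc_array_bytes(lo, hi_end, page_size, desc_bytes):
    """Bytes needed for one descriptor per page over the span [lo, hi_end)."""
    span_pages = (hi_end - lo) // page_size
    return span_pages * desc_bytes


span_bytes = (HIGH_BLOCK + HIGH_BLOCK_SIZE) - LOW_BLOCK
estimate = desc_array_bytes(LOW_BLOCK, HIGH_BLOCK + HIGH_BLOCK_SIZE,
                            PAGE_SIZE, DESC_BYTES)

print(f"span: {span_bytes / 2**30:.1f} GiB")
print(f"estimated desc array: {estimate / 2**20:.1f} MiB")
print(f"requested:            {REQUESTED / 2**20:.1f} MiB")
```

Under these assumed values the estimate lands within a few bytes of the logged request, which is consistent with the array being sized by the address span between the two blocks rather than by the amount of heap actually in use.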


Hm. Page size is no longer tied to the underlying system -- no longer vm-tied gc. Perhaps blowing it up to, say, 1meg will address this?

I am working on minimizing the need for the global array, but we do need something that can be easily indexed like this.  Perhaps we need to pass a hint to RTOS.GetMemory that will try to allocate its regions close together.

But really, an array to describe pages spanning the results of separate memory allocations seems wrong. A sparser data structure would be good, one that could describe arbitrary sized runs of pages as being in the same state.

Indeed.  As mentioned above, I am working to eliminate the need for this.  Code that starts us on this path will be checked in within a day or so.

 - Jay

From: jay.krell at cornell.edu
To: m3devel at elegosoft.com
Date: Sun, 16 Nov 2008 15:31:25 +0000
Subject: [M3devel] duplicated code?

Anyone want to clean up this duplication?

D:\dev2\cm3.2>dir /s/b asttotype.m3  StubCode.m3
D:\dev2\cm3.2\m3-comm\sharedobjgen\src\AstToType.m3
D:\dev2\cm3.2\m3-comm\stubgen\src\AstToType.m3
D:\dev2\cm3.2\m3-db\stablegen\src\AstToType.m3
D:\dev2\cm3.2\m3-comm\sharedobjgen\src\StubCode.m3
D:\dev2\cm3.2\m3-comm\stubgen\src\StubCode.m3

Somewhere in there, AMD64_LINUX tries to allocate a lot of memory, and fails
either there or soon thereafter.
The garbage collector is working.

 - Jay