[M3devel] gc vs. large spans of memory

Tony Hosking hosking at cs.purdue.edu
Mon Nov 17 13:21:26 CET 2008


On 17 Nov 2008, at 03:35, Jay wrote:

> 1meg pages weren't large enough to avoid running out of memory.

Is this still the sideSpan array problem?
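
If so, the addresses quoted below make the scale concrete.  Assuming
8 KB heap pages and 8-byte Desc entries (my guesses, though they fit
the request size in your log):

    0x2b1e45256000 - 0x2aaaaaaab000 = 0x739a7ab000  (~462 GiB spanned)
    0x739a7ab000 / 8 KiB            = ~60.6 million pages
    60.6 million * 8 bytes          = ~485 MB of side array

which is almost exactly the GetUntracedOpenArray(0x1ce69fc8) request
in the log below.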

> Any thoughts on using the boehm-gc?

Hell no.

> I don't know if it is compacting.

Indeed, Boehm's collector does not do compaction.  Also, Boehm's is  
fully conservative ("ambiguous-roots + ambiguous heap"), though it is  
possible to configure it to use precise heap information (which CM3  
does provide).
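
To make "ambiguous" concrete: a conservative scan must treat every
word that might be a pointer as if it were one.  A minimal sketch in
Modula-3 (hypothetical names; it would have to live in an UNSAFE
module):

    VAR heapLo, heapHi: ADDRESS;  (* bounds of the traced heap *)

    PROCEDURE ScanAmbiguous (start, stop: ADDRESS) =
      (* Treat every word in [start, stop) as a possible pointer. *)
      VAR p := start;
      BEGIN
        WHILE p < stop DO
          WITH w = LOOPHOLE(p, UNTRACED REF ADDRESS)^ DO
            IF heapLo <= w AND w < heapHi THEN
              (* w may be a pointer, or an integer that merely looks
                 like one; either way its target must be kept alive
                 in place. *)
              PinPage(w);  (* hypothetical *)
            END;
          END;
          p := p + ADRSIZE(ADDRESS);
        END;
      END ScanAmbiguous;

A false positive pins whatever it happens to "point" at, so no object
found this way can ever be moved; that is why compaction is off the
table for a fully conservative collector.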

> Or even generational.

It is generational, and has a parallel collection mode, but it is not  
concurrent like ours (we have a better chance of scaling on SMPs).   
Our collector will soon also be "on-the-fly", which means we won't  
need to have a stop-the-world phase where all the threads are stopped  
at the same time to initiate GC.  Instead, we will simply signal  
threads one at a time to prepare for GC.
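
Schematically, with made-up procedure names rather than our run-time's
actual interfaces:

    VAR nThreads: INTEGER;
    (* Suspend, ScanStack, Resume: hypothetical run-time hooks. *)

    PROCEDURE StopTheWorldGC () =
      (* Everyone halts before any scanning begins; the pause
         covers all threads at once. *)
      BEGIN
        FOR t := 0 TO nThreads - 1 DO Suspend(t) END;
        FOR t := 0 TO nThreads - 1 DO ScanStack(t) END;
        FOR t := 0 TO nThreads - 1 DO Resume(t) END;
      END StopTheWorldGC;

    PROCEDURE OnTheFlyGC () =
      (* Threads are signalled one at a time; at most one is
         stopped at any moment, so there is no global pause. *)
      BEGIN
        FOR t := 0 TO nThreads - 1 DO
          Suspend(t); ScanStack(t); Resume(t);
        END;
      END OnTheFlyGC;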

> Or anything about it really, just that it is much ported and
> presumably well maintained, since it is part of gcc, as the
> collector for gcc's Java support.

Our collector is much ported and well-maintained too! ;-)  Seriously,  
I am quite sure we have no collector bugs currently (it has been  
running reliably for many years now).  The problems you are  
encountering are porting issues and not bugs in the collector.  Also,  
the current collector is written in Modula-3, and nicely integrated  
with the Modula-3 object model and run-time.   There is a *huge*  
amount to be said for eating your own dog-food by writing the  
collector in Modula-3.  There is *no* good reason to step outside  
Modula-3 to C as would be needed for Boehm GC.

> Hm... I assume the right data structure here is a multi-level array
> that you index by picking off progressively less significant bits of
> an address (or page number), just like the hardware page tables for
> virtual memory (but without the perf benefit of a TLB, and using
> physical addresses after the first level).
> When an entire level is in the same state, because a large span was
> skipped, you just share one entry for all of them -- a major savings.

Right.  I have this partially implemented already.
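
The shape of it is roughly as follows (a sketch, not what will be
checked in; the real thing sizes things differently and keeps the
structure in untraced storage).  The trick is that every top-level
slot covering a fully-unallocated span shares one leaf:

    TYPE
      State = {Unallocated, Free, Allocated};
      Leaf  = REF ARRAY OF State;

    CONST
      LeafSize = 4096;   (* pages per leaf *)
      TopSize  = 16384;  (* leaves; TopSize * LeafSize pages total *)

    VAR
      top: ARRAY [0 .. TopSize - 1] OF Leaf;
      uniform: Leaf;     (* one shared leaf for all-Unallocated spans *)

    PROCEDURE Init () =
      BEGIN
        uniform := NEW(Leaf, LeafSize);
        FOR j := 0 TO LeafSize - 1 DO
          uniform[j] := State.Unallocated
        END;
        (* Every slot starts out sharing the same leaf, so a huge hole
           in the address space costs one pointer per 4096 pages. *)
        FOR i := 0 TO TopSize - 1 DO top[i] := uniform END;
      END Init;

    PROCEDURE Get (page: INTEGER): State =
      (* DIV/MOD by a power of two is the "pick off bits" step. *)
      BEGIN
        RETURN top[page DIV LeafSize][page MOD LeafSize];
      END Get;

    PROCEDURE Set (page: INTEGER; s: State) =
      VAR i := page DIV LeafSize;
      BEGIN
        IF top[i] = uniform THEN
          top[i] := NEW(Leaf, LeafSize);  (* un-share before writing *)
          FOR j := 0 TO LeafSize - 1 DO
            top[i][j] := State.Unallocated
          END;
        END;
        top[i][page MOD LeafSize] := s;
      END Set;

At 8 KB pages this two-level version spans 512 GiB, enough to cover
the gap in your log, and the shared leaf means that gap costs almost
nothing.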

> Either that, or a binary-searched array where you only store entries
> for state changes.
> You'd use something like the STL's upper_bound/lower_bound/equal_range
> functions -- really the same thing as bsearch, except that when an
> entry isn't found, you return where it would be inserted, which is
> the last place you looked before giving up.

I'm concerned about the lookup costs for this.
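
To see why, compare the lookup paths.  Reusing State from the sketch
above, the run-encoded variant would be something like:

    TYPE
      Run = RECORD firstPage: INTEGER; state: State END;

    VAR
      runs: REF ARRAY OF Run;  (* sorted by firstPage, one entry per
                                  state change; invariant:
                                  runs[0].firstPage = 0 covers all *)
      nRuns: INTEGER;

    PROCEDURE Lookup (page: INTEGER): State =
      (* lower_bound-style: find the last run whose firstPage is
         <= page, i.e. the last place looked before giving up. *)
      VAR lo := 0; hi := nRuns;
      BEGIN
        WHILE lo + 1 < hi DO
          WITH mid = (lo + hi) DIV 2 DO
            IF runs[mid].firstPage <= page
              THEN lo := mid
              ELSE hi := mid
            END;
          END;
        END;
        RETURN runs[lo].state;
      END Lookup;

Get in the radix version is two array indexes; this is O(log n)
branching with data-dependent loads, and page-state lookups sit on
some very hot paths in the collector.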

> It might be worth confirming "that I'm not crazy" -- that mmap  
> behaves this way for other people, or that skimming its code  
> suggests it is not surprising. Could be related to how malloc works  
> also.

Are there hints that can be passed to mmap on that platform to cause  
less scattered mappings?  It is odd that sbrk is less scattered, since  
it must also bottom out at mmap, but perhaps it is trying to maintain  
a reasonably compact allocation below the "brk".
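
For what it's worth, the first argument to mmap is exactly such a
hint on POSIX systems: pass the address just past the previous region
and the kernel will try to place the new mapping there (without
MAP_FIXED it is free to ignore the hint).  A sketch of what
RTOS.GetMemory might do on AMD64_LINUX -- the EXTERNAL binding and
constants are written out by hand here; I have not checked what
m3core's Unix interfaces actually declare:

    (* In an UNSAFE INTERFACE somewhere: *)
    <*EXTERNAL mmap*>
    PROCEDURE mmap (addr: ADDRESS; len: INTEGER;
                    prot, flags, fd, off: INTEGER): ADDRESS;

    CONST  (* values from the Linux headers *)
      PROT_READ     = 16_1;
      PROT_WRITE    = 16_2;
      MAP_PRIVATE   = 16_02;
      MAP_ANONYMOUS = 16_20;

    VAR lastEnd: ADDRESS := NIL;  (* just past the last region *)

    PROCEDURE GetMemory (len: INTEGER): ADDRESS =
      (* Hint the kernel to extend the previous region rather than
         scatter mappings across the address space. *)
      VAR a := mmap (lastEnd, len, PROT_READ + PROT_WRITE,
                     MAP_PRIVATE + MAP_ANONYMOUS, -1, 0);
      BEGIN
        IF a # LOOPHOLE(-1, ADDRESS) THEN lastEnd := a + len END;
        RETURN a;
      END GetMemory;

That would also explain sbrk: it always extends the same region, so
it is compact by construction; the hint just asks mmap to imitate it.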

>
>
>  - Jay
>
>
> From: hosking at cs.purdue.edu
> To: jay.krell at cornell.edu
> Date: Sun, 16 Nov 2008 20:14:55 -0600
> CC: m3devel at elegosoft.com
> Subject: Re: [M3devel] duplicated code?
>
>
> On 16 Nov 2008, at 10:18, Jay wrote:
>
> Er, the large allocation does come from the gc itself.
>
> What's being allocated?  The side array should not be too huge.
>
> Tony, is there is an assumption that the heap is contiguous?
> That calls to RTOS.GetMemory return adjacent addresses?
>
> No assumption.
>
> This code allocates a large array:
>
>     IF desc = NIL OR newSideSpan # NUMBER(desc^) THEN
>       WITH newDesc = NEW(UNTRACED REF ARRAY OF Desc, newSideSpan) DO
>
> I'll know more shortly.
>
> I guess... it looks like the heap can be discontiguous, but we do
> record-keeping for what it all spans.
>
> Correct.
>
> The comments say:
> (* The array desc and the global variables p0, and p1 describe the
>    pages that are part of the traced heap.  Either p0 and p1 are
>    equal to Nil and no pages are allocated; or both are valid pages
>    and page p is allocated iff
> |          p0 <= p < p1
> |      AND desc[p - p0] != Unallocated
>
>
> Hm..
>
> Grow (0x44000) => 0x2b1e45256000   total: 1.5M
> GetUntracedOpenArray(0x1a80)
>    span: 6.6M   density: 24%
> stubgen: Processing RemoteView.T
> GetUntracedOpenArray(0x3f0)
> t1:0xc
> t2:0xa
> t3:0x1
> Grow (0x52000) => 0x2aaaaaaab000   total: 1.8M
> GetUntracedOpenArray(0x1ce69fc8)
>
> I have GetUntracedOpenArray printing how many bytes it is asked for.
> t1,t2,t3 are just the lengths of the strings being concatenated.
> Grow(x)=>y means Grow allocated x bytes at address y.
>
> So now, these two addresses 0x2aaaaaaab000 and 0x2b1e45256000 are
> very far apart -- over 400 gig.
> And it seems the heap allocator wants to allocate an array to  
> describe the pages.
>
> Ah, yes, that is most unfortunate.
>
>
> Hm. Page size is no longer tied to the underlying system -- no
> longer vm-tied gc.
> Perhaps blowing it up to, say, 1 meg will address this?
>
> I am working on minimizing the need for the global array, but we do  
> need something that can be easily indexed like this.  Perhaps we  
> need to pass a hint to RTOS.GetMemory that will try to allocate its  
> regions close together.
>
> But really, an array describing pages that span the results of
> separate memory allocations seems wrong.
> A sparser data structure would be good -- one that could describe
> arbitrarily sized runs of pages as being in the same state.
>
> Indeed.  As mentioned above, I am working to eliminate the need for  
> this.  Code that starts us on this path will be checked in within a  
> day or so.
>
>  - Jay
>
>
>
> From: jay.krell at cornell.edu
> To: m3devel at elegosoft.com
> Date: Sun, 16 Nov 2008 15:31:25 +0000
> Subject: [M3devel] duplicated code?
>
> Anyone want to clean up this duplication?
>
>
> D:\dev2\cm3.2>dir /s/b asttotype.m3  StubCode.m3
> D:\dev2\cm3.2\m3-comm\sharedobjgen\src\AstToType.m3
> D:\dev2\cm3.2\m3-comm\stubgen\src\AstToType.m3
> D:\dev2\cm3.2\m3-db\stablegen\src\AstToType.m3
> D:\dev2\cm3.2\m3-comm\sharedobjgen\src\StubCode.m3
> D:\dev2\cm3.2\m3-comm\stubgen\src\StubCode.m3
>
>
> Somewhere in there, AMD64_LINUX tries to allocate a lot of memory,
> and fails either there or soon thereafter.
> The garbage collector is working.
>
>  - Jay
>


