<html>
<head>
<style>
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 10pt;
font-family:Verdana
}
</style>
</head>
<body class='hmmessage'>
understood, ok.<BR>
<BR>
I don't know the algorithm for allocating pages.<BR>
<BR>
There is a "hint" parameter to mmap, though it isn't exactly clear how it is handled.<BR>
Probably passing the address of the block we already have plus its size as the hint is reasonable.<BR>
I can try that out at some point, though using sbrk seems ok.<BR>
<BR>
- Jay<BR><BR><BR>
<HR id=stopSpelling>
<BR>
CC: m3devel@elegosoft.com<BR>From: hosking@cs.purdue.edu<BR>To: jay.krell@cornell.edu<BR>Subject: Re: gc vs. large spans of memory<BR>Date: Mon, 17 Nov 2008 06:21:26 -0600<BR><BR><BR>
<DIV><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV style="WORD-WRAP: break-word"><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate"><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate"><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate"><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate"><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate"><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate"><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate"><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV>On 17 Nov 2008, at 03:35, Jay wrote:</DIV></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></DIV></SPAN></DIV>
<DIV><BR class=EC_Apple-interchange-newline>
<BLOCKQUOTE><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana">1meg pages weren't large enough to avoid running out of memory.</DIV></SPAN></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>Is this still the sidespan array problem?</DIV><BR>
<BLOCKQUOTE><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana">Any thoughts on using the boehm-gc?</DIV></SPAN></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>Hell no.</DIV><BR>
<BLOCKQUOTE><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana">I don't know if it is compacting.</DIV></SPAN></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>Indeed, Boehm's collector does not do compaction. Also, Boehm's is fully conservative ("ambiguous-roots + ambiguous heap"), though it is possible to configure it to use precise heap information (which CM3 does provide).</DIV><BR>
<BLOCKQUOTE><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana">Or even generational.</DIV></SPAN></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>It is generational, and has a parallel collection mode, but it is not concurrent like ours (we have a better chance of scaling on SMPs). Our collector will soon also be "on-the-fly", which means we won't need to have a stop-the-world phase where all the threads are stopped at the same time to initiate GC. Instead, we will simply signal threads one at a time to prepare for GC.</DIV><BR>
<BLOCKQUOTE><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana">Or anything about it really, just that it is much ported and presumably well maintained, since it is part of "gcc", part of the gcc Java support.</DIV></SPAN></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>Our collector is much ported and well-maintained too! ;-) Seriously, I am quite sure we have no collector bugs currently (it has been running reliably for many years now). The problems you are encountering are porting issues and not bugs in the collector. Also, the current collector is written in Modula-3, and nicely integrated with the Modula-3 object model and run-time. There is a *huge* amount to be said for eating your own dog-food by writing the collector in Modula-3. There is *no* good reason to step outside Modula-3 to C as would be needed for Boehm GC.</DIV><BR>
<BLOCKQUOTE><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana">Hm... I assume the right data structure here is a multi level array, that you index by picking off progressively less significant bits of an address (or page number) just like the hardware page tables for virtual memory (but without the perf benefits of a cache and using physical addresses after the first level).<BR>When an entire level is at the same state, due to skipping a large span, you just have one shared entry for all of them, major savings.</DIV></SPAN></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>Right. I have this partially implemented already.</DIV><BR>
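<DIV>For illustration, a minimal sketch of the multi-level idea, in Python rather than Modula-3. The names PageMap, PAGE_BITS and LEAF_BITS and the two state values are assumptions made up for the example, not anything from the CM3 runtime, and the top level is a dict only for brevity (a real implementation would more likely use fixed-size arrays at each level). A missing leaf stands for a whole run of unallocated pages, which is where the savings come from.</DIV>
<PRE>
```python
# Sketch only: a two-level page map, loosely modeled on hardware page tables.
# PAGE_BITS, LEAF_BITS and the state names are assumptions for illustration,
# not values taken from the CM3 collector.
PAGE_BITS = 16            # hypothetical 64 KiB pages
LEAF_BITS = 10            # each leaf table covers 1024 pages
LEAF_SIZE = 2 ** LEAF_BITS

UNALLOCATED = 0
ALLOCATED = 1


class PageMap:
    def __init__(self):
        # Top level is sparse: a leaf exists only where some page was set.
        self.leaves = {}

    def _split(self, addr):
        # Upper bits of the page number pick the leaf, lower bits the slot.
        page = addr >> PAGE_BITS
        return page >> LEAF_BITS, page % LEAF_SIZE

    def set_state(self, addr, state):
        top, low = self._split(addr)
        leaf = self.leaves.get(top)
        if leaf is None:
            leaf = bytearray(LEAF_SIZE)   # all pages start UNALLOCATED
            self.leaves[top] = leaf
        leaf[low] = state

    def get_state(self, addr):
        top, low = self._split(addr)
        leaf = self.leaves.get(top)
        # A missing leaf stands for a whole run of unallocated pages.
        return UNALLOCATED if leaf is None else leaf[low]


pm = PageMap()
pm.set_state(0x2AAAAAAAB000, ALLOCATED)
pm.set_state(0x2B1E45256000, ALLOCATED)
print(len(pm.leaves), "leaf tables in use")
```
</PRE>
<DIV>With the two widely separated addresses from the log, only the leaves that actually contain an allocated page are created, so the cost follows the number of regions rather than the distance between them.</DIV><BR>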
<BLOCKQUOTE><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana">Either that, or a binary searched array where you only store entries for state changes.<BR>You'd use like the STL's upper_bound/lower_bound/equal_range functions, that is, really the same thing as bsearch, but when an entry isn't found, you return where it would be inserted, which is like the last place you looked before giving up.</DIV></SPAN></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>I'm concerned about the lookup costs for this.</DIV><BR>
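<DIV>To make the trade-off concrete, here is a minimal sketch of the binary-searched alternative, in Python. The page numbers and state names are invented for the example. Python's bisect.bisect_right plays the role of the STL's upper_bound; each lookup costs a number of steps logarithmic in the count of state changes, as opposed to a single index operation with the dense array.</DIV>
<PRE>
```python
# Sketch only: a sorted list of state-change boundaries searched with bisect.
# The page numbers and state names are made up for illustration.
import bisect

UNALLOCATED = "unallocated"
ALLOCATED = "allocated"

# Each entry means: from this page number up to the next entry, pages have
# this state. The list must stay sorted by page number and start at page 0.
starts = [0, 100, 150, 5000000, 5000040]
states = [UNALLOCATED, ALLOCATED, UNALLOCATED, ALLOCATED, UNALLOCATED]


def page_state(page):
    # bisect_right returns the insertion point after any equal entry, so the
    # run containing `page` is the entry just before that point.
    # Assumes page is not negative.
    i = bisect.bisect_right(starts, page) - 1
    return states[i]


print(page_state(120))       # inside the first allocated run
print(page_state(4999999))   # in the large gap between runs
```
</PRE>
<DIV>The large gap between the two allocated runs costs one entry here, however many pages it spans.</DIV><BR>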
<BLOCKQUOTE><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana">It might be worth confirming "that I'm not crazy" -- that mmap behaves this way for other people, or that skimming its code suggests it is not surprising. Could be related to how malloc works also.</DIV></SPAN></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>Are there hints that can be passed to mmap on that platform to cause less scattered mappings? It is odd that sbrk is less scattered, since it must also bottom out at mmap, but perhaps it is trying to maintain a reasonably compact allocation below the "brk".</DIV><BR>
<BLOCKQUOTE><SPAN class=EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana"><BR> <BR> - Jay<BR><BR>
<HR id=EC_stopSpelling>
<BR>From:<SPAN class=EC_Apple-converted-space> </SPAN><A href="mailto:hosking@cs.purdue.edu">hosking@cs.purdue.edu</A><BR>To:<SPAN class=EC_Apple-converted-space> </SPAN><A href="mailto:jay.krell@cornell.edu">jay.krell@cornell.edu</A><BR>Date: Sun, 16 Nov 2008 20:14:55 -0600<BR>CC:<SPAN class=EC_Apple-converted-space> </SPAN><A href="mailto:m3devel@elegosoft.com">m3devel@elegosoft.com</A><BR>Subject: Re: [M3devel] duplicated code?<BR><BR><BR>
<DIV><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV style="WORD-WRAP: break-word"><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate"><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate"><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate"><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate"><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate"><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate"><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate"><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV>On 16 Nov 2008, at 10:18, Jay wrote:</DIV></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></DIV></SPAN></DIV>
<DIV><BR class=EC_EC_Apple-interchange-newline>
<BLOCKQUOTE><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana">Er, the large allocation does come from the gc itself.</DIV></SPAN></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>What's being allocated? The side array should not be too huge.</DIV><BR>
<BLOCKQUOTE><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana">Tony, is there an assumption that the heap is contiguous?<BR>That is, that calls to RTOS.GetMemory return adjacent addresses?</DIV></SPAN></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>No assumption.</DIV><BR>
<BLOCKQUOTE><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana">This code allocates a large array:<BR> <BR> IF desc = NIL OR newSideSpan # NUMBER(desc^) THEN<BR> WITH newDesc = NEW(UNTRACED REF ARRAY OF Desc, newSideSpan) DO<BR><BR>I'll know more shortly.<BR> <BR>I guess..it looks like the heap can be discontiguous, but<BR>we do record keeping for what it all spans.</DIV></SPAN></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>Correct.</DIV><BR>
<BLOCKQUOTE><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana">The comments say:<BR>(* The array desc and the global variables p0, and p1 describe the pages<BR> that are part of the traced heap. Either p0 and p1 are equal to Nil and<BR> no pages are allocated; or both are valid pages and page p is allocated<BR> iff<BR>| p0 &lt;= p &lt; p1<BR>| AND desc[p - p0] != Unallocated<BR><BR> <BR>Hm..<BR> <BR>Grow (0x44000) => 0x2b1e45256000 total: 1.5M<BR>GetUntracedOpenArray(0x1a80)<BR> span: 6.6M density: 24%<BR>stubgen: Processing RemoteView.T<BR>GetUntracedOpenArray(0x3f0)<BR>t1:0xc<BR>t2:0xa<BR>t3:0x1<BR>Grow (0x52000) => 0x2aaaaaaab000 total: 1.8M<BR>GetUntracedOpenArray(0x1ce69fc8)<BR> <BR>I have GetUntracedOpenArray<STRONG><SPAN class=EC_EC_Apple-converted-space> </SPAN></STRONG>printing how many bytes it is asked for.<BR>t1,t2,t3 are just the lengths of the strings being concatenated.<BR>Grow(x)=>y means Grow allocated x bytes at address y.<BR> <BR>So now, these two addresses 0x2aaaaaaab000 and 0x2b1e45256000 are very far apart, roughly 460gig.<BR>And it seems the heap allocator wants to allocate an array to describe the pages.</DIV></SPAN></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>Ah, yes, that is most unfortunate.</DIV>
<DIV><BR></DIV><BR>
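<DIV>The arithmetic behind this can be sketched in a few lines of Python. The two addresses are the ones in the log above; PAGE_SIZE and DESC_SIZE are placeholder assumptions, not the collector's actual values, and the length of the upper region is ignored for simplicity.</DIV>
<PRE>
```python
# Sketch only: how large a dense per-page array gets when two heap regions
# are far apart. The two addresses come from the log above; PAGE_SIZE and
# DESC_SIZE are assumptions, not the collector's real values.
lo_addr = 0x2AAAAAAAB000
hi_addr = 0x2B1E45256000

PAGE_SIZE = 64 * 1024     # hypothetical page size in bytes
DESC_SIZE = 1             # hypothetical bytes per page descriptor

# Distance between the region starts (the upper region's length is ignored).
span_bytes = hi_addr - lo_addr
span_pages = span_bytes // PAGE_SIZE
dense_bytes = span_pages * DESC_SIZE

print("span: %.1f GiB" % (span_bytes / 2 ** 30))
print("dense array: %.1f MiB" % (dense_bytes / 2 ** 20))
```
</PRE>
<DIV>Because the array is indexed by p - p0, its size tracks the distance between the lowest and highest page, not the amount of memory actually in use.</DIV><BR>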
<BLOCKQUOTE><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana">Hm. Page size is no longer tied to the underlying system -- no longer vm-tied gc.<BR>Perhaps blowing it up to, say, 1meg will address this?</DIV></SPAN></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>I am working on minimizing the need for the global array, but we do need something that can be easily indexed like this. Perhaps we need to pass a hint to RTOS.GetMemory that will try to allocate its regions close together.</DIV><BR>
<BLOCKQUOTE><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana">But really, an array to describe pages spanning the results of separate memory allocations, seems wrong.<BR>A sparser data structure would be good, that could describe arbitrary sized runs of pages as being in the same state.</DIV></SPAN></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>Indeed. As mentioned above, I am working to eliminate the need for this. Code that starts us on this path will be checked in within a day or so.</DIV><BR>
<BLOCKQUOTE><SPAN class=EC_EC_Apple-style-span style="WORD-SPACING: 0px; FONT: 12px Helvetica; TEXT-TRANSFORM: none; COLOR: rgb(0,0,0); TEXT-INDENT: 0px; WHITE-SPACE: normal; LETTER-SPACING: normal; BORDER-COLLAPSE: separate">
<DIV class=EC_EC_hmmessage style="FONT-SIZE: 10pt; FONT-FAMILY: Verdana"> - Jay<BR><BR><BR>
<HR id=EC_EC_stopSpelling>
<BR>From:<SPAN class=EC_EC_Apple-converted-space> </SPAN><A href="mailto:jay.krell@cornell.edu">jay.krell@cornell.edu</A><BR>To:<SPAN class=EC_EC_Apple-converted-space> </SPAN><A href="mailto:m3devel@elegosoft.com">m3devel@elegosoft.com</A><BR>Date: Sun, 16 Nov 2008 15:31:25 +0000<BR>Subject: [M3devel] duplicated code?<BR><BR>Anyone want to clean up this duplication?<BR> <BR> <BR>D:\dev2\cm3.2>dir /s/b asttotype.m3 StubCode.m3<BR>D:\dev2\cm3.2\m3-comm\sharedobjgen\src\AstToType.m3<BR>D:\dev2\cm3.2\m3-comm\stubgen\src\AstToType.m3<BR>D:\dev2\cm3.2\m3-db\stablegen\src\AstToType.m3<BR>D:\dev2\cm3.2\m3-comm\sharedobjgen\src\StubCode.m3<BR>D:\dev2\cm3.2\m3-comm\stubgen\src\StubCode.m3<BR><BR> <BR>Somewhere in there, AMD64_LINUX tries to allocate a lot of memory, and fails<BR>either there or soon thereafter.<BR>The garbage collector is working.<BR> <BR> - Jay<BR><BR><BR><BR><BR><BR><BR></DIV></SPAN></BLOCKQUOTE></DIV><BR></DIV></SPAN><BR class=EC_Apple-interchange-newline></BLOCKQUOTE></DIV><BR></body>
</html>