[M3devel] explanation of CheckLoadTracedRef?

Tony Hosking hosking at cs.purdue.edu
Wed Apr 22 02:02:52 CEST 2009


On 21 Apr 2009, at 19:37, Jay wrote:

> > > How bad/unportable was it the previous way, the VM-synchronized way?
> >
> > Not compatible with system threading.
>
>
> Really? NT386 wasn't VM-synchronized back in 3.6, 4.1, etc.?

It was, but every system call had to be wrapped with a call to acquire  
the global heap lock!

> Or only with great cost?

Yes, taking the lock was expensive and prevented scaling on multi-cores.
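
To illustrate why that did not scale, here is a minimal sketch (a toy
simulation, not the old runtime code). The names heap_lock,
with_heap_lock and fake_syscall are hypothetical. Every wrapped call
runs only while holding one global lock, so however many threads issue
calls, at most one is ever inside a call at a time.

```python
import threading
import time

heap_lock = threading.Lock()     # hypothetical stand-in for the global heap lock
counter_lock = threading.Lock()  # protects the bookkeeping counters below
in_flight = 0                    # wrapped calls currently executing
max_in_flight = 0                # highest value in_flight ever reached


def with_heap_lock(call):
    """Wrap a call so that it only runs while holding the global lock."""
    def wrapper(*args):
        with heap_lock:
            return call(*args)
    return wrapper


@with_heap_lock
def fake_syscall(duration):
    """Placeholder for a system call; it just sleeps for `duration` seconds."""
    global in_flight, max_in_flight
    with counter_lock:
        in_flight += 1
        max_in_flight = max(max_in_flight, in_flight)
    time.sleep(duration)
    with counter_lock:
        in_flight -= 1


def run_threads(n=4, duration=0.01):
    """Issue n wrapped calls from n threads; report the peak concurrency seen."""
    threads = [threading.Thread(target=fake_syscall, args=(duration,))
               for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max_in_flight
```

Because the wrapper serializes every call, run_threads() reports a peak
concurrency of 1 regardless of the number of threads or cores.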

> I have to admit, those old import .libs, kernel32.lib, etc. I didn't  
> realize what was in them when I deleted them. I thought they were  
> just regular import .libs.

There was also the work to wrap all dll symbols to acquire the lock.

> I think I got lucky in that the overlap between me deleting them  
> and you removing VM-synchronized GC was small or zero.

You should have seen the mess it was before...  ;-)

> > You should *never* access a field in the heap in C code! All
> > accesses to traced fields in the heap must take place in Modula-3.
> > Otherwise things will break! C wrappers should not do anything other
> > than forward calls to C library calls. They should not perform heap
> > accesses.
>
>
> Ok, that makes sense.
> An important "out" is that accessing the stack is always ok.

Yes, so long as a reference to an object is held on the stack, it is
safe to pass an address within it to external calls.  Thus, many C
functions can take VAR arguments that may end up being references to
the fields of objects.  The compiler injects the necessary
CheckLoad/CheckStore operations when passing VAR parameters, etc., and
the GC maintains the invariant that all stack-referenced objects don't
move, stay black (for incremental GC), and remain dirty (for
generational GC).

> But this is a requirement I didn't keep in mind.
> Now, luckily, the C wrappers are all relatively thin and not difficult
> to re-review in their entirety.

Right.  I think I did see you compute and pass an offset to C code,  
but that may have only been in code you e-mailed me for perusal rather  
than code that got checked in.  Might be worth reviewing...

> But, take for example "open".
> The first parameter to it is bound to be in the heap.

The ambiguous roots garbage collector we use maintains the invariant  
that pointers to the interior of heap objects from the stack *pin*  
that object in the heap: it will not move while the pointer from the  
stack exists, and invariants will be maintained so that its contents  
can be manipulated safely even in the face of incremental and  
generational GC.
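
A minimal sketch of the pinning idea, assuming a toy heap in which
every object occupies a range of integer addresses. HeapObject,
pinned_objects and relocate are hypothetical names, not part of the M3
runtime. Any stack word that falls inside an object's range is treated
as a possible pointer (an ambiguous root), and that object is left
where it is. For simplicity the sketch relocates every unpinned object
and ignores reachability.

```python
from dataclasses import dataclass


@dataclass
class HeapObject:
    name: str
    start: int  # first address occupied by the object
    size: int   # number of addressable units it occupies

    def contains(self, address):
        return self.start <= address < self.start + self.size


def pinned_objects(heap, stack_words):
    """Names of objects that some stack word points at or into."""
    pinned = set()
    for word in stack_words:
        for obj in heap:
            if obj.contains(word):
                pinned.add(obj.name)
    return pinned


def relocate(heap, stack_words, to_space_base):
    """Move every unpinned object to a new address; leave pinned ones in place."""
    pinned = pinned_objects(heap, stack_words)
    next_free = to_space_base
    for obj in heap:
        if obj.name not in pinned:
            obj.start = next_free
            next_free += obj.size
    return pinned


# Example: 108 is an interior pointer into path_buffer; 7 is just an integer.
heap = [HeapObject("path_buffer", 100, 16), HeapObject("other", 200, 8)]
pinned = relocate(heap, stack_words=[108, 7], to_space_base=1000)
```

In the example, path_buffer keeps its address because the stack holds a
word pointing into it, which is the situation a call like "open" relies
on, while the other object is free to move.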

> But probably it is untraced or somehow ok, since it does
> come from a module used primarily for C interop.

Certainly, C code should never be manipulating the *traced* fields of  
traced heap objects.  It is fine for it to manipulate the untraced  
fields of traced heap objects.

> And, the line between C wrappers and the "C library" that they  
> forward to does not exist.
> If I, say, pass a VAR to a VAR..no check is made?

Not sure what you mean by this.  Any call that passes traced VAR  
params will generate a check as necessary before the call.

> Important to declare extern/C functions as taking UNTRACED REF and  
> not VAR?

No, VAR is fine, so long as the VAR value being passed is not traced.

> > I think you are confusing incremental and incremental GC.
> You assume I understand more than I do (I assume you have a typo. :))

;-)

> "generational" -- the concept that most objects die young.
>   (aka most objects could have been allocated on the stack...)

Not quite.  The idea is that the likelihood of an object dying is a  
function of its age.  There is a *weak* generational hypothesis that  
"most objects die young", and a *strong* generational hypothesis that  
"the older an object is the less likely it is to die".  Many (but not  
all) programs support these hypotheses, which permits generational GC  
to focus effort where it is likely to be profitable (i.e., to free up  
a lot of space).

> But does that imply detailed implementation choices, or is just like  
> a "guiding principle"?
> I guess it implies the heap is split into at least two generations,  
> old and young.
> Though I guess in reality there is a range of young, less young,  
> lesser young, least young, etc.

Right, many different collectors have exploited age in this way.  For  
the M3 collector we have just two generations: old and young.

> And that as objects age, they should be moved from young to old  
> heaps, and references to them either updated right away, or "caught"  
> upon use and updated then...something like that.

Old-space objects are "clean" if they contain no references to young- 
space objects.  The Modula-3 compiler injects checks to make sure that  
whenever we store a reference into a clean old-space object it is  
marked "dirty".  When a young-space collection occurs we must process  
the references in dirty old-space objects as roots into the young-space.
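
The dirty-marking check can be sketched roughly as follows. This is a
toy model, not the code the compiler emits: Obj, store_ref and
young_collection_roots are hypothetical names, and the real barrier
works on heap pages and object headers rather than Python attributes.

```python
class Obj:
    def __init__(self, name, generation):
        self.name = name
        self.generation = generation  # "old" or "young"
        self.fields = {}              # field name -> referenced Obj
        self.dirty = False            # only meaningful for old-space objects


def store_ref(target, field, value):
    """Store a reference, running the check the compiler would inject."""
    if target.generation == "old" and not target.dirty:
        target.dirty = True           # clean old-space object becomes dirty
    target.fields[field] = value


def young_collection_roots(old_space, stack_roots):
    """Roots for a young-space collection: young objects referenced from the
    stack, plus young objects referenced from dirty old-space objects."""
    roots = [r for r in stack_roots if r.generation == "young"]
    for obj in old_space:
        if obj.dirty:
            roots.extend(v for v in obj.fields.values()
                         if v.generation == "young")
    return roots


old_a = Obj("old_a", "old")
old_b = Obj("old_b", "old")
young = Obj("young", "young")
store_ref(old_a, "child", young)      # old_a is marked dirty by the check
roots = young_collection_roots([old_a, old_b], stack_roots=[])
```

Only old_a is scanned during the young-space collection; old_b stays
clean and is skipped, which is where the saving comes from.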

> "incremental" -- don't pause the world..

Not quite.  The opposite of stop-the-world (stopping all the mutator  
threads to process their stacks) is on-the-fly.  Incremental refers to  
the ability to interleave GC work with mutator work.  If the GC work  
can be interleaved with mutator threads without stopping the mutator  
threads at each increment then the collector is said to be concurrent  
(GC work proceeds concurrently with mutator work).
The current M3 collector has a stop-the-world non-moving phase,  
followed by a concurrent copying phase.  I have some incomplete work  
that will also make the M3 collector on-the-fly (no STW phase) and  
parallel (multiple GC threads can operate concurrently).
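
As a rough illustration of "interleaving GC work with mutator work",
here is a toy incremental marker. It is a sketch under stated
assumptions, not the M3 collector: Node and IncrementalMarker are
hypothetical names, and the store barrier shown (shade the newly stored
target) is one textbook choice used here for simplicity, not
necessarily what CheckLoad/CheckStore do.

```python
from collections import deque


class Node:
    def __init__(self, name):
        self.name = name
        self.refs = []
        self.color = "white"  # white: unvisited, grey: queued, black: scanned


class IncrementalMarker:
    def __init__(self, roots):
        self.grey = deque()
        for root in roots:
            self.shade(root)

    def shade(self, node):
        """Queue a white node for scanning."""
        if node.color == "white":
            node.color = "grey"
            self.grey.append(node)

    def step(self, budget):
        """Scan at most `budget` grey nodes; return True once marking is done."""
        for _ in range(budget):
            if not self.grey:
                break
            node = self.grey.popleft()
            for child in node.refs:
                self.shade(child)
            node.color = "black"
        return not self.grey

    def write(self, source, target):
        """Mutator store with a barrier: shade the target so that an already
        scanned (black) source never hides an unvisited (white) object."""
        source.refs.append(target)
        self.shade(target)


a, b, c, d = Node("a"), Node("b"), Node("c"), Node("d")
a.refs.append(b)
marker = IncrementalMarker([a])
finished_early = marker.step(1)  # one increment of GC work
marker.write(a, c)               # the mutator runs between increments
while not marker.step(1):        # remaining increments
    pass
colors = [n.color for n in (a, b, c, d)]
```

Each call to step() is one increment; the mutator's write() happens
between increments. Node d is never reachable and stays white, so it
would be reclaimed.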

Hope all this helps!
