[M3devel] explanation of CheckLoadTracedRef?

Tony Hosking hosking at cs.purdue.edu
Wed Apr 22 04:37:36 CEST 2009


I believe there is just one check when the parameter is first passed.

On 22 Apr 2009, at 10:54, Jay wrote:

> Yes this helps, thank you. Maybe check it in under doc? One thing in
> my wording that you didn't understand, and seemed to contradict, is
> "VAR to VAR".
>
>
> PROCEDURE F1() =
> VAR i: INTEGER;
> BEGIN
>  F2(i);
> END F1;
>
>
> PROCEDURE F2(VAR a:INTEGER)=
> BEGIN
>  F3(a);
> END F2;
>
>
> PROCEDURE F3(VAR a:INTEGER)=
> BEGIN
>  a := 1;
> END F3;
>
>
> Where are the calls to CheckLoad/StoreTracedRef?
>   (Duh, I can try it out..)
> It seems redundant to put them "everywhere".
> Esp. it seems that F2 shouldn't have to do anything.
>
>
> But throw in that F3 might be extern and that is unclear.
> So I just made up a suggestion that VAR is not good to pass to C code.
> That checks are inserted when 1) a pointer is dereferenced -- loaded
> or stored in Modula-3 or 2) a
> pointer becomes untraced (var becomes untraced ref, among others).
>
>
>  > Right.  I think I did see you compute and pass an offset to C code
>
>
> I had something like that very recently but think I didn't set it or  
> commit it.
> I had something like:
>
>
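> #include <stddef.h>  /* size_t */
> #include <stdio.h>   /* printf */
>
> size_t offset;  /* byte distance from the root to the interior field */
>
>
> void Init(Mutex* root, int* interior)
> {
>   offset = (size_t)interior - (size_t)root;
> }
>
>
> void DoSomething(Mutex* anotherRoot)
> {
>   int* interior = (int*)(offset + (size_t)anotherRoot);
>   printf("%d\n", *interior);
> }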
>
>
> but I came up with a way to avoid that I think, and then rolled it  
> all back or something anyway.
>
>
>
>  - Jay
>
> CC: m3devel at elegosoft.com
> From: hosking at cs.purdue.edu
> To: jay.krell at cornell.edu
> Subject: Re: [M3devel] explanation of CheckLoadTracedRef?
> Date: Wed, 22 Apr 2009 10:02:52 +1000
>
> On 21 Apr 2009, at 19:37, Jay wrote:
>
> > > How bad/unportable was it the previous way, the VM-synchronized way?
> >
> > Not compatible with system threading.
>
>
> Really? NT386 wasn't VM-synchronized back in 3.6, 4.1, etc.?
>
> It was, but every system call had to be wrapped with a call to  
> acquire the global heap lock!
>
> Or only with great cost?
>
> Yes, taking the lock was expensive and prevented scaling on
> multi-cores.
>
> I have to admit, those old import .libs, kernel32.lib, etc. I didn't  
> realize what was in them when I deleted them. I thought they were  
> just regular import .libs.
>
> There was also the work to wrap all dll symbols to acquire the lock.
>
> I think I got lucky in that the overlap between me deleting them
> and you removing VM-synchronized GC was small or zero.
>
> You should have seen the mess it was before...  ;-)
>
> > You should *never* access a field in the heap in C code! All
> > accesses to traced fields in the heap must take place in Modula-3.
> > Otherwise things will break! C wrappers should not do anything other
> > than forward calls to C library calls. They should not perform heap
> > accesses.
>
>
> Ok, that makes sense.
> The important "out" is that accessing the stack is always ok.
>
> Yes, so long as a reference to an object is held on the stack then
> it is safe to pass an address within it to external calls.  Thus,
> many C functions can take VAR arguments that may end up being
> references to the fields of objects.  The compiler injects the
> necessary CheckLoad/CheckStore operations when passing VAR
> parameters, etc., and the GC maintains the invariant that all
> stack-referenced objects don't move, stay black (for incremental
> GC), and remain dirty (for generational GC).
>
> But this is a requirement I didn't keep in mind.
> Now, luckily, the C wrappers are all relatively thin and not difficult
> to re-review in their entirety.
>
> Right.  I think I did see you compute and pass an offset to C code,  
> but that may have only been in code you e-mailed me for perusal  
> rather than code that got checked in.  Might be worth reviewing...
>
> But, take for example "open".
> The first parameter to it is bound to be in the heap.
>
> The ambiguous roots garbage collector we use maintains the invariant  
> that pointers to the interior of heap objects from the stack *pin*  
> that object in the heap: it will not move while the pointer from the  
> stack exists, and invariants will be maintained so that its contents  
> can be manipulated safely even in the face of incremental and  
> generational GC.
>
> But probably it is untraced or somehow ok, since it does
> come from a module used primarily for C interop.
>
> Certainly, C code should never be manipulating the *traced* fields  
> of traced heap objects.  It is fine for it to manipulate the  
> untraced fields of traced heap objects.
>
> And, the line between C wrappers and the "C library" that they
> forward to does not exist.
> If I, say, pass a VAR to a VAR..no check is made?
>
> Not sure what you mean by this.  Any call that passes traced VAR  
> params will generate a check as necessary before the call.
>
> Important to declare extern/C functions as taking UNTRACED REF and  
> not VAR?
>
> No, VAR is fine.  So long as the VAR value being passed is not traced.
>
> > I think you are confusing incremental and incremental GC.
> You assume I understand more than I do (I assume you have a typo. :))
>
> ;-)
>
> "generational" -- the concept that most objects die young.
>   (aka most objects could have been allocated on the stack...)
>
> Not quite.  The idea is that the likelihood of an object dying is a  
> function of its age.  There is a *weak* generational hypothesis that  
> "most objects die young", and a *strong* generational hypothesis  
> that "the older an object is the less likely it is to die".  Many  
> (but not all) programs support these hypotheses, which permits  
> generational GC to focus effort where it is likely to be profitable  
> (i.e., to free up a lot of space).
>
> But does that imply detailed implementation choices, or is just like  
> a "guiding principle"?
> I guess it implies the heap is split into at least two generations,  
> old and young.
> Though I guess in reality there is a range of young, less young,  
> lesser young, least young, etc.
>
> Right, many different collectors have exploited age in this way.   
> For the M3 collector we have just two generations: old and young.
>
> And that as objects age, they should be moved from young to old  
> heaps, and references to them either updated right away, or "caught"  
> upon use and updated then...something like that.
>
> Old-space objects are "clean" if they contain no references to
> young-space objects.  The Modula-3 compiler injects checks to make sure  
> that whenever we store a reference into a clean old-space object it  
> is marked "dirty".  When a young-space collection occurs we must  
> process the references in dirty old-space objects as roots into the  
> young-space.
>
> "incremental" -- don't pause the world..
>
> Not quite.  The opposite of stop-the-world (stopping all the mutator  
> threads to process their stacks) is on-the-fly.  Incremental refers  
> to the ability to interleave GC work with mutator work.  If the GC  
> work can be interleaved with mutator threads without stopping the  
> mutator threads at each increment then the collector is said to be  
> concurrent (GC work proceeds concurrently with mutator work).
> The current M3 collector has a stop-the-world non-moving phase,  
> followed by a concurrent copying phase.  I have some incomplete  
> work that will also make the M3 collector on-the-fly (no STW phase)  
> and parallel (multiple GC threads can operate concurrently).
>
> Hope all this helps!
>
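The pinning invariant described above (a pointer from the stack into the interior of a heap object keeps that object from moving) can be illustrated with a rough sketch in C. This is only an illustration under stated assumptions: HeapObject and pin_from_stack are hypothetical names made up here, and the real Modula-3 runtime works on heap pages with its own data structures rather than a flat array of objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object descriptor; NOT the Modula-3 runtime's layout. */
typedef struct {
    uintptr_t start;  /* address of the first byte of the object */
    size_t size;      /* object size in bytes */
    bool pinned;      /* set when an ambiguous root points into the object */
} HeapObject;

/* Treat every stack word as a possible pointer (an "ambiguous root").
   A word that falls anywhere inside an object, not only at its start,
   pins that object. Returns the number of newly pinned objects. */
static size_t pin_from_stack(const uintptr_t *stack, size_t n_words,
                             HeapObject *objects, size_t n_objects)
{
    size_t pinned_count = 0;
    assert(stack != NULL || n_words == 0);
    for (size_t i = 0; i < n_words; i++) {
        for (size_t j = 0; j < n_objects; j++) {
            HeapObject *o = &objects[j];
            bool inside = stack[i] >= o->start
                       && stack[i] - o->start < o->size;
            if (inside && !o->pinned) {
                o->pinned = true;
                pinned_count++;
            }
        }
    }
    return pinned_count;
}
```

With an object occupying addresses 1000..1063, a stack word holding 1016 pins it, while a word holding the one-past-the-end address of another object pins nothing. A copying collector built along these lines would skip pinned objects when relocating.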
>
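The clean/dirty rule for old-space objects described above can likewise be sketched in C. Again this is a hedged illustration, not the compiler's actual CheckStore code: Object, store_ref, and collect_dirty_roots are hypothetical names, and each object is assumed to have a single traced reference field to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { YOUNG, OLD } Generation;

/* Hypothetical object with one traced reference field. */
typedef struct Object {
    Generation gen;
    bool dirty;            /* only meaningful for OLD objects */
    struct Object *field;  /* the single traced reference */
} Object;

/* Store barrier: before a reference is stored into a clean old-space
   object, mark that object dirty, then perform the store. */
static void store_ref(Object *target, Object *value)
{
    assert(target != NULL);
    if (target->gen == OLD && !target->dirty) {
        target->dirty = true;
    }
    target->field = value;
}

/* At a young-space collection, references held by dirty old-space
   objects are treated as roots into the young space. Writes those
   roots to roots_out (capacity >= n) and returns how many there are. */
static size_t collect_dirty_roots(Object **all, size_t n, Object **roots_out)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        Object *o = all[i];
        if (o->gen == OLD && o->dirty
            && o->field != NULL && o->field->gen == YOUNG) {
            roots_out[k++] = o->field;
        }
    }
    return k;
}
```

The point of the barrier is that a young-space collection only has to look at dirty old-space objects instead of scanning the whole old space; clean old objects are known not to refer to young ones.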
