[M3devel] Juno/Thread/Win32 notes

Jay K jay.krell at cornell.edu
Thu Oct 22 07:26:03 CEST 2009


 > It seems like we should have Mutex and RecursiveMutex, have either work with Condition


Understood that Wait would have to assert that RecursiveMutex is only locked to depth 1 or somesuch.

 

 - Jay
 


From: jay.krell at cornell.edu
To: hosking at cs.purdue.edu
Date: Thu, 22 Oct 2009 05:21:28 +0000
CC: m3devel at elegosoft.com
Subject: Re: [M3devel] Juno/Thread/Win32 notes




 > What are the Misc/misc calls?
 
Trestle stuff.
Search in m3-ui.
It is in "Misc" that Juno calls .Signal(untilDone).
Sometimes the entire "misc" call is skipped, or maybe occurs fewer times. That is when it hangs.
I'm pretty sure.
 
 > The corruption comes outside of GC.  It's just that the GC discovers it.
 
Agreed.
 
 > The only reason to have it in the CV is to achieve the optimization,
 > so you know to which mutex queue we should transfer the signalled/broadcast thread. 
 
Of course. I should have realized that.
Um..the data is available anyway though, isn't it?
The thread had to call wait(mutex, condition) in order for signal(condition) to unlink him from the condition's waiters. So the waiting thread can store the mutex in itself (he can only be waiting on one condition variable at a time), and the signaling thread can grab it from there.
I think.
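
To illustrate the bookkeeping described above, here is a minimal sketch in Python. It is only a model of the queues, not ThreadWin32.m3: nothing actually blocks, and the names Waiter, wait_enqueue and signal are hypothetical. It assumes each mutex keeps its own wait queue, as ours does.

```python
from collections import deque

class Waiter:
    """Stands in for a thread; records the mutex it gave up in wait()."""
    def __init__(self, name):
        self.name = name
        self.waiting_mutex = None

class Mutex:
    def __init__(self):
        self.queue = deque()      # threads waiting to acquire this mutex

class Condition:
    def __init__(self):
        self.waiters = deque()    # threads waiting on this condition

def wait_enqueue(thread, mutex, cond):
    # Bookkeeping half of wait(mutex, cond): remember the mutex in the thread.
    thread.waiting_mutex = mutex
    cond.waiters.append(thread)

def signal(cond):
    # Unlink one waiter and move it straight to the wait queue of the mutex
    # it recorded, so the condition itself never needs to know the mutex.
    if not cond.waiters:
        return None
    thread = cond.waiters.popleft()
    mutex, thread.waiting_mutex = thread.waiting_mutex, None
    mutex.queue.append(thread)
    return thread
```

A thread is on at most one condition's list at a time, so a single field per thread is enough.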
 
 
In either case, I'm not too inclined to touch it for this release until we are sure there is a bug, and that looks unlikely at this point. We could try putting back your BroadcastHeap change?
 
 
It also seems a little gross that the heap lock is a special case recursive mutex.
It seems like we should have Mutex and RecursiveMutex, have either work with Condition, and have LockHeap/UnlockHeap/BroadcastHeap just be wrappers around that.
 
 
More generally we should probably try to embellish Thread.i3 with more primitives, I think. At least "interlocked" and "once", if not "event" and "semaphore".
"once" could definitely see a fair amount of use.
The rest maybe not, maybe they are too primitive and hard to get good use out of, hard to define portably/efficiently.
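
For concreteness, a minimal sketch of what a "once" primitive could look like, written in Python. The class name Once and the method do are assumptions for illustration; nothing like this exists in Thread.i3 today.

```python
import threading

class Once:
    """Run an initializer at most once, even with concurrent callers."""
    def __init__(self):
        self._lock = threading.Lock()
        self._done = False

    def do(self, fn):
        if self._done:                 # fast path once initialization finished
            return
        with self._lock:
            if not self._done:         # re-check under the lock
                fn()
                self._done = True      # set only after fn returned normally
```

If fn raises, the flag stays clear and a later caller retries; whether that is the right behavior is a design choice.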
 
 
Or maybe LockHeap/UnlockHeap recursive support could/should be removed and just stick with Mutex and Condition and nothing else?
 
 
Well, reader/writer locks are nice.
 
 
The "Little Book of Semaphores" seems to suggest ideas like "turnstile", "rendezvous" and maybe others, though I'm not sure they are actually easy to think about (I don't seem able to keep up with the author :( ), and possibly a generalization of reader/writer, like an n/m/o lock where you have x classes of code and different numbers of them are allowed to have the lock. Reader/writer is 2 classes with unlimited/1 access. 
There is a case of read/insert/delete that can scale a bit better than reader/writer.
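
One possible reading of the generalized lock, sketched in Python. This is an interpretation, not something taken from the book: it assumes only one class may hold the lock at a time, and that each class has its own cap on concurrent holders (at least 1, or None for unlimited). ClassLock and its methods are hypothetical names, and the sketch makes no attempt at fairness.

```python
import threading

class ClassLock:
    def __init__(self, limits):
        self._limits = dict(limits)    # class name -> max holders, None = unlimited
        self._cv = threading.Condition()
        self._active = None            # class currently holding the lock
        self._count = 0                # number of holders of that class

    def _can_enter(self, cls):
        if self._count == 0:
            return True
        limit = self._limits[cls]
        return self._active == cls and (limit is None or self._count < limit)

    def acquire(self, cls, blocking=True):
        with self._cv:
            while not self._can_enter(cls):
                if not blocking:
                    return False
                self._cv.wait()
            self._active = cls
            self._count += 1
            return True

    def release(self, cls):
        with self._cv:
            if self._active != cls or self._count == 0:
                raise RuntimeError("release without matching acquire")
            self._count -= 1
            if self._count == 0:
                self._active = None
            self._cv.notify_all()

# Reader/writer is the two-class case with unlimited/1 access.
rw = ClassLock({"read": None, "write": 1})
```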
 
And maybe a "thread local" mechanism?
 
 
 - Jay
 


CC: m3devel at elegosoft.com
From: hosking at cs.purdue.edu
To: jay.krell at cornell.edu
Subject: Re: [M3devel] Juno/Thread/Win32 notes
Date: Wed, 21 Oct 2009 21:56:21 -0400





On 21 Oct 2009, at 17:02, Jay K wrote:


ThreadWin32.m3 almost exactly matches Birrell's design.
The order of two unlocks is reversed. It probably doesn't matter.
He says LockMutex/UnlockMutex are just P/V. Ours is a bit different.
We have queueing on our locks which appears unnecessary, but maybe
it has in mind an optimization mentioned but not shown by Birrell:
that of Signal with a lock held transferring a thread right
to the mutex's wait list.



The queueing on mutexes does appear unnecessary, unless the optimization can be achieved.


 Birrell also has one mutex per condition variable but that
seems to maybe be incidental and not architectural.



The only reason to have it in the CV is to achieve the optimization, so you know to which mutex queue we should transfer the signalled/broadcast thread.


 I believe there is much room for performance improvement
here, but if it is correct, it is ok for this release.
As well, I believe lock/conditionvariables can be made to work
in threads not created by the Modula-3 runtime, at least with
a dependency on NT4 or Win2000 (QueueUserAPC to trigger alert).
(owner = GetCurrentThreadId instead of T).



Sure.


I put in a bunch of RTIO in Juno.
Whenever it hangs, the Signal call doesn't actually occur.
We have to try to figure out why.
It is something in the Misc/misc calls.



What are the Misc/misc calls?



 So ThreadWin32.m3 is fairly well vindicated (except
maybe via some roundabout fashion).
 

The heap corruption is now fairly rare in Juno.
I don't know if it is consistent or not.
The contents are; I'm not sure about the address.
I'll debug more.
 

My next idea..since I think the corruption involves
copying a pixmap over other data, is to alter all
pixmaps to be composed of specific data, see if that
occurs in the corruption, to try to confirm this
part of my theory.
 

If that holds, then the memcpys done by gc could
check for the pattern (actually they could check
anyway, I thought I tried that already, will try again).
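
The pattern idea above could be sketched roughly as follows, in Python for brevity. The sentinel bytes and the names fill_pixmap and find_pattern_runs are made up for illustration; the real check would sit in the Trestle pixmap code and the collector's copy routines.

```python
PATTERN = bytes.fromhex("DEADBEEF")    # hypothetical sentinel; any distinctive bytes work

def fill_pixmap(nbytes):
    """Return nbytes of pixmap data made of the repeating sentinel."""
    reps = -(-nbytes // len(PATTERN))  # ceiling division
    return (PATTERN * reps)[:nbytes]

def find_pattern_runs(blob, min_repeats=4):
    """Return (offset, length) for each run of at least min_repeats
    back-to-back copies of the sentinel found in blob."""
    runs = []
    needle = PATTERN * min_repeats
    i = blob.find(needle)
    while i != -1:
        j = i + len(needle)
        while blob[j:j + len(PATTERN)] == PATTERN:
            j += len(PATTERN)          # extend the run one whole copy at a time
        runs.append((i, j - i))
        i = blob.find(needle, j)
    return runs
```

Requiring several consecutive copies keeps a chance occurrence of the four bytes in ordinary data from counting as a hit.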



The corruption comes outside of GC.  It's just that the GC discovers it.  This should really be straightforward to catch, if you can make it occur deterministically.


 As well, with ThreadWin32.m3 having gotten some fixes,
getting various timestamps of the source tree might be
a good idea.
 

With ThreadWin32.m3 fairly well vindicated, we might
declare the quality is high enough as is?
But I'd like to keep investigating.
(Anyone else can?)
 

The doubt imho is now more cast upon the Win32 Trestle code.
We might even try Cygwin configured to use Win32 threads
and X Windows and see if that has the same bug.
 
 
Later..
 - Jay

 		 	   		  