[M3devel] purported condition variable problems on Win32?

Jay jay.krell at cornell.edu
Tue Jul 19 03:53:24 CEST 2016


I wrote this version. Right, based on Schmidt. I also found similar code in Java. There are comments as to both of these in the code. 

Otherwise previous version I think had a very large lock.

Changing the < to !=, is that really ok? I get that 32bit overflow is possible.

We can change to a 64bit counter if < is required. That really can't rollover. I.e. Even a 64bit cycle counter can't rollover, and this would advance much slower.

I guess you are saying it would be unfair, which is ok.

There is pattern called "directed notification" or such that might be preferable? It has a certain significant inefficiency though, like creating an event per notify. I think it is what boost's condition variable are using.

I'll check on the unused event.

 - Jay

> On Jul 18, 2016, at 7:25 AM, "Rodney M. Bates" <rodney_bates at lcwb.coop> wrote:
> 
> 
> 
> 
> 
> I spent some time looking over the signal part of ThreadWin32.m3.  A few
> comments:
> 
> This looks like a very direct implementation of Schmidts' generation count
> algorithm, with a number of important little details additionally taken care
> of.
> 
> On small difference is at line 308 (B, below) where Schmidt does
> c.counter>count instead of #.  Usually, this would make no difference,
> since counter only increases, but on a long-running program, it could
> overflow.  If this happened,as it often does, as silent wrap-around to
> FIRST(INTEGER), some threads could be trapped for an extremely long
> time, waiting for counter come back around greater than their count.  As
> coded in ThreadWin32, they would be allowed to proceed normally.
> 
> The only glitch would be that, if there were simultaneously waiting threads
> separated by NUMBER(INTEGER) truly different generations (but the same value
> of c.counter), the recent ones could unfairly proceed along with the
> extremely old ones.  That seems extremely less likely than the
> mere existence of an overflow.
> 
> Both type Condition and type Activation have a field waitEvent: HANDLE.
> Condition.waitEvent is extensively used, but Activation.waitEvent is
> only created at :537 and deleted at :554, but never used in any real
> way.  It could be deleted.
> 
> As for this code:
> 
>    EnterCriticalSection(conditionLock);
> 
>      (* Capture the value of the counter before we start waiting.
>       * We will not stop waiting until the counter changes.
>       * That is, we will not stop waiting until a signal
>       * comes in after we start waiting.
>       *)
> 
>      count := c.counter;
>      INC(c.waiters);
> 
>    LeaveCriticalSection(conditionLock);
>    m.release(); (* can this be moved to before taking conditionLock? *)
> 
> No, it is important to sample and save c.counter in the order threads wait.
> Retaining m until this is done ensures this.  Otherwise, some other thread
> that waited later could have altered c.counter before this one gets conditionLock
> and saves its copy of c.counter.
> 
> I have made several comment changes to ThreadWin32.m3 that would have made
> it easier to vet.  I will commit these, but only comments.  It is either
> very difficult or impossible for me to test or even recompile this module,
> and, especially as fragile as this kind of code is, I don't want to commit
> any substantive changes untested and uncompiled.
> 
> More below:
> 
>> On 07/07/2016 04:24 AM, Jay K wrote:
>> So...I do NOT understand all of this.
>> 
>> 
>> However, comparing the three implementations
>> and attempting to understand ours,
>> I am struck by the following
>> 
>>  - Our broadcast seems ok, but
>>  - our signal seems to wake everyone,
>>    and then...they race a bit,
>>    one will decide it is last, and
>>    reset the event, but every prior waiter
>>    is still woken.
>> Why not merge the following chunks:
> 
> I don't think this will work.  It is important that, once a thread has decided,
> at A, that it can fairly proceed, it is guaranteed to be the next thread to proceed
> (i.e, to return from [Alert]Wait), before it decrements c.tickets at C, and, thus,
> if it is also the last of its generation to proceed, resets c.waitevent.  This
> is ensured by its having already reacquired m first, preventing any other thread
> from returning from its Wait yet.
> 
> And it can't attempt to acquire m while already holding conditionLock, because this would
> invite deadlock, since, elsewhere, a thread can try to acquire conditionLock while
> already holding m, at line 265.
> 
>> 
>> 
>>     WHILE (NOT alerted) AND (NOT waitDone) DO
>> .
>> .
>> .
>>  1
>>       EnterCriticalSection(conditionLock);
>>         waitDone := (c.tickets # 0 AND c.counter # count);
> A:---------^
> B:--------------------------------------------------^
>>       LeaveCriticalSection(conditionLock);
>>     END; (* WHILE *)
>>     IF waitDone THEN
>>       alerted := FALSE;
>>     END;
>>     m.acquire();
>> 2
>>     EnterCriticalSection(conditionLock);
>>       DEC(c.waiters);
>>       IF waitDone THEN   esp. here.
>>         DEC(c.tickets);
> 
> C:---------------^
>>         lastWaiter := (c.tickets = 0);
>>       END;
>>     LeaveCriticalSection(conditionLock);
>> 
>> 
>>  That is, if we decrement tickets earlier
>>  within the first critical section,
>>  while we will still wake everyone,
>>  only one will decide waitDone and the rest will keep looping.
>>  A downside of this, perhaps, is that waking all for Broadcast
>>  might be a little slower.
>>  but more so, in both the "ptw32" (pthreads for win32) and Boost threads
>>  implementations, they seem to deal with this differently than us,
>>  and they each do about the same thing -- they use a counted semaphore.
>>  In the boost case, it appears they duplicate the semaphore for every
>>  notification generation, which seems expensive.
>> 
>>  Perhaps if they can guarantee some lifetimes, they don't need to duplicate it.
>>  Or they can do their own reference counting?
>>  jdk7 seems to looke like jdk6.
>>  The code is gone in jdk8 and I can't easily find the delete in history.
>>   The lock merging jdk does probably helps here too.
>>   In fact they merge #1 and #2 above as a result.
>>   But they still initially wake all threads for signal, not just broadcast.
>>   There is also the problem that all the event waits
>>   are followed by EnterCriticalSection (or jdk facsimile).
>>   - Jay
>> 
>> 
>> 
>> 
>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------!
> ---
>> From: jay.krell at cornell.edu
>> To: m3devel at elegosoft.com
>> Subject: RE: [M3devel] purported condition variable problems on Win32?
>> Date: Tue, 5 Jul 2016 08:26:36 +0000
>> 
>> Here is another implementation to consider:
>> 
>> https://github.com/boostorg/thread/blob/develop/include/boost/thread/win32/condition_variable.hpp
>> 
>>  - Jay
>> 
>> 
>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------!
> ---
>> From: jay.krell at cornell.edu
>> To: m3devel at elegosoft.com
>> Date: Tue, 5 Jul 2016 08:19:23 +0000
>> Subject: [M3devel] purported condition variable problems on Win32?
>> 
>>  https://sourceforge.net/p/pthreads4w/code/ci/master/tree/README.CV
>> 
>>  vs.
>> 
>> http://www.cs.wustl.edu/~schmidt/win32-cv-1.html
>>  vs.
>>  https://modula3.elegosoft.com/cgi-bin/cvsweb.cgi/cm3/m3-libs/m3core/src/thread/WIN32/ThreadWin32.m3?rev=1.210.2.1;content-type=text%2Fplain
>>  I wrote this:
>>  PROCEDURE XWait(m: Mutex; c: Condition; act: Activation;
>>                 alertable: BOOLEAN) RAISES {Alerted} =
>>   (* LL = m on entry and exit, but not for the duration
>>    * see C:\src\jdk-6u14-ea-src-b05-jrl-23_apr_2009\hotspot\agent\src\os\win32\Monitor.cpp
>>    * NOTE that they merge the user lock and the condition lock.
>>    * http://www.cs.wustl.edu/~schmidt/win32-cv-1.html
>>    *   "3.3. The Generation Count Solution"
>>    *)
>> 
>>  Do we have the problems described in README.CV?
>> 
>>  I haven't looked through the ACE code to see
>>  to what extent they resemble solution 3.3, or if they
>>  changed as a result of this discussion -- which I admit I don't understand
>> and haven't read closely.
>> 
>> 
>> Spurious wakeups are ok, though should be minimized.
>> 
>> I'd still rather not drop pre-Vista support but I realize it becomes more interesting as time advances.
>> 
>>  Thank you,
>>   - Jay
>> 
>> _______________________________________________ M3devel mailing list M3devel at elegosoft.com https://m3lists.elegosoft.com/mailman/listinfo/m3devel
>> 
>> 
>> _______________________________________________
>> M3devel mailing list
>> M3devel at elegosoft.com
>> https://m3lists.elegosoft.com/mailman/listinfo/m3devel
> 
> -- 
> Rodney Bates
> rodney.m.bates at acm.org
> 
> 



More information about the M3devel mailing list