[M3devel] Yet another bug in Win-32 generation count threads

Rodney M. Bates rodney_bates at lcwb.coop
Thu Jul 21 04:00:50 CEST 2016


If I were to try to fix the Thread condition variables on
Windows, I would consider a scheme entirely different from
any of those given in Schmidt.

Associate the Win-event that a M3-Wait operation uses with a
thread, not a condition variable.  Since a thread can wait
on at most one condition at a time, the Win-event will have
at most one waiting thread.  This also means allocation of
the Win-events is easy: put one in each Activation (which
is one-to-one with a thread).  (Perhaps the unused waitEvent
in an activation was left over from some variant of this
scheme?)

Put the Activations of the multiple threads waiting on a Condition
in some kind of linked list rooted at the Condition, with
link pointers in the Activations.  This makes it easy to code
almost any wakeup fairness policy in a single place, with
single-threaded code: removal from this list, rather than
complex interactions among a signaller and many waiter threads.
Or, alternatively, it could be done when inserting into the list.
Simple FIFO is the obvious choice, but anything, such as thread
priorities, could be easily implemented.
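
A minimal sketch of the scheme described so far, in Python rather than
Modula-3 and with threading.Event standing in for a Win-event.  The
names Activation, Condition, wait_event and the method signatures are
assumptions for illustration, not the actual ThreadWin32 code.  It shows
one event per thread and a FIFO list of waiting Activations rooted at
the condition, with the wakeup policy confined to the removal step.

```python
import threading
from collections import deque


class Activation:
    """Hypothetical per-thread record; owns the one event its thread waits on."""

    def __init__(self):
        # Stand-in for a per-thread Win-event.
        self.wait_event = threading.Event()


class Condition:
    """Waiting Activations are kept in a FIFO list guarded by a small lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._waiters = deque()

    def wait(self, mutex, act):
        # Caller holds mutex.  Enqueue first, so a signaller can find us.
        with self._lock:
            self._waiters.append(act)
        mutex.release()
        # If a signaller set the event before we got here, wait() returns
        # at once, so no wakeup is lost.
        act.wait_event.wait()
        # Emulate a self-resetting event: only this thread ever waits here.
        act.wait_event.clear()
        mutex.acquire()

    def signal(self):
        # The fairness policy lives here: FIFO means take the oldest waiter.
        with self._lock:
            act = self._waiters.popleft() if self._waiters else None
        if act is not None:
            act.wait_event.set()

    def broadcast(self):
        with self._lock:
            woken = list(self._waiters)
            self._waiters.clear()
        for act in woken:
            act.wait_event.set()
```

A different policy, such as thread priorities, would only change how
signal picks an entry from the list (or where wait inserts it).  This
sketch does not attempt to make mutex reacquisition follow the same
order.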

Making the order of reacquisition of the mutex match the order
of SetEvent would require more complex work, but shouldn't
be too hard.

It is an easy-to-maintain invariant that a waiter is waiting
on a Win-event iff its Activation is on the list of some condition.
This means a signaller will always know a Win-event it is about
to call SetEvent on has a thread waiting in it, so can do a self-
resetting SetEvent and never worry that the event will remain
signalled afterwards.

With these restrictions on their use, Win-events behave just
like condition variables, greatly simplifying things.  All
the solutions Schmidt describes suffer from problems originating
in having multiple waiters Win-waiting on a single Win-event.

One detail that could require some thought: a waiting thread's
Activation needs to be removed from its list when the thread
is destroyed.

On 07/20/2016 01:00 PM, Rodney M. Bates wrote:
> The more I look at this generation count approach to implementing
> Modula-3 condition variables on top of what Windows has, the less
> I like it.  It's very hard to understand, but as I get closer,
> I think I see yet another bug.
>
> Assume ticket counting is fixed.
>
> There are several waiting threads, say 5.  c.waiters = 5 and
> c.counter = 0.  A M3-Broadcast happens.  It does
> (Win-)SetEvent(c.waitEvent), sets c.counter := 1, and
> c.tickets := 5, making all the waiters eligible to get
> past the waitDone test, which they start doing.  From the
> abstract M3-condition variable point of view, all have
> already been released from the Condition variable, but
> have some cleanup to do inside (M3-)Wait.
>
> One by one, they proceed, say 3 of them.  Two waiters
> remain, c.waiters = c.tickets = 2, and all have counts
> that make them eligible for waitDone.
>
> Now, two more waiters do M3-Waits.  c.counter still = 1, and
> their local counts = 1, making them ineligible to get
> past waitDone.  Then a M3-Signal happens.  c.counter
> gets incremented to 2, so all 4 waiters are now eligible
> to pass that test.  c.tickets gets increased to 3, supposedly
> so the two that were broadcast-released and one of the new
> ones can proceed.
>
> But it is randomly possible that both the new ones get
> released (which would be OK), taking two of the tickets,
> one of the older ones gets released, taking the last ticket,
> and the remaining is stuck waiting when it should not, for
> lack of a ticket.
>
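
The arithmetic of the scenario quoted above can be replayed with a
small single-threaded model.  This is a sketch under assumptions: the
fields counter, tickets and waiters are taken from the description, but
the exact form of the waitDone test (tickets available and the waiter's
local count differs from c.counter) and the helper names are inferred,
not copied from the real code.

```python
from dataclasses import dataclass


@dataclass
class Cond:
    counter: int = 0   # generation count
    tickets: int = 0   # wakeups still to be consumed
    waiters: int = 0   # threads currently inside Wait


def begin_wait(c):
    """Enter Wait: register as a waiter and remember the current generation."""
    c.waiters += 1
    return c.counter


def wait_done(c, my_count):
    """Assumed form of the waitDone test."""
    return c.tickets > 0 and c.counter != my_count


def finish_wait(c):
    """Leave Wait, consuming one ticket."""
    c.waiters -= 1
    c.tickets -= 1


def broadcast(c):
    c.counter += 1
    c.tickets = c.waiters


def signal(c):
    c.counter += 1
    c.tickets += 1


def run_scenario():
    c = Cond()
    old = [begin_wait(c) for _ in range(5)]    # local counts are all 0
    broadcast(c)                               # counter = 1, tickets = 5
    for _ in range(3):                         # three old waiters proceed
        assert wait_done(c, old.pop())
        finish_wait(c)
    new = [begin_wait(c) for _ in range(2)]    # local counts are 1
    assert not any(wait_done(c, n) for n in new)
    signal(c)                                  # counter = 2, tickets = 3
    # Unlucky order: both new waiters, then one old waiter, take the tickets.
    for count in (new.pop(), new.pop(), old.pop()):
        assert wait_done(c, count)
        finish_wait(c)
    stranded = old.pop()                       # was released by the broadcast
    return c, wait_done(c, stranded)
```

Calling run_scenario() leaves one waiter with no tickets, and the
waitDone test for the remaining broadcast-released waiter is false,
which is the stuck thread described above.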

-- 
Rodney Bates
rodney.m.bates at acm.org


