[M3devel] Self() locking slots almost unnecessary -- if only we had "MemoryBarrier".

Tony Hosking hosking at cs.purdue.edu
Mon Nov 9 17:05:44 CET 2009


Hi Jay,

Please note that we already have support in the compiler (middle and  
back ends) for these primitives.  They manifest as calls to the  
functions defined as intrinsics to gcc (e.g., __sync_synchronize) so  
they can be inlined where appropriate.  On platforms that don't  
support the intrinsics (such as non-gcc backends) all we need to do is  
to implement the corresponding functions in C (much as you have  
started to do with the atomic.c).  What I propose is that we add a  
MEMBAR builtin to the Modula-3 compiler (to add to the existing CAS  
and CASP builtins that I have already implemented) which will bottom  
out as calls to these intrinsic functions.  My only concern is that we  
should have better support for the other forms of primitive (other  
than CAS) that are widely available, notably load-linked/store- 
conditional.  These are readily implemented using CAS, but not the  
other way around (load-linked => no-op, store-conditional => CAS).   
Unfortunately, LL/SC are not implemented by the gcc intrinsics.
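
As a rough illustration of that mapping (load-linked => plain load, store-conditional => CAS), here is a minimal C sketch on top of the gcc intrinsic __sync_bool_compare_and_swap. The helper names are hypothetical and are not part of the Modula-3 runtime or of gcc. The emulation is weaker than hardware LL/SC: the CAS still succeeds if the location was changed and then changed back between the load and the store (the ABA case).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper names, for illustration only. */

/* "Load-linked" degenerates to a plain load: no reservation is taken. */
static long emulated_load_linked(volatile long *addr)
{
    assert(addr != NULL);
    return *addr;
}

/* "Store-conditional" becomes a CAS against the value the load observed.
   Returns nonzero if the store happened. Unlike a real SC, this cannot
   detect a change-and-restore of *addr in between (ABA). */
static int emulated_store_conditional(volatile long *addr,
                                      long observed, long desired)
{
    assert(addr != NULL);
    return __sync_bool_compare_and_swap(addr, observed, desired);
}

/* Atomic increment built from the pair; retries until the store lands.
   Returns the new value. */
static long atomic_increment(volatile long *addr)
{
    long old;
    do {
        old = emulated_load_linked(addr);
    } while (!emulated_store_conditional(addr, old, old + 1));
    return old + 1;
}
```

The retry loop is the usual shape for either primitive: read, compute, attempt the conditional store, and start over if another thread got there first.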

On the other hand, I still question your motivation for avoiding the  
global lock on the slots table.  Most of these are for simple lookups  
(as opposed to insertions), which have very short critical sections:

   lock slots
     load slots[me.slot]
   unlock slots

So, the chances of contention are *very* low.  A lightly contended  
pthread_mutex will typically involve spinning at worst, so I doubt  
there will be much of any benefit to your proposal, at the cost of  
introducing target dependencies into the code.  I am concerned that  
this is premature and non-portable optimization.
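
For concreteness, the locked lookup can be sketched in C with a pthread mutex. The table, its element type, and the function names below are hypothetical stand-ins, not the actual runtime code; the point is only that the critical section is a bounds check and a single load.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the runtime's slot table. */
static pthread_mutex_t slots_mu = PTHREAD_MUTEX_INITIALIZER;
static void **slots;     /* may be swapped for a larger array on insertion */
static size_t n_slots;

/* Install a table under the lock (stands in for the insertion path). */
static void install_slots(void **table, size_t n)
{
    assert(table != NULL || n == 0);
    pthread_mutex_lock(&slots_mu);
    slots = table;
    n_slots = n;
    pthread_mutex_unlock(&slots_mu);
}

/* Look up one entry; returns NULL if the index is out of range.
   The lock is held only for the bounds check and one load. */
static void *lookup_slot(size_t index)
{
    void *result = NULL;
    pthread_mutex_lock(&slots_mu);
    if (index < n_slots)
        result = slots[index];
    pthread_mutex_unlock(&slots_mu);
    return result;
}
```

Because the lock protects both the pointer and the length, a reader can never see a new table paired with a stale size, whatever the target's memory model.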

Nevertheless, I applaud considering use of non-blocking  
synchronization primitives in the threads implementations, which may  
ultimately have benefit to our implementation of Thread.Mutex and  
Thread.Condition (we can avoid inflating a pthread_mutex_t except when  
there is contention).

So, the upshot is that I think the approach you are taking needs  
revising, and you should think hard about whether what you are trying  
to do with ThreadWin32 will lead to real performance improvements at  
the cost of code clarity.

Antony Hosking | Associate Professor | Computer Science | Purdue  
University
305 N. University Street | West Lafayette | IN 47907 | USA
Office +1 765 494 6001 | Mobile +1 765 427 5484




On 8 Nov 2009, at 23:23, Jay K wrote:

> How about a per-thread, never-contended pthread_mutex_t?
> As a fallback.
> And add what we can as we do the research for others?
> Anything using gcc would probably be supported right away.
> Win32 would be supported.
> Leaving only SOLgnu for now but probably we can find out what to do  
> there.
>
> Hm. a little bit of searching the web:
>
>
> http://gcc.gnu.org/onlinedocs/gcc/Atomic-Builtins.html
> http://gee.cs.oswego.edu/dl/jmm/cookbook.html
>
> It looks like:
>
> #if defined(__sun)
> #include <atomic.h>
> void Atomic__MemoryBarrier(void)
> {
>   membar_producer();
>   membar_consumer();
> }
>
> #elif defined(__GNUC__)
>
> void Atomic__MemoryBarrier(void)
> {
>   __sync_synchronize();
> }
>
> #elif defined(_WIN32)
>
> void Atomic__MemoryBarrier(void)
> {
>   MemoryBarrier();
> }
>
> #else
>
> #error  or consider uncontended pthread_mutex?
>
>
> #endif
>
> ?
>
>  - Jay
>
>
> CC: m3devel at elegosoft.com
> From: hosking at cs.purdue.edu
> To: jay.krell at cornell.edu
> Subject: Re: [M3devel] Self() locking slots almost unnecessary -- if  
> only we had "MemoryBarrier".
> Date: Sun, 8 Nov 2009 21:54:06 -0500
>
> It would be nice to use CAS and friends (load-linked/store- 
> conditional) but they are not portable.  It would require target- 
> dependencies.
>
> On 8 Nov 2009, at 19:44, Jay K wrote:
>
> I don't know. I just look at all code overly critically..including  
> for overly coarse grained locking (which includes some vs. none).
> I guess the argument could be that the critical section -- the part  
> of code that executes under the lock -- is very short, so it can't  
> make much of a difference.
>
> Writing the global with an "InterlockedExchange" might be good.
>
> Maybe we should add this as a portably available interface?
> "This" being MemoryBarrier and/or well, er, um, I guess you already  
> did, the IA64 stuff, which is similar to the Win32 stuff.
> I should update the NT/x86 backend for that stuff and then we can  
> move on and use them.
>
>  - Jay
>
>
> From: hosking at cs.purdue.edu
> To: jay.krell at cornell.edu
> Date: Sun, 8 Nov 2009 19:39:00 -0500
> CC: m3devel at elegosoft.com
> Subject: Re: [M3devel] Self() locking slots almost unnecessary -- if  
> only we had "MemoryBarrier".
>
> Not portably.  Different memory models will behave differently.  For  
> safety we need the lock.  But, seriously, how much contention will  
> there be?
>
> On 8 Nov 2009, at 19:35, Jay K wrote:
>
> Self() doesn't have to lock slots AS LONG AS in AssignSlot:
>
>           SUBARRAY (new_slots^, 0, n) := slots^;
>           slots := new_slots;
>
> occurs in the order written.
> That SUBARRAY() := finishes before slots := runs.
> Aggressive compilers/processors need not execute these in the  
> order written.
>
> Do we have a way to guarantee that?
>
> Something like:
>           SUBARRAY (new_slots^, 0, n) := slots^;
> >        MemoryBarrier();
>           slots := new_slots;
>
> ?
>
> MemoryBarrier on Windows is implemented as one "special" instruction  
> -- for x86, AMD64, and IA64.
> Those implementations are portable to any OS running those  
> architectures.
> Though they aren't expressed in a portable form (x86 inline assembly  
> and C compiler intrinsics).
> In particular:
>
> winnt.h:
> amd64:
> #define MemoryBarrier __faststorefence
>
> x86:
> FORCEINLINE
> VOID
> MemoryBarrier (
>     VOID
>     )
> {
>     LONG Barrier;
>     __asm {
>         xchg Barrier, eax
>     }
> }
>
> ia64:
> #define MemoryBarrier           __mf
>
>
>  - Jay
>
>
>
>
>
