[M3devel] Self() locking slots almost unnecessary -- if only we had "MemoryBarrier".

Jay K jay.krell at cornell.edu
Mon Nov 9 06:07:33 CET 2009


Can we adapt:

 Solaris /usr/include/atomic.h 
 windows.h Interlocked* 
 http://gee.cs.oswego.edu/dl/jmm/cookbook.html 
 http://gcc.gnu.org/onlinedocs/gcc/Atomic-Builtins.html 


 

into something in m3core? 

Starting with the Atomic.i3 I just put in?


Probably each function should take some sort of void*
that is a pthread_mutex_t or Win32 CRITICAL_SECTION


and the fallback would use it?
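
A minimal sketch of what such a lock-based fallback could look like with pthreads. The function name and its void* parameter are hypothetical, not an existing m3core interface. It relies only on POSIX listing pthread_mutex_lock and pthread_mutex_unlock among the functions that synchronize memory between threads.

```c
#include <pthread.h>

/* Hypothetical fallback for targets with no known barrier primitive.
 * "lock" is assumed to point at an initialized pthread_mutex_t.
 * Acquiring and releasing the mutex acts as the barrier, because POSIX
 * requires both calls to synchronize memory with respect to other threads.
 * Returns 0 on success, or the pthread error code on failure. */
int Atomic__MemoryBarrierFallback(void *lock)
{
    pthread_mutex_t *mutex = (pthread_mutex_t *)lock;
    int err = pthread_mutex_lock(mutex);
    if (err != 0)
        return err;
    return pthread_mutex_unlock(mutex);
}
```

A Win32 variant would presumably do the same with EnterCriticalSection and LeaveCriticalSection on a CRITICAL_SECTION.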


The hard part would be:
  Well, maybe not much.
  There is the matter of compilers we don't have ready access to and gcc < 4.
  But we don't really support systems we don't have access to.
  After the current release I will be looking into more ports, including possibly
    using other non-gcc compilers like on AIX, Irix, HP-UX.
  But again, a fallback to use a lock might not be terrible.
  If we have the systems, then we have the compiler, man pages, headers.
  If we don't have them, then we don't support them.
  The system is fairly portable but I'm not sure we deem it portable to systems
   we don't have. A fine line maybe.
  Besides, gcc is widespread AND we depend heavily on the gcc backend. It's not
   like we can easily have a backend on a system that doesn't have a gcc frontend.

  gcc < 4 on OpenBSD is probably the tough one. I've had OpenBSD/x86, OpenBSD/amd64,
  OpenBSD/powerpc, OpenBSD/sparc64 installed recently (all but ppc probably still installed).
  I only know offhand that OpenBSD 4.5/x86 uses gcc 3.3.5.

 


I realize "lock free" programming is dangerous and maybe not very profitable.
But I am still a moth drawn to the flame.


 

Ok to reduce the locking now of pthread slots?

(actually I have to run for a day or a week..)


 - Jay


 


From: jay.krell at cornell.edu
To: hosking at cs.purdue.edu
CC: m3devel at elegosoft.com
Subject: RE: [M3devel] Self() locking slots almost unnecessary -- if only we had "MemoryBarrier".
Date: Mon, 9 Nov 2009 04:23:49 +0000



How about a per-thread, never-contended pthread_mutex_t?
As a fallback.
And add what we can as we do the research for others?
Anything using gcc would probably be supported right away.
Win32 would be supported.
Leaving only SOLgnu for now but probably we can find out what to do there.
 
Hm. a little bit of searching the web:
 
 
http://gcc.gnu.org/onlinedocs/gcc/Atomic-Builtins.html
http://gee.cs.oswego.edu/dl/jmm/cookbook.html

It looks like:
 
#if defined(__sun)
#include <atomic.h>
void Atomic__MemoryBarrier(void)
{
  membar_producer();
  membar_consumer();
}

#elif defined(__GNUC__)

void Atomic__MemoryBarrier(void)
{
  __sync_synchronize();
}

#elif defined(_WIN32)

void Atomic__MemoryBarrier(void)
{
  MemoryBarrier();
}
 
#else
 
#error  or consider uncontended pthread_mutex?
 
 
#endif

?
 
 - Jay

 


CC: m3devel at elegosoft.com
From: hosking at cs.purdue.edu
To: jay.krell at cornell.edu
Subject: Re: [M3devel] Self() locking slots almost unnecessary -- if only we had "MemoryBarrier".
Date: Sun, 8 Nov 2009 21:54:06 -0500





It would be nice to use CAS and friends (load-linked/store-conditional) but they are not portable.  It would require target-dependencies.
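
For illustration only, here is roughly what a CAS-based update looks like with the gcc builtin from the Atomic-Builtins page linked elsewhere in this thread. This assumes gcc 4.1 or later on a target where the builtin is implemented, which is exactly the target dependency mentioned above. The function name is hypothetical.

```c
/* Hypothetical helper: atomically increment *value using a
 * compare-and-swap retry loop. Assumes gcc >= 4.1 and a target that
 * implements __sync_bool_compare_and_swap for long. */
long Atomic__IncrementWithCas(volatile long *value)
{
    long old;
    do {
        old = *value;  /* snapshot the current value */
        /* Retry if another thread changed *value since the snapshot. */
    } while (!__sync_bool_compare_and_swap(value, old, old + 1));
    return old + 1;
}
```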


On 8 Nov 2009, at 19:44, Jay K wrote:

I don't know. I just look at all code overly critically..including for overly coarse grained locking (which includes some vs. none).
I guess the argument could be that the critical section -- the part of code that executes under the lock -- is very short, so it can't make much of a difference.
 
Writing the global with an "InterlockedExchange" might be good.
 
Maybe we should add this as a portably available interface?
"This" being MemoryBarrier and/or well, er, um, I guess you already did, the IA64 stuff, which is similar to the Win32 stuff.
I should update the NT/x86 backend for that stuff and then we can move on and use them.
 
 - Jay

 


From: hosking at cs.purdue.edu
To: jay.krell at cornell.edu
Date: Sun, 8 Nov 2009 19:39:00 -0500
CC: m3devel at elegosoft.com
Subject: Re: [M3devel] Self() locking slots almost unnecessary -- if only we had "MemoryBarrier".





Not portably.  Different memory models will behave differently.  For safety we need the lock.  But, seriously, how much contention will there be?


On 8 Nov 2009, at 19:35, Jay K wrote:

Self() doesn't have to lock slots AS LONG AS in AssignSlot:
 
          SUBARRAY (new_slots^, 0, n) := slots^;
          slots := new_slots;

occurs in the order written.
That SUBARRAY() := finishes before slots := runs.
Aggressive compilers/processors need not execute these in the order written.
 
Do we have a way to guarantee that?
 
Something like:
          SUBARRAY (new_slots^, 0, n) := slots^;
>        MemoryBarrier();
          slots := new_slots;

?
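
The same copy-then-publish ordering can be sketched in C. This is an illustration under assumptions, not the actual thread code: the variable and function names are hypothetical stand-ins, the barrier is the gcc __sync_synchronize builtin (gcc 4.1 or later), and growing is assumed to happen with the writer lock held so there is only one writer at a time.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the slots array in AssignSlot. */
static void **slots;    /* read without a lock by readers */
static size_t n_slots;  /* writer-side bookkeeping only */

/* Grow the array to new_count entries and publish it.
 * Assumes the caller holds the writer lock (single writer).
 * Returns 0 on success, -1 if allocation fails. */
int GrowSlots(size_t new_count)
{
    void **new_slots;

    if (new_count <= n_slots)
        return 0;  /* already large enough */
    new_slots = calloc(new_count, sizeof *new_slots);
    if (new_slots == NULL)
        return -1;
    if (n_slots > 0)
        memcpy(new_slots, slots, n_slots * sizeof *new_slots);

    /* The copy must be complete and visible before the new pointer is
     * published; the barrier stops compiler and CPU from reordering. */
    __sync_synchronize();
    slots = new_slots;
    n_slots = new_count;

    /* The old array is deliberately not freed here: a lock-free reader
     * may still be using it. In Modula-3 the collector handles this. */
    return 0;
}
```

Whether readers also need a barrier of their own depends on the memory model of the target, which is the portability concern raised in this thread.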
 
MemoryBarrier on Windows is implemented as one "special" instruction -- for x86, AMD64, and IA64.
Those implementations are portable to any OS running those architectures.
Though they aren't expressed in a portable form (x86 inline assembly and C compiler intrinsics).
In particular:
 
winnt.h:
amd64:
#define MemoryBarrier __faststorefence

x86:
FORCEINLINE
VOID
MemoryBarrier (
    VOID
    )
{
    LONG Barrier;
    __asm {
        xchg Barrier, eax
    }
}

ia64:
#define MemoryBarrier           __mf

 
 - Jay
 

