[M3devel] cas/casp?

Tony Hosking hosking at cs.purdue.edu
Fri Dec 18 21:19:36 CET 2009


On 18 Dec 2009, at 14:32, Jay K wrote:

> Ah, ok.
>  
> btw another "common" primitive is InterlockedCompareExchange"Double".
> As in InterlockedCompareExchange64 that's been in x86 a while and InterlockedCompareExchange128 that is newer.
>  
> There are also often alignment requirements for these things, like to not cross cache line boundaries.
> We're already ok that way?
> At least for globals?

I think so.  Need to check.

>  
>  - Jay
> 
>  
> Subject: Re: [M3devel] cas/casp?
> From: hosking at cs.purdue.edu
> Date: Fri, 18 Dec 2009 14:12:44 -0500
> CC: m3devel at elegosoft.com
> To: jay.krell at cornell.edu
> 
> On 18 Dec 2009, at 14:06, Jay K wrote:
> 
> oh. I already got it working. ok.
> 
> Atomic confuses me in a few ways.
>  
> I find it terribly confusing on Windows that there are separate compiler and memory fences.
> I'd like there to be a very simple full fence that forces all reads/writes before it to finish before any reads/writes after it, both for the compiler and processor.
> 
> MemoryFence does that.  CompilerFence is strictly weaker than MemoryFence.
> 
> It makes no sense to me to allow the processor or compiler to reorder but not both.
>  
>  
> What is weak cas?
> 
> It's a CAS that can fail spuriously.  i.e., even if the old value was ==.
> 
> Is that loadlocked/storecondition and the caller can loop?
>  
>  
> Um. There is the Windows terminology InterlockedWhatever.
> There is the gccc builtins.
> Can we "sound" close to one of them?
> Or is this based on some other widely known system?
> 
> The Atomic interface I posted is based on a proposal for the C++ standard:
> 
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2047.html
> 
>  And we don't care about older x86 processors like 386?486 I assume?
> There seem to have been at least two leaps in functionality here.
> Old Win32 operating systems provide very few interlocked functions.
> They provide increment and decrement, that only return <0, 0, >0 of the new value.
> And Exchange. (This is based on the Visual C++ 2.0 winbase.h)
> All just on 32bit values.
>  
> Later on there is a much larger suite, Exchange, ExchangeAdd, Inc/Dec that return new value, all on 32 bit values.
>  
> And then later, stuff added for 16 values, bit operations, maybe 8 bit values, and, or, xor. I don't think this requires newer processors though.
>  
> And then also at some point InterlockedCompareExchange64 which requires the "newest" processors, but is still quite old.
> >From this you can synthesize inc/dec/and/or/xor with loops and that is done with inline functions in winbase.h.
>  
>  
>  - Jay
> 
>  
> > From: hosking at cs.purdue.edu
> > Date: Fri, 18 Dec 2009 13:32:57 -0500
> > To: jay.krell at cornell.edu
> > CC: m3devel at elegosoft.com
> > Subject: Re: [M3devel] cas/casp?
> > 
> > Hold your horses. I am in the middle of working up a more portable "Atomic" interface along the following lines. We can choose to implement with compiler intrinsics like Word/LongWord (if available), but otherwise will provide a call interface. I envision instantiating for ADDRESS, CHAR, and Word.T. This means CAS/CASP will go away.
> > 
> > GENERIC INTERFACE Atomic(Rep);
> > 
> > IMPORT RTMachine;
> > 
> > TYPE T = Rep.T;
> > 
> > PROCEDURE MemoryFence();
> > (* Guarantees explicit memory ordering for otherwise unordered atomics, and
> > for other memory references wrt atomics. Not useful for ordering
> > ordinary memory references, since those may not race and, if they don't
> > race, always appear to be ordered. *)
> > 
> > PROCEDURE CompilerFence();
> > (* Ensures that prior memory operations appear in the instruction stream
> > before subsequent ones, i.e. the compiler is not allowed to reorder
> > around this. This really has only implementation-defined semantics, but
> > it seems to be useful in ensuring ordering with respect to signal
> > handlers and the like. *)
> > 
> > PROCEDURE Store(VAR var: T; val: T); (* No ordering guarantees. *)
> > PROCEDURE StoreRelease(VAR var: T; val: T);
> > 
> > PROCEDURE Load(READONLY var: T): T; (* No ordering guarantees. *)
> > PROCEDURE LoadAcquire(READONLY var: T): T;
> > 
> > TYPE Order = { Raw, Acquire, Release, Ordered };
> > (* "Raw" ==> This operation is unordered, and may become visible to other
> > threads in an order that is constrained only by ordering constraints
> > on other operations.
> > 
> > "Release" ==> All prior memory operations (including ordinary
> > assignments) become visible to a an acquire operation on the same
> > object that sees the resulting value.
> > 
> > "Acquire" ==> See above.
> > 
> > "Ordered" ==> Both acquire and release ordering properties.*)
> > 
> > CONST HasCas = RTMachine.HasCas;
> > CONST HasWaitFreeCas = RTMachine.HasWaitFreeCas;
> > PROCEDURE Cas(VAR var: T; old, new: T; order := Ordered): T;
> > (* Most architectures provide a way to return the old value. On those that
> > do not, it can be emulated with an additional load, at the expense of
> > wait-freedom or spurious failure. *)
> > 
> > CONST HasWeakCas = RTMachine.HasWeakCas;
> > PROCEDURE WeakCas(VAR var: T; old, new: T; order := Ordered): T;
> > (* Similar to "Cas", but may fail spuriously, and must be wait-free, if
> > provided. *)
> > 
> > CONST HasFetchAdd = RTMachine.HasFetchAdd;
> > PROCEDURE FetchAdd(VAR var; incr: T; order := Raw): T;
> > 
> > CONST HasFetchOr = RTMachine.HasFetchOr;
> > PROCEDURE FetchOr(VAR var; mask: T; order := Raw): T;
> > 
> > CONST HasFetchAnd = RTMachine.HasFetchAnd;
> > PROCEDURE FetchOr(VAR var; mask: T; order := Raw): T;
> > 
> > END Atomic.
> > 
> > 
> > On 18 Dec 2009, at 13:04, Jay K wrote:
> > 
> > > Tony is this right?
> > > 
> > > 
> > > conceptually:
> > > 
> > > 
> > > result := CAS(VAR destination, compare, exchange)
> > > 
> > > 
> > > old := destination
> > > if old := compare then
> > > destination = exchange
> > > return old
> > > 
> > > 
> > > bool result := CASP(VAR destination, compare, exchange)
> > > P for predicate aka returning bool
> > > 
> > > 
> > > old := destination
> > > if old = compare then
> > > destination := exchange
> > > return true
> > > return false
> > > 
> > > 
> > > I'm pretty sure it is.
> > > 
> > > 
> > > I'll have this for 16 bit and 32 bit values shortly, for now requiring
> > > a recent Microsoft compiler and it generates function calls.
> > > But inlining should be easy enough.
> > > And then 8bit and 64bit.
> > > 
> > > 
> > > Also this will probably not work on 386 or 486 but require Pentium or somesuch.
> > > I don't know when various instructions were introduced but I know
> > > 386 is much poorer here than modern processors.
> > > 
> > > 
> > > I assume only support int8, uint8, int16, uint16, int32, uint32, int64, uint64, size_t, ptrdiff_t
> > > and not e.g. floating point.
> > > 
> > > 
> > > - Jay
> > 
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://m3lists.elegosoft.com/pipermail/m3devel/attachments/20091218/eedceb6f/attachment-0002.html>


More information about the M3devel mailing list