[M3devel] firming up atomics?
Jay K
jay.krell at cornell.edu
Tue Oct 19 08:48:32 CEST 2010
Tony, hm, for full generality I think the backend should take a parameter saying whether the old value, the new value, both, or neither is returned.
Note how the Microsoft intrinsics return the old value, while the x86 lock or/xor/and instructions return no value at all (they only set flags). So a compare/exchange loop is needed whenever the result is used, but when the result is ignored a single locked instruction does the job, which is much more efficient:
The "v" prefix is for void (result ignored):
void __fastcall vor(long a, long b) { _InterlockedOr(a, b); }
long __fastcall or(long a, long b) { return _InterlockedOr(a, b); }
void __fastcall vxor(long a, long b) { _InterlockedXor(a, b); }
long __fastcall xor(long a, long b) { return _InterlockedXor(a, b); }
void __fastcall vand(long a, long b) { _InterlockedAnd(a, b); }
long __fastcall and(long a, long b) { return _InterlockedAnd(a, b); }
C:\>cl -c 1.c -FAsc -Ox && more 1.cod

; 3 : void __fastcall vor(long a, long b) { _InterlockedOr(a, b); }
  00000 f0 09 11       lock or DWORD PTR [ecx], edx
  00003 c3             ret 0

; 4 : long __fastcall or(long a, long b) { return _InterlockedOr(a, b); }
  00010 56             push esi
  00011 8b 01          mov eax, DWORD PTR [ecx]
$LN3@:
  00013 8b f0          mov esi, eax
  00015 0b f2          or esi, edx
  00017 f0 0f b1 31    lock cmpxchg DWORD PTR [ecx], esi
  0001b 75 f6          jne SHORT $LN3@
  0001d 5e             pop esi
  0001e c3             ret 0

; 6 : void __fastcall vxor(long a, long b) { _InterlockedXor(a, b); }
  00020 f0 31 11       lock xor DWORD PTR [ecx], edx
  00023 c3             ret 0

; 7 : long __fastcall xor(long a, long b) { return _InterlockedXor(a, b); }
  00030 56             push esi
  00031 8b 01          mov eax, DWORD PTR [ecx]
$LN3@:
  00033 8b f0          mov esi, eax
  00035 33 f2          xor esi, edx
  00037 f0 0f b1 31    lock cmpxchg DWORD PTR [ecx], esi
  0003b 75 f6          jne SHORT $LN3@
  0003d 5e             pop esi
  0003e c3             ret 0

; 9 : void __fastcall vand(long a, long b) { _InterlockedAnd(a, b); }
  00040 f0 21 11       lock and DWORD PTR [ecx], edx
  00043 c3             ret 0

; 10 : long __fastcall and(long a, long b) { return _InterlockedAnd(a, b); }
  00050 56             push esi
  00051 8b 01          mov eax, DWORD PTR [ecx]
$LN3@:
  00053 8b f0          mov esi, eax
  00055 23 f2          and esi, edx
  00057 f0 0f b1 31    lock cmpxchg DWORD PTR [ecx], esi
  0005b 75 f6          jne SHORT $LN3@
  0005d 5e             pop esi
  0005e c3             ret 0
@and@8 ENDP
_TEXT ENDS
END
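
For anyone without the Microsoft compiler at hand, the same contrast can be sketched in portable C11. This is only an illustration under assumptions, not the Modula-3 backend's code: it uses <stdatomic.h> in place of the _Interlocked* intrinsics, and the names or_ignore_result, or_return_old and or_demo are made up here. The second function spells out by hand the kind of compare/exchange loop seen in the value-returning cases of the listing.

```c
#include <assert.h>
#include <stdatomic.h>

/* Result ignored: on x86 a compiler is free to use a single locked OR. */
static void or_ignore_result(atomic_long *p, long bits)
{
    (void)atomic_fetch_or(p, bits);
}

/* Old value wanted: written out as an explicit compare/exchange loop. */
static long or_return_old(atomic_long *p, long bits)
{
    long old = atomic_load(p);
    /* On failure (including a spurious one) compare_exchange_weak stores
       the value it actually observed into `old`, so we just retry. */
    while (!atomic_compare_exchange_weak(p, &old, old | bits)) {
        /* nothing to do: `old` was refreshed by the failed exchange */
    }
    return old;
}

/* Small self-check; returns 0 when both variants behave as described. */
static int or_demo(void)
{
    atomic_long v;
    atomic_init(&v, 0x5);
    or_ignore_result(&v, 0x2);          /* v becomes 0x7 */
    long old = or_return_old(&v, 0x8);  /* returns 0x7, v becomes 0xF */
    assert(old == 0x7);
    assert(atomic_load(&v) == 0xF);
    return 0;
}
```

Calling or_demo() from a main function is enough to exercise both paths. Note that the loop also shows one concrete reading of "failure" for a weak compare/exchange: it returns false, leaves the target unchanged, and writes the observed value back into the expected-value argument.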
- Jay
> Subject: Re: [M3devel] firming up atomics?
> From: hosking at cs.purdue.edu
> Date: Sun, 17 Oct 2010 08:48:09 -0400
> CC: m3devel at elegosoft.com
> To: jay.krell at cornell.edu
>
> I still have some fixes to check in before turning it on again.
> They are intended to match the approved C++0x standard.
>
> On 16 Oct 2010, at 20:27, Jay K wrote:
>
> >
> > Tony,
> >
> > Can we firm up the atomic semantics/implementation?
> >
> > Now might be a good time for me to make the NT/x86 backend
> > definitely work, test it, and make it more efficient.
> > (i.e. not using an InterlockedCompareExchange loop for InterlockedIncrement when a simple xadd will work).
> >
> >
> > Like, old value vs. new value returned?
> >
> >
> > Off the cuff, I'd say, better to always return old value.
> > The new value can always be computed from it.
> >
> > However we are at the mercy of the processors.
> >
> >
> > And it only matters for or/and.
> > xor/add/sub/inc/dec can all compute either from either.
> >
> >
> > Also, please, I don't find the specifications clear.
> > At least the last time I studied them.
> >
> >
> > To me they should say, things like,
> > "returns the old value"
> > or "returns the new value"
> >
> >
> > or returns true if compare matched
> > or returns false if the compare matched
> > etc.
> >
> >
> > The mentions of "spurious failure" or whatnot didn't make sense either.
> > They should say, something like:
> > returns true if the compare matched
> > but sometimes even if the compare should have matched,
> > it won't, and false will be returned -- specify everything that happens when there is "failure".
> >
> >
> > Furthermore, if you look at what Microsoft compilers provide,
> > they provide a bunch of very simple very clear well documented functions,
> > that the compiler implements and inlines.
> > (at least if you ignore the acquire/release stuff that confuses me...)
> >
> >
> > You could specify in terms of them.
> > Really. A good idea.
> >
> >
> > ie: is our compare_exchange equivalent to InterlockedCompareExchange?
> > Or maybe parameter order different?
> >
> >
> > Of course, I understand, the types supported by the Microsoft
> > intrinsics have varied through time and slowly grown.
> > 8bit and 16bit might still be lacking but 32bit and 64bit are very
> > complete. 64bit atomic operations on 32bit x86 processors
> > since Pentium or so can be synthesized with a loop over
> > InterlockedCompareExchange64.
> >
> >
> > Or in terms of the gcc intrinsics, probably similarly ok.
> >
> >
> > We should probably also nail down the implementation where
> > they aren't available.
> >
> >
> > And maybe put in safer defaults?
> > And quickly improve where we can, e.g. on all x86/AMD64 platforms.
> >
> >
> > Thanks,
> > - Jay
> >
>
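
The old-value versus new-value point in the quoted message can be made concrete with a short sketch. It assumes C11 <stdatomic.h> fetch-style operations, which return the old value, and every helper name below is hypothetical. The new value always follows from the old value and the operand; going the other way works for add and xor but not for or/and, because those discard information.

```c
#include <assert.h>
#include <stdatomic.h>

/* Each helper performs a fetch-style atomic operation (old value
   returned) and derives the new value from it locally. */
static unsigned long add_return_new(atomic_ulong *p, unsigned long n)
{
    return atomic_fetch_add(p, n) + n;
}

static unsigned long or_return_new(atomic_ulong *p, unsigned long bits)
{
    return atomic_fetch_or(p, bits) | bits;
}

static unsigned long and_return_new(atomic_ulong *p, unsigned long bits)
{
    return atomic_fetch_and(p, bits) & bits;
}

/* The reverse direction exists only for invertible operations. */
static unsigned long old_from_new_add(unsigned long new_value, unsigned long n)
{
    return new_value - n;
}

static unsigned long old_from_new_xor(unsigned long new_value, unsigned long bits)
{
    return new_value ^ bits;
}

/* There is no old_from_new_or or old_from_new_and:
   0x1 | 0x3 and 0x2 | 0x3 both give 0x3, so the old value is lost. */

/* Small self-check; returns 0 when the helpers behave as described. */
static int old_new_demo(void)
{
    atomic_ulong v;
    atomic_init(&v, 5);
    assert(add_return_new(&v, 3) == 8);      /* old 5, new 8 */
    assert(old_from_new_add(8, 3) == 5);     /* old recovered from new */
    assert(or_return_new(&v, 0x10) == 0x18); /* old 0x8, new 0x18 */
    return 0;
}
```

This is why an instruction or intrinsic that hands back only the new value is enough for add/sub/inc/dec/xor, while or/and need either the old value or a compare/exchange loop.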