[M3devel] Per thread data

Darko darko at darko.org
Wed Sep 15 05:00:34 CEST 2010


No, only the subtype has to be revealed. I think both approaches have their uses; Get/Set is useful for libraries.

On 14/09/2010, at 7:23 PM, Jay K wrote:

> It's not a hash lookup. It is a direct array index.
> Disassemble kernel32!TlsGetValue.
> It's much cheaper than a hash lookup, much more expensive than reading a global or local.
>  
>  
> on Windows 7 x86:
>  
> \bin\x86\cdb cmd
> 0:000> u kernel32!TlsGetValue
> kernel32!TlsGetValue:
> 750111cd ff2510080175    jmp     dword ptr [kernel32!_imp__TlsGetValue (75010810)]
> 
> 0:000> u poi(75010810)
> KERNELBASE!TlsGetValue:
> 76532c95 8bff            mov     edi,edi
> 76532c97 55              push    ebp
> 76532c98 8bec            mov     ebp,esp
> 76532c9a 64a118000000    mov     eax,dword ptr fs:[00000018h] ; get per thread data base
> 76532ca0 8b4d08          mov     ecx,dword ptr [ebp+8]
> 76532ca3 83603400        and     dword ptr [eax+34h],0
> 76532ca7 83f940          cmp     ecx,40h ; compare index to 64
> 76532caa 7309            jae     KERNELBASE!TlsGetValue+0x20 (76532cb5) ; if above, goto 76532cb5
> 76532cac 8b8488100e0000  mov     eax,dword ptr [eax+ecx*4+0E10h] ; get the actual value
> 76532cb3 eb14            jmp     KERNELBASE!TlsGetValue+0x34 (76532cc9) ; goto end
> 76532cb5 81f940040000    cmp     ecx,440h ; compare to 1088
> 76532cbb 7210            jb      KERNELBASE!TlsGetValue+0x38 (76532ccd) ; if below, goto 76532ccd
> 76532cbd 680d0000c0      push    0C000000Dh  ; invalid parameter
> 76532cc2 e86b390200      call    KERNELBASE!BaseSetLastNTError (76556632)
> 76532cc7 33c0            xor     eax,eax ; return 0 for failure or for TlsSetValue not called 
> 76532cc9 5d              pop     ebp
> 76532cca c20400          ret     4
> 76532ccd 8b80940f0000    mov     eax,dword ptr [eax+0F94h] ; get data base for indexes >= 64
> 76532cd3 85c0            test    eax,eax ; compare to null
> 76532cd5 74f0            je      KERNELBASE!TlsGetValue+0x32 (76532cc7) ; if null, goto 76532cc7, which returns 0; this is if you have called TlsAlloc but not TlsSetValue
> 76532cd7 8b848800ffffff  mov     eax,dword ptr [eax+ecx*4-100h] ; get the value for index >= 64 (subtracting 64*4)
> 76532cde ebe9            jmp     KERNELBASE!TlsGetValue+0x34 (76532cc9) ; goto end
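
The control flow of the listing above can be restated in C as a rough sketch. The struct layout and the names ThreadBlock, slot_get and slot_set are hypothetical; only the bounds (64 inline slots, 1088 total) and the NULL check on the expansion array come from the disassembly. The setter is not in the listing at all and is included only so the getter can be exercised; the last-error bookkeeping is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { INLINE_SLOTS = 64, EXPANSION_SLOTS = 1024 }; /* 64 + 1024 = 1088 = 440h */

/* Hypothetical stand-in for the per-thread block reached through fs:[18h]. */
typedef struct ThreadBlock {
    void  *inline_slots[INLINE_SLOTS]; /* plays the role of [eax+ecx*4+0E10h] */
    void **expansion_slots;            /* plays the role of [eax+0F94h]; may be NULL */
} ThreadBlock;

/* Mirrors the listing: a bounds check, then a direct array index. */
void *slot_get(const ThreadBlock *tb, unsigned index)
{
    if (index < INLINE_SLOTS)
        return tb->inline_slots[index];
    if (index >= INLINE_SLOTS + EXPANSION_SLOTS)
        return NULL;                   /* the real code also records an error */
    if (tb->expansion_slots == NULL)
        return NULL;                   /* nothing was ever stored in this range */
    return tb->expansion_slots[index - INLINE_SLOTS];
}

/* Assumed counterpart, not taken from the listing. Returns 1 on success. */
int slot_set(ThreadBlock *tb, unsigned index, void *value)
{
    if (index < INLINE_SLOTS) {
        tb->inline_slots[index] = value;
        return 1;
    }
    if (index >= INLINE_SLOTS + EXPANSION_SLOTS)
        return 0;
    if (tb->expansion_slots == NULL) {
        /* Allocate the expansion array lazily, zero-filled. */
        tb->expansion_slots = calloc(EXPANSION_SLOTS, sizeof(void *));
        if (tb->expansion_slots == NULL)
            return 0;
    }
    tb->expansion_slots[index - INLINE_SLOTS] = value;
    return 1;
}
```

Either path costs a couple of compares and one or two loads, which is consistent with "cheaper than a hash lookup, more expensive than a global".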
> 
>  
>  
> But your proposal might be reasonable anyway.
> Except, wouldn't Thread.T have to be revealed in an .i3 file?
>  
>  
>  - Jay
> 
> 
>  
> > From: darko at darko.org
> > Date: Tue, 14 Sep 2010 17:38:20 -0700
> > To: jay.krell at cornell.edu
> > CC: m3devel at elegosoft.com
> > Subject: Re: [M3devel] Per thread data
> > 
> > The issue I see is performance. That requires at least a hash lookup and will have performance nothing like a global variable.
> > 
> > I'd like to change the Thread interface so that Fork takes a parameter of a typecode which must be a subtype of Thread.T and allocates that if specified. Assuming Thread.Self() is not slow that should perform much better. Anyone see any problems with that?
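
A rough C analogue of the subtype idea, as a sketch only. Thread stands in for Thread.T, ParserThread is a hypothetical client subtype, and thread_enter/thread_self are made-up names for whatever the runtime would do at Fork and in Thread.Self(). In Modula-3 the downcast would presumably be a NARROW rather than a C cast.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for Thread.T: the part the thread runtime knows about. */
typedef struct Thread {
    int id;
} Thread;

/* Hypothetical client subtype: the base must be the first member so a
   pointer to the subtype can be converted to and from a pointer to Thread. */
typedef struct ParserThread {
    Thread base;
    int depth; /* per-thread state lives directly in the thread object */
} ParserThread;

/* What Thread.Self() would return; one thread-local pointer per thread. */
static _Thread_local Thread *current_thread;

/* Assumed hook: the runtime would call this when the forked thread starts. */
void thread_enter(Thread *t) { current_thread = t; }

Thread *thread_self(void) { return current_thread; }

/* Access is one pointer load plus a field offset; no slot index involved. */
int bump_depth(void)
{
    ParserThread *self = (ParserThread *)thread_self();
    return ++self->depth;
}
```

The cost of reaching the per-thread fields is then whatever thread_self costs, which is why the proposal hinges on Thread.Self() not being slow.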
> > 
> > 
> > On 14/09/2010, at 7:05 AM, Jay K wrote:
> > 
> > > 
> > > Eh? Just one thread local for the entire process? I think not.
> > > 
> > > More like:
> > > 
> > > > PROCEDURE AllocateThreadLocal(): INTEGER;
> > > > PROCEDURE GetThreadLocal(key: INTEGER): REFANY;
> > > > PROCEDURE SetThreadLocal(key: INTEGER; value: REFANY);
> > > > (* parameter names key and value are illustrative *)
> > > 
> > > 
> > > or ThreadLocalAllocate, ThreadLocalGet, ThreadLocalSet.
> > > The first set of names sounds better, the second "scales" better.
> > > > This seems like a constant dilemma.
> > > 
> > > btw, important point I just remembered: unless you do extra work,
> > > thread locals are hidden from the garbage collector.
> > > 
> > > This is why the thread implementations seemingly store extra data.
> > > The traced data is in globals, so the garbage collector can see them.
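
The Allocate/Get/Set shape above could look roughly like this in C. It is a sketch under assumptions: the names are transliterations, MAX_THREAD_LOCALS is an arbitrary limit, and slot allocation is assumed to happen before any threads are forked, so the counter is not locked. Note that the slot array is exactly the kind of storage a collector would not see without extra work.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_THREAD_LOCALS = 64 }; /* arbitrary limit chosen for this sketch */

/* Process-wide count of slots handed out. Assumed to change only during
   startup, before other threads exist, so it is left unsynchronized here. */
static int next_slot = 0;

/* One array per thread; every entry starts out NULL in each new thread. */
static _Thread_local void *slots[MAX_THREAD_LOCALS];

/* Returns a fresh slot index, or -1 when the table is full. */
int allocate_thread_local(void)
{
    if (next_slot >= MAX_THREAD_LOCALS)
        return -1;
    return next_slot++;
}

void *get_thread_local(int index)
{
    assert(index >= 0 && index < next_slot);
    return slots[index];
}

void set_thread_local(int index, void *value)
{
    assert(index >= 0 && index < next_slot);
    slots[index] = value;
}
```

A library would call allocate_thread_local once, keep the returned index in a global, and use it from every thread.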
> > > 
> > > - Jay
> > > 
> > > ________________________________
> > >> From: darko at darko.org
> > >> Date: Tue, 14 Sep 2010 06:13:26 -0700
> > >> To: jay.krell at cornell.edu
> > >> CC: m3devel at elegosoft.com
> > >> Subject: Re: [M3devel] Per thread data
> > >> 
> > >> I think a minimalist approach where you get to store and retrieve one
> > >> traced reference per thread would do the trick. If people want more
> > >> they can design their own abstraction on top of that. Maybe just add
> > >> the following to the Thread interface:
> > >> 
> > >> PROCEDURE GetPrivate(): REFANY;
> > >> PROCEDURE SetPrivate(ref: REFANY);
> > >> 
> > >> 
> > >> On 14/09/2010, at 5:59 AM, Jay K wrote:
> > >> 
> > >> Tony -- then why does pthread_get/setspecific and Win32 TLS exist?
> > >> What language that doesn't support heap allocation were they designed to support?
> > >> It is because code often fails to pass all the parameters through all
> > >> functions.
> > >> 
> > >> Again the best current answer is:
> > >> #ifdefed C that uses pthread_get/setspecific / Win32
> > >> TlsAlloc/GetValue/SetValue, ignoring user threads/OpenBSD.
> > >> 
> > >> As well, you'd get very very far with merely:
> > >> #ifdef _WIN32
> > >> __declspec(thread)
> > >> #else
> > >> __thread
> > >> #endif
> > >> 
> > >> Those work adequately for many many purposes, are more efficient, much
> > >> more convenient, and very portable.
> > >> I believe there is even an "official" C++ proposal along these lines.
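
For concreteness, the #ifdef above wrapped into a macro and applied to a variable. THREAD_LOCAL, call_count and note_call are illustrative names, not from any existing header; this is a sketch, not a portability guarantee.

```c
#include <assert.h>

/* Pick the compiler's spelling for a thread-local storage class. */
#ifdef _WIN32
#define THREAD_LOCAL __declspec(thread)
#else
#define THREAD_LOCAL __thread
#endif

/* Each thread gets its own zero-initialized copy of this counter. */
static THREAD_LOCAL unsigned long call_count;

/* Counts calls made by the calling thread only; no locking is needed
   because no other thread can reach this thread's copy by name. */
unsigned long note_call(void)
{
    return ++call_count;
}
```

Access reads like an ordinary global, which is where the convenience comes from.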
> > >> 
> > >> We could easily abstract this -- the first -- into Modula-3 and then
> > >> support it on user threads as well.
> > >> Can anyone propose something?
> > >> It has to go in m3core, as that is the only code that (is supposed to)
> > >> know which thread implementation is in use.
> > >> 
> > >> - Jay
> > >> 
> > >> 
> > >>> From: darko at darko.org
> > >>> Date: Tue, 14 Sep 2010 05:34:59 -0700
> > >>> To: hosking at cs.purdue.edu
> > >>> CC: m3devel at elegosoft.com
> > >>> Subject: Re: [M3devel] Per thread data
> > >>> 
> > >>> That's the idea but each object can only call another object
> > >>> allocated for the same thread, so it needs to find the currently
> > >>> running thread's copy of the desired object.
> > >>> 
> > >>> On 14/09/2010, at 5:08 AM, Tony Hosking wrote:
> > >>> 
> > >>>> If they are truly private to each thread, then allocating them in
> > >>>> the heap while still not locking them would be adequate. Why not?
> > >>>> 
> > >>>> On 14 Sep 2010, at 01:08, Darko wrote:
> > >>>> 
> > >>>>> I have lots of objects that are implemented on the basis that no
> > >>>>> calls on them can be re-entered, which also avoids the need for locking
> > >>>>> them in a threaded environment, which is impractical. The result is
> > >>>>> that I need one copy of each object in each thread. There is
> > >>>>> approximately one allocated object per object type so space is not a
> > >>>>> big issue. I'm looking at a small number of threads, probably maximum
> > >>>>> two per processor core. With modern processors I'm assuming that a
> > >>>>> linear search through a small array is actually quicker than a hash
> > >>>>> table.
> > >>>>> 
> > >>>>> On 13/09/2010, at 9:55 PM, Mika Nystrom wrote:
> > >>>>> 
> > >>>>>> Darko writes:
> > >>>>>>> I need to have certain data structures allocated on a per thread
> > >>>>>>> basis. Right now I'm thinking of using the thread id from
> > >>>>>>> ThreadF.MyId() to index a list. Is there a better, more portable
> > >>>>>>> way of allocating on a per-thread basis?
> > >>>>>>> 
> > >>>>>>> Cheers,
> > >>>>>>> Darko.
> > >>>>>> 
> > >>>>>> In my experience what you suggest works just fine (remember to lock the
> > >>>>>> doors, though!). But you can get disappointing performance on some
> > >>>>>> thread implementations (ones that involve switching into supervisor
> > >>>>>> mode more than necessary when accessing pthread structures).
> > >>>>>> 
> > >>>>>> Generally speaking I avoid needing per-thread structures as much
> > >>>>>> as possible and instead put what you need in the Closure and then
> > >>>>>> pass pointers around.
> > >>>>>> Of course you can mix the methods for a compromise between speed and
> > >>>>>> cluttered code...
> > >>>>>> 
> > >>>>>> I think what you want is also not a list but a Table.
> > >>>>>> 
> > >>>>>> Mika
> > >>>>> 
> > >>>> 
> > >>> 
> > >> 
> > > 
> > 
