[M3devel] Per thread data

Jay K jay.krell at cornell.edu
Tue Sep 14 22:38:24 CEST 2010


Thread.Local ?

Also, if you are doing this, can you add NoInline?
We are letting gcc be more aggressive lately.
I want NoInline for the toplevel function in
the thread implementations.


I could probably fake it by making a call
to <*EXTERN*> setjmp.
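
For comparison, this is roughly how the same request is spelled in C for gcc and MSVC. It is only a sketch: `NOINLINE`, `thread_base`, and `add_one` are hypothetical names for illustration, not anything from m3core, and a Modula-3 pragma would still need its own front-end and m3cg support.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: pick the compiler-specific "do not inline" spelling. */
#if defined(_MSC_VER)
#define NOINLINE __declspec(noinline)
#else
#define NOINLINE __attribute__((noinline))
#endif

/* Hypothetical stand-in for a thread's body. */
static int add_one(int x)
{
    return x + 1;
}

/* Hypothetical stand-in for the top-level function of a thread
   implementation. NOINLINE asks the compiler to keep it as a real,
   separate function even when optimizing aggressively. */
NOINLINE static int thread_base(int (*body)(int), int arg)
{
    assert(body != NULL);
    return body(arg);
}
```

Calling `thread_base(add_one, 41)` behaves like any other call; the attribute only constrains the optimizer.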


It might also be nice to parallel more of C/gcc's features.
<*volatile*> ?


And if you are really aggressive, since these
involve m3cg changes, move any typeids
that mangle names to be immediately after
the name they mangle??
Maybe that is overkill though.
It's just for tracing of mangled names via infrastructure.


Also, load/store should take some notion of a field.
A name or a uid perhaps.


But now I'm being greedy.
I'm willing to first prototype field access via
linear search.


Mika, I see your point. We could consider both.
I can maybe do the library soon. Or someone else can. It's pretty easy.
It really should be mostly #ifdefed C, plus either:


   1) using UNTRACED ROOT or INTEGER instead of REFANY for the data.
      You could loophole between untraced root and integer.
      Safety would imply providing integer and float/real options to safe code.
      Possibly double/longreal (which would take two slots on 32bit targets).
      I think the functions could be exposed to safe code. They'd raise
      exceptions for invalid tls indices.


   2) some way to reveal the REFANYs to the garbage collector
      2a) possibly by stashing them all in globals as well
      2b) or by having the garbage collector use the per thread data functions,
          with the caveat that they can't be used across threads, which might
          be a problem; a thread could perhaps fetch all of its per thread
          data into locals before acknowledging suspension.
      Many approaches would require the runtime to
      know all the per thread data keys/indices, which I suspect is easy enough.
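
A minimal sketch of the "#ifdefed C" core for option 1, assuming integer-sized slots. The `tls_*` names and the 0/-1 return convention are hypothetical; only the pthread and Win32 calls themselves are real APIs. It does not cover user threads, and it does not try to detect invalid indices, which a Modula-3 wrapper would have to turn into exceptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
typedef DWORD tls_key_t;          /* hypothetical typedef */
#else
#include <pthread.h>
typedef pthread_key_t tls_key_t;  /* hypothetical typedef */
#endif

/* Allocate a new per-thread slot. Returns 0 on success, -1 on failure. */
static int tls_alloc(tls_key_t *key)
{
    assert(key != NULL);
#ifdef _WIN32
    *key = TlsAlloc();
    return *key == TLS_OUT_OF_INDEXES ? -1 : 0;
#else
    return pthread_key_create(key, NULL) == 0 ? 0 : -1;
#endif
}

/* Store an integer-sized value in the calling thread's copy of the slot. */
static int tls_set(tls_key_t key, intptr_t value)
{
#ifdef _WIN32
    return TlsSetValue(key, (LPVOID)value) ? 0 : -1;
#else
    return pthread_setspecific(key, (void *)value) == 0 ? 0 : -1;
#endif
}

/* Fetch the calling thread's value; a slot never set in this thread reads as 0. */
static intptr_t tls_get(tls_key_t key)
{
#ifdef _WIN32
    return (intptr_t)TlsGetValue(key);
#else
    return (intptr_t)pthread_getspecific(key);
#endif
}
```

Because the values are plain integers rather than REFANY, nothing here needs to be revealed to the garbage collector; that is the trade-off option 1 makes against option 2.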


I'd appreciate a fairly firm commitment that it will actually be used.
Thanks.


Maybe we can switch m3core to use itself. Maybe.
  (It'd have to not raise exceptions or use try, to avoid circular dependency.)


Maybe just use an array and linear search, in m3core or your code.
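
The array-plus-linear-search fallback could be sketched in C along these lines, assuming pthreads and a small fixed number of threads. `MAX_THREADS`, `struct slot`, `per_thread_get`, and `per_thread_set` are hypothetical names, and the mutex is one simple way to guard the shared table, not a tuned design.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16 /* hypothetical cap for a "small" number of threads */

struct slot {
    pthread_t owner; /* thread that owns this entry */
    void *data;      /* that thread's private data */
    int used;        /* nonzero once the entry is claimed */
};

static struct slot slots[MAX_THREADS];
static pthread_mutex_t slots_lock = PTHREAD_MUTEX_INITIALIZER;

/* Return the calling thread's data, or NULL if it has stored none. */
static void *per_thread_get(void)
{
    pthread_t self = pthread_self();
    void *result = NULL;

    pthread_mutex_lock(&slots_lock);
    for (size_t i = 0; i < MAX_THREADS; i++) {
        if (slots[i].used && pthread_equal(slots[i].owner, self)) {
            result = slots[i].data;
            break;
        }
    }
    pthread_mutex_unlock(&slots_lock);
    return result;
}

/* Store data for the calling thread. Returns 0, or -1 if the table is full. */
static int per_thread_set(void *data)
{
    pthread_t self = pthread_self();
    size_t target = MAX_THREADS; /* MAX_THREADS means "nothing found yet" */
    int rc = -1;

    assert(data != NULL); /* NULL is reserved to mean "no data" */
    pthread_mutex_lock(&slots_lock);
    for (size_t i = 0; i < MAX_THREADS; i++) {
        if (slots[i].used && pthread_equal(slots[i].owner, self)) {
            target = i; /* this thread already has an entry: reuse it */
            break;
        }
        if (!slots[i].used && target == MAX_THREADS)
            target = i; /* remember the first free entry */
    }
    if (target < MAX_THREADS) {
        slots[target].owner = self;
        slots[target].data = data;
        slots[target].used = 1;
        rc = 0;
    }
    pthread_mutex_unlock(&slots_lock);
    return rc;
}
```

The sketch never releases an entry when a thread exits, so a real version would need some cleanup hook.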


 - Jay


----------------------------------------
> To: jay.krell at cornell.edu
> CC: m3devel at elegosoft.com
> Subject: Re: [M3devel] Per thread data
> Date: Tue, 14 Sep 2010 13:28:13 -0700
> From: mika at async.async.caltech.edu
>
> Something that changes the meaning of the language in such a fundamental
> way I think ought to be implemented as a library so that we can simulate
> the behavior in other releases... not as a compiler pragma. Yes I know
> there's an efficiency and expressibility price for this.
>
> Mika
>
> Jay K writes:
> >
> >Sure. <* ThreadLocal *> or such, be sure it mentions "thread".
> >
> > - Jay
> >
> >
> >
> >________________________________
> >> Subject: Re: [M3devel] Per thread data
> >> From: hosking at cs.purdue.edu
> >> Date: Tue, 14 Sep 2010 12:48:22 -0400
> >> CC: darko at darko.org; m3devel at elegosoft.com
> >> To: jay.krell at cornell.edu
> >>
> >> Ah, sorry. I misunderstood. Sounds like we need a thread-local pragma
> >> <*LOCAL*>?
> >>
> >> On 14 Sep 2010, at 08:59, Jay K wrote:
> >>
> >> Tony -- then why do pthread_get/setspecific and Win32 TLS exist?
> >> What language that doesn't support heap allocation were they designed to support?
> >> It is because code often fails to pass all the parameters through all
> >> functions.
> >>
> >> Again the best current answer is:
> >> #ifdefed C that uses pthread_get/setspecific / Win32
> >> TlsAlloc/GetValue/SetValue, ignoring user threads/OpenBSD.
> >>
> >> As well, you'd get very very far with merely:
> >> #ifdef _WIN32
> >> __declspec(thread)
> >> #else
> >> __thread
> >> #endif
> >>
> >> Those work adequately for many many purposes, are more efficient, much
> >> more convenient, and very portable.
> >> I believe there is even an "official" C++ proposal along these lines.
> >>
> >> We could easily abstract this -- the first -- into Modula-3 and then
> >> support it on user threads as well.
> >> Can anyone propose something?
> >> It has to go in m3core, as that is the only code that (is supposed to)
> >> know which thread implementation is in use.
> >>
> >> - Jay
> >>
> >>
> >> > From: darko at darko.org
> >> > Date: Tue, 14 Sep 2010 05:34:59 -0700
> >> > To: hosking at cs.purdue.edu
> >> > CC: m3devel at elegosoft.com
> >> > Subject: Re: [M3devel] Per thread data
> >> >
> >> > That's the idea but each object can only call another object
> >> > allocated for the same thread, so it needs to find the currently
> >> > running thread's copy of the desired object.
> >> >
> >> > On 14/09/2010, at 5:08 AM, Tony Hosking wrote:
> >> >
> >> > > If they are truly private to each thread, then allocating them in
> >> > > the heap while still not locking them would be adequate. Why not?
> >> > >
> >> > > On 14 Sep 2010, at 01:08, Darko wrote:
> >> > >
> >> > >> I have lots of objects that are implemented on the basis that no
> >> > >> calls on them can be re-entered, which also avoids the need for locking
> >> > >> them in a threaded environment, which is impractical. The result is
> >> > >> that I need one copy of each object in each thread. There is
> >> > >> approximately one allocated object per object type so space is not a
> >> > >> big issue. I'm looking at a small number of threads, probably maximum
> >> > >> two per processor core. With modern processors I'm assuming that a
> >> > >> linear search through a small array is actually quicker than a hash
> >> > >> table.
> >> > >>
> >> > >> On 13/09/2010, at 9:55 PM, Mika Nystrom wrote:
> >> > >>
> >> > >>> Darko writes:
> >> > >>>> I need to have certain data structures allocated on a per thread basis.
> >> > >>>> Right now I'm thinking of using the thread id from ThreadF.MyId() to
> >> > >>>> index a list. Is there a better, more portable way of allocating on a
> >> > >>>> per-thread basis?
> >> > >>>>
> >> > >>>> Cheers,
> >> > >>>> Darko.
> >> > >>>
> >> > >>> In my experience what you suggest works just fine (remember to lock the
> >> > >>> doors, though!) But you can get disappointing performance on some thread
> >> > >>> implementations (ones that involve switching into supervisor mode more
> >> > >>> than necessary when accessing pthread structures).
> >> > >>>
> >> > >>> Generally speaking I avoid needing per-thread structures as much as possible
> >> > >>> and instead put what you need in the Closure and then pass pointers around.
> >> > >>> Of course you can mix the methods for a compromise between speed and
> >> > >>> cluttered code...
> >> > >>>
> >> > >>> I think what you want is also not a list but a Table.
> >> > >>>
> >> > >>> Mika
> >> > >>
> >> > >
> >> >
> >>