[M3devel] per thread data?
Jay
jay.krell at cornell.edu
Tue Mar 31 02:06:04 CEST 2009
My point about optimized thread locals was about C and C++.
(And not gcc, at that.)
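To make the distinction concrete, here is a rough sketch of the two kinds of thread local being compared. It assumes a POSIX system and a compiler that accepts the `__thread` keyword (Visual C++ spells it `__declspec(thread)`); the names `fast_slot`, `slow_key` and the four accessor functions are made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Compiler-supported thread local: access is usually a plain load or
 * store at a fixed offset from a per-thread base register. */
static __thread intptr_t fast_slot;

/* General thread local: every access goes through a library call. */
static pthread_key_t slow_key;
static pthread_once_t slow_once = PTHREAD_ONCE_INIT;

static void make_key(void) {
    int rc = pthread_key_create(&slow_key, NULL);
    assert(rc == 0);
    (void)rc;
}

void fast_set(intptr_t v) { fast_slot = v; }
intptr_t fast_get(void) { return fast_slot; }

void slow_set(intptr_t v) {
    pthread_once(&slow_once, make_key);   /* create the key exactly once */
    int rc = pthread_setspecific(slow_key, (void *)v);
    assert(rc == 0);
    (void)rc;
}

intptr_t slow_get(void) {
    pthread_once(&slow_once, make_key);
    return (intptr_t)pthread_getspecific(slow_key);
}
```

Both pairs behave the same from the caller's side; the difference is only in what each access costs.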
Modula-3 on NT is the same as "all" platforms, except, I guess, Solaris/SPARC32.
No stack walker.
Highly inefficient PushFrame/PopFrame using general thread locals (pthread_getspecific/pthread_setspecific or TlsGetValue/TlsSetValue); pthreads and Win32 are very analogous here.
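As a rough sketch of the single-thread-local scheme discussed in the quoted messages below (not the actual runtime code): one pthread key holds a per-thread record allocated on demand with calloc, and the handler chain lives at offset zero of that record. The `Activation` and `Frame` layouts and the function names here are assumptions for illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical handler frame and per-thread record. */
typedef struct Frame { struct Frame *next; } Frame;

typedef struct Activation {
    Frame *handlers;                 /* handler stack, at offset zero */
    struct Activation *next, *prev;  /* circular list of threads */
    pthread_t handle;
} Activation;

static pthread_key_t act_key;
static pthread_once_t act_once = PTHREAD_ONCE_INIT;

static void act_key_init(void) {
    /* free() runs at thread exit to release the record. */
    int rc = pthread_key_create(&act_key, free);
    assert(rc == 0);
    (void)rc;
}

/* One getspecific per call; allocation happens only on first use,
 * so threads not created by the runtime also get a record. */
Activation *get_activation(void) {
    pthread_once(&act_once, act_key_init);
    Activation *a = pthread_getspecific(act_key);
    if (a == NULL) {
        a = calloc(1, sizeof *a);
        assert(a != NULL);
        a->handle = pthread_self();
        a->next = a->prev = a;  /* self-linked; a real runtime would
                                   splice into a global list under a lock */
        int rc = pthread_setspecific(act_key, a);
        assert(rc == 0);
        (void)rc;
    }
    return a;
}

void push_frame(Frame *f) {
    Activation *a = get_activation();
    f->next = a->handlers;
    a->handlers = f;
}

void pop_frame(void) {
    Activation *a = get_activation();
    assert(a->handlers != NULL);
    a->handlers = a->handlers->next;
}
```

With this shape, push and pop each cost one pthread_getspecific plus a pointer dereference, and setspecific is only called once per thread.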
There is only NT/x86 for Modula-3 so far.
I'm not sure gcc for NT/amd64 is mature enough, and haven't seen any signs of NT/IA64 (nor Alpha/PPC/MIPS...).
gcc on NT/x86 has its own other two EH mechanisms -- setjmp/longjmp and I presume a stack walking implementation.
Most other compilers are like Visual C++ -- e.g. OpenWatcom and DigitalMars.
CodeWarrior had two settings, I think one matched Visual C++.
There should really be just one implementation of this across all languages.
NT's setjmp/longjmp do interoperate with exceptions at least, so you can use a portable slow form and still interoperate. Well, there are two versions: one that does and one that doesn't. I need to switch Modula-3 to the interoperable form.
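For reference, the "portable slow form" looks roughly like the following in plain C. This is only a sketch of the general setjmp/longjmp technique, with made-up names (`TryFrame`, `raise_exc`, `run_guarded`); it says nothing about how NT's interoperable variant is implemented, and it assumes a compiler that accepts `__thread`.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* One frame per active "try"; frames form a per-thread chain. */
typedef struct TryFrame {
    struct TryFrame *next;
    jmp_buf env;
    volatile int code;   /* volatile: written between setjmp and longjmp */
} TryFrame;

static __thread TryFrame *top;

/* "raise": pop the innermost frame and jump back to its setjmp. */
static void raise_exc(int code) {
    TryFrame *f = top;
    assert(f != NULL);   /* no handler installed */
    top = f->next;
    f->code = code;
    longjmp(f->env, 1);
}

static void risky(int code) {
    if (code != 0) raise_exc(code);
}

/* "try": returns 0 if the body finished, else the raised code. */
int run_guarded(int code) {
    TryFrame f;
    f.code = 0;
    f.next = top;
    top = &f;                    /* push: paid on every try */
    if (setjmp(f.env) == 0) {
        risky(code);
        top = f.next;            /* pop on normal exit */
        return 0;
    }
    return f.code;               /* reached via longjmp; frame already popped */
}
```

The cost is paid on every entry to a try, whether or not anything is raised, which is what a stack walker avoids.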
- Jay
> CC: m3devel at elegosoft.com
> From: hosking at cs.purdue.edu
> To: jay.krell at cornell.edu
> Subject: Re: [M3devel] per thread data?
> Date: Tue, 31 Mar 2009 09:42:06 +1100
>
> Yes, this is a tricky issue. At some point I seem to recall it being
> OK to have non-Modula-3 threads start running Modula-3 code, but I
> don't know for sure. I've never really liked the idea of having non-
> M3 threads.
>
> Are you using the existing handler maps and exception stack unwinding
> support for non-x86 NT?
>
> On 31 Mar 2009, at 09:15, Jay wrote:
>
> >
> > hm, thinking about this more...
> > What about threads not created by Modula-3 Fork() (or the first
> > thread)?
> >
> > It looks like exception handling had a chance of working on them
> > before. Now they'll crash upon entering functions
> > with try or raise or I presume lock.
> >
> >
> > 1) ok?
> >
> >
> > 2) do the heap alloc on demand?
> > But is that enough? Can it be initialized without further context?
> > Let's see..the circular list can be maintained without further
> > context.
> > handle := pthread_self, ok. stack can probably be figured out, though
> > that is probably just for gc and could be left alone for now, continuing
> > to not work (or fixed)...getcontext at least on some platforms can
> > fill this in, or VirtualQuery/msomething (mmap family?)?
> >
> >
> > 3) put back the second thread local?
> >
> >
> > #2 has a chance of working better than before -- letting GC
> > work on threads not created by Modula-3 runtime, something
> > that has long bothered me...but I haven't done a complete analysis.
> > Or at least maybe keep it working as it was.
> > For now there is somewhat of a regression, ie, when calling
> > Modula-3 code on threads not created from Modula-3.
> > Possibly the gc in this case was already dangerous?
> > Failing to find references on other stacks?
> > Or failing all allocations (should be easy to check but I have to
> > run..)
> >
> >
> > - Jay
> >
> > ----------------------------------------
> >> From: jay.krell at cornell.edu
> >> To: hosking at cs.purdue.edu
> >> CC: m3devel at elegosoft.com
> >> Subject: RE: [M3devel] per thread data?
> >> Date: Mon, 30 Mar 2009 13:23:10 +0000
> >>
> >>
> >> This was surprisingly difficult.
> >>
> >>
> >> InitHandlers is called much earlier than InitActivations.
> >> InitActivations does a heap allocation.
> >> InitHandlers did not.
> >> The types involved are not yet initialized at this point, or somesuch.
> >> You cannot NEW(Activation) in the first call to PushFrame.
> >> So, maybe, use a global for the first one,
> >> but then what happens is it gets reinitialized later by
> >> the module initializer -- which is perhaps another indictment
> >> of initializers. Or maybe a special case in the depths of the system:
> >> this module and anything it uses are subject to be called by
> >> compiler-generated calls, so they can be called before their
> >> initializers run. Seems to me the initialization could have happened
> >> "statically" like in C.
> >>
> >>
> >> Anyway, I should have this done shortly.
> >> Trick is to use a local value and assign it to a heap block
> >> allocated directly with calloc instead of RTAllocator.
> >>
> >>
> >> The result is maybe faster, maybe slower.
> >> Before, "try" cost pthread_getspecific and setspecific.
> >> Now it will just cost getspecific.
> >> But with another pointer deref and call to GetActivation
> >> with its on-demand initialization.
> >>
> >>
> >> Before, popframe only called setspecific.
> >> Now it will only call getspecific, plus the indirect
> >> and on-demand initialization.
> >> The on-demand seems bogus in pop, given that push already had to occur.
> >> So maybe that could be optimized.
> >>
> >>
> >> This stuff is highly optimized in C and C++ on NT..
> >> NT/x86 has a special thread local just for exception handling,
> >> faster than all other thread locals.
> >> All non-x86 NT platforms have stack walkers -- no cost for "try",
> >> and then "throw" maps the instruction pointer to data about how to
> >> unwind the stack, using a little mini-assembly code.
> >>
> >>
> >> - Jay
> >>
> >>
> >> ________________________________
> >>> From: jay.krell at cornell.edu
> >>> To: hosking at cs.purdue.edu
> >>> Date: Thu, 19 Mar 2009 01:03:57 +0000
> >>> CC: m3devel at elegosoft.com
> >>> Subject: Re: [M3devel] per thread data?
> >>>
> >>> Thanks, I should get around to that "soon" then.
> >>>
> >>>
> >>>
> >>> - Jay
> >>>
> >>>
> >>>
> >>> ________________________________
> >>>
> >>> From: hosking at cs.purdue.edu
> >>> To: jay.krell at cornell.edu
> >>> Date: Thu, 19 Mar 2009 10:14:59 +1100
> >>> CC: m3devel at elegosoft.com
> >>> Subject: Re: [M3devel] per thread data?
> >>>
> >>> I have no problem putting the exception handler stack thread local
> >>> into the activation thread local.
> >>>
> >>> On 18 Mar 2009, at 20:11, Jay wrote:
> >>>
> >>>
> >>>
> >>> I'm not looking at it right now, but doesn't it seem rather piggy to
> >>> have two thread locals and data on the side?
> >>>
> >>>
> >>> I'm guessing the data on the side is needed because we need to be
> >>> able to enumerate our threads, to suspend them all?
> >>>
> >>>
> >>> I understand that having multiple thread locals optimizes their
> >>> use, but it seems greedy vs. a small heap allocation that combines
> >>> them.
> >>>
> >>> Or in fact.. presumably there could just be one thread local that
> >>> is the thread pointer, and the handler link could be put at the
> >>> start, for architectures where zero offset is smaller/faster than
> >>> non-zero offset.
> >>>
> >>>
> >>> Another idea, of course, is to look into "__thread",
> >>> "__declspec(thread)".
> >>>
> >>> On Windows and probably all platforms they exist on, they are
> >>> nicely more efficient than pthread_get/setspecific, except on
> >>> Windows they don't really work acceptably prior to Vista -- they
> >>> only work in .exes and their static dependencies, not any .dll you
> >>> load after the process starts with LoadLibrary (dlopen).
> >>>
> >>>
> >>> Does "__thread" work well on most non-Windows platforms?
> >>> i.e. even if shared object is loaded with dlopen?
> >>>
> >>>
> >>> I could have sworn I saw code out there that was "adaptive".
> >>> It easily/efficiently checked if it was loaded with LoadLibrary or
> >>> not.
> >>> If so, it'd TlsGet/SetValue (pthread_get/setspecific).
> >>> If not, it'd use __declspec(thread) (__thread).
> >>> The check was based on if __tlsindex was not zero or somesuch. I
> >>> couldn't track it down though.
> >>>
> >>>
> >>> In either case, yes, I know, one of the thread locals at least is
> >>> gone on platforms that have stack walkers, e.g. Solaris, and
> >>> potentially NT, and maybe others.
> >>>
> >>>
> >>> - Jay
> >>>
> >>>
>