[M3devel] per thread data?

Tony Hosking hosking at cs.purdue.edu
Tue Mar 31 00:44:02 CEST 2009


I am a little uneasy about calloc being used instead of RTAllocator.   
We lose the single point of allocation that is useful for all sorts of  
things like accounting, etc.  I'll take a look at what you have done  
and think about it some...
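
To make the concern concrete, here is a minimal C sketch. The names
counted_calloc, allocated_so_far and bytes_allocated are hypothetical and
are not part of the Modula-3 runtime. When every allocation funnels
through one routine, the bookkeeping lives in one place; a direct calloc
elsewhere simply goes unseen.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical running total. Not thread-safe; a real allocator would
   need a lock or an atomic counter. */
static size_t bytes_allocated;

/* Single point of allocation: every caller that goes through here is
   counted. */
void *counted_calloc(size_t count, size_t size)
{
    void *p = calloc(count, size);
    if (p != NULL)
        bytes_allocated += count * size;
    return p;
}

/* Total requested through counted_calloc so far. */
size_t allocated_so_far(void)
{
    return bytes_allocated;
}
```

A block obtained with plain calloc never shows up in allocated_so_far(),
which is the loss of accounting described above.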

On 31 Mar 2009, at 09:15, Jay wrote:

>
> hm, thinking about this more...
> What about threads not created by Modula-3 Fork() (or the first thread)?
>
>
> It looks like exception handling had a chance of working on them
> before. Now they'll crash upon entering functions
> with TRY or RAISE or, I presume, LOCK.
>
>
> 1) ok?
>
>
> 2) do the heap alloc on demand?
>  But is that enough? Can it be initialized without further context?
>  Let's see... the circular list can be maintained without further context.
>  handle := pthread_self, ok. The stack can probably be figured out, though
>  that is probably just for gc and could be left alone for now, continuing
>  to not work (or fixed)... getcontext, at least on some platforms, can
>  fill this in, or VirtualQuery/msomething (mmap family?)?
>
>
> 3) put back the second thread local?
>
>
> #2 has a chance of working better than before -- letting GC
> work on threads not created by the Modula-3 runtime, something
> that has long bothered me... but I haven't done a complete analysis.
> Or at least maybe keep it working as it was.
> For now there is somewhat of a regression, i.e., when calling
> Modula-3 code on threads not created from Modula-3.
> Possibly the gc in this case was already dangerous?
> Failing to find references on other stacks?
> Or failing all allocations (should be easy to check but I have to run..)
>
>
> - Jay
>
> ----------------------------------------
>> From: jay.krell at cornell.edu
>> To: hosking at cs.purdue.edu
>> CC: m3devel at elegosoft.com
>> Subject: RE: [M3devel] per thread data?
>> Date: Mon, 30 Mar 2009 13:23:10 +0000
>>
>>
>> This was surprisingly difficult.
>>
>>
>> InitHandlers is called much earlier than InitActivations.
>> InitActivations does a heap allocation.
>> InitHandlers did not.
>> The types involved are not yet initialized at this point, or somesuch.
>> You cannot NEW(Activation) in the first call to PushFrame.
>> So, maybe, use a global for the first one,
>> but then what happens is it gets reinitialized later by
>> the module initializer -- which is perhaps another indictment
>> of initializers... or maybe a special case in the depths of the system --
>> this module and anything it uses are subject to be called by
>> compiler-generated calls -- they can be called before their initializers
>> run... seems to me the initialization could have happened "statically"
>> like in C.
>>
>>
>> Anyway, I should have this done shortly.
>> Trick is to use a local value and assign it to a heap block
>> allocated directly with calloc instead of RTAllocator.
>>
>>
>> The result is maybe faster, maybe slower.
>> Before, "try" cost pthread_getspecific and setspecific.
>> Now it will just cost getspecific.
>> But with another pointer deref and call to GetActivation
>> with its on-demand initialization.
>>
>>
>> Before, PopFrame only called setspecific.
>> Now it will only call getspecific, plus the indirection
>> and on-demand initialization.
>> The on-demand check seems bogus in pop, given that push already had to occur.
>> So maybe that could be optimized.
>>
>>
>> This stuff is highly optimized in C and C++ on NT..
>> NT/x86 has a special thread local just for exception handling,
>> faster than all other thread locals.
>> All non-x86 NT platforms have stack walkers -- no cost for "try",
>> and then "throw" maps the instruction pointer to data about how
>> to unwind the stack, using a little mini-assembly code.
>>
>>
>> - Jay
>>
>>
>> ________________________________
>>> From: jay.krell at cornell.edu
>>> To: hosking at cs.purdue.edu
>>> Date: Thu, 19 Mar 2009 01:03:57 +0000
>>> CC: m3devel at elegosoft.com
>>> Subject: Re: [M3devel] per thread data?
>>>
>>>
>>> Thanks, I should get around to that "soon" then.
>>>
>>>
>>>
>>> - Jay
>>>
>>>
>>>
>>> ________________________________
>>>
>>> From: hosking at cs.purdue.edu
>>> To: jay.krell at cornell.edu
>>> Date: Thu, 19 Mar 2009 10:14:59 +1100
>>> CC: m3devel at elegosoft.com
>>> Subject: Re: [M3devel] per thread data?
>>>
>>> I have no problem putting the exception handler stack thread local  
>>> into the activation thread local.
>>>
>>>
>>> On 18 Mar 2009, at 20:11, Jay wrote:
>>>
>>>
>>>
>>> I'm not looking at it right now, but doesn't it seem rather piggy to  
>>> have two thread locals and data on the side?
>>>
>>>
>>> I'm guessing the data on the side is needed because we need to be  
>>> able to enumerate our threads, to suspend them all?
>>>
>>>
>>> I understand that having multiple thread locals optimizes their  
>>> use, but it seems greedy vs. a small heap allocation that combines them.
>>>
>>> Or in fact.. presumably there could just be one thread local that  
>>> is the thread pointer, and the handler link could be put at the  
>>> start, for architectures where zero offset is smaller/faster than  
>>> non-zero offset.
>>>
>>>
>>> Another idea, of course, is to look into "__thread",  
>>> "__declspec(thread)".
>>>
>>> On Windows and probably all platforms they exist on, they are  
>>> nicely more efficient than pthread_get/setspecific, except on  
>>> Windows they don't really work acceptably prior to Vista -- they  
>>> only work in .exes and their static dependencies, not any .dll you  
>>> load after the process starts with LoadLibrary (dlopen).
>>>
>>>
>>> Does "__thread" work well on most non-Windows platforms?
>>> i.e. even if shared object is loaded with dlopen?
>>>
>>>
>>> I could have sworn I saw code out there that was "adaptive".
>>> It easily/efficiently checked if it was loaded with LoadLibrary or  
>>> not.
>>> If so, it'd TlsGet/SetValue (pthread_get/setspecific).
>>> If not, it'd use __declspec(thread) (__thread).
>>> The check was based on if __tlsindex was not zero or somesuch. I  
>>> couldn't track it down though.
>>>
>>>
>>> In either case, yes, I know, one of the thread locals at least is  
>>> gone on platforms that have stack walkers, e.g. Solaris, and  
>>> potentially NT, and maybe others.
>>>
>>>
>>> - Jay
>>>
>>>
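
The combined layout discussed in the thread (one thread local pointing at
a small heap block, handler link at offset zero, block created on demand
with calloc) might look roughly like the following C sketch. All type and
function names here (frame_t, activation_t, get_activation, push_frame,
pop_frame) are hypothetical stand-ins, not the runtime's actual
declarations, and stack bounds for the collector are left out entirely.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical types: illustrative only. */
typedef struct frame {
    struct frame *next;            /* enclosing handler frame */
} frame_t;

typedef struct activation {
    frame_t *handlers;             /* handler chain head, kept at offset zero */
    struct activation *next;       /* circular list of known threads */
    struct activation *prev;
    pthread_t handle;
} activation_t;

static pthread_key_t activation_key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static activation_t *all_threads;  /* some node of the circular list, or NULL */

/* Key destructor: unlink and free the block when its thread exits. */
static void release_activation(void *p)
{
    activation_t *a = p;
    pthread_mutex_lock(&list_lock);
    if (a->next == a) {
        all_threads = NULL;
    } else {
        a->prev->next = a->next;
        a->next->prev = a->prev;
        if (all_threads == a)
            all_threads = a->next;
    }
    pthread_mutex_unlock(&list_lock);
    free(a);
}

static void make_key(void)
{
    if (pthread_key_create(&activation_key, release_activation) != 0)
        abort();
}

/* Return this thread's block, creating it on first use with calloc so
   that it also works on threads the runtime did not create. */
activation_t *get_activation(void)
{
    pthread_once(&key_once, make_key);
    activation_t *a = pthread_getspecific(activation_key);
    if (a == NULL) {
        a = calloc(1, sizeof *a);
        if (a == NULL || pthread_setspecific(activation_key, a) != 0)
            abort();
        a->handle = pthread_self();
        pthread_mutex_lock(&list_lock);
        if (all_threads == NULL) {
            a->next = a->prev = a;
            all_threads = a;
        } else {
            a->next = all_threads;
            a->prev = all_threads->prev;
            a->prev->next = a;
            all_threads->prev = a;
        }
        pthread_mutex_unlock(&list_lock);
    }
    return a;
}

void push_frame(frame_t *f)
{
    activation_t *a = get_activation();
    f->next = a->handlers;
    a->handlers = f;
}

/* A push must already have happened on this thread, so pop skips the
   on-demand path and only does the getspecific plus one indirection. */
void pop_frame(frame_t *f)
{
    activation_t *a = pthread_getspecific(activation_key);
    assert(a != NULL && a->handlers == f);
    a->handlers = f->next;
}
```

In this sketch a push costs one getspecific plus the NULL check, and a
pop costs one getspecific and a pointer dereference, which matches the
cost estimate given earlier in the thread.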

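For comparison, the "__thread" alternative raised above can be sketched
as follows. This assumes a GCC- or Clang-compatible compiler (C11 spells
the keyword _Thread_local); the names handler_top and
thread_local_is_per_thread are hypothetical. It only demonstrates that
each thread gets its own copy of the variable, and says nothing about the
dlopen/LoadLibrary question.

```c
#include <pthread.h>
#include <stddef.h>

/* One pointer per thread, no pthread_getspecific call needed.
   __thread is a GCC/Clang extension. */
static __thread void *handler_top;

/* Thread body: return what this thread saw on entry, then overwrite its
   own copy to show the write stays local. */
static void *observe_fresh_copy(void *unused)
{
    static int worker_marker;
    void *seen_on_entry = handler_top;
    (void)unused;
    handler_top = &worker_marker;
    return seen_on_entry;
}

/* Returns 1 if a second thread started with its own zeroed copy and did
   not disturb the caller's copy, 0 if not, -1 if thread setup failed. */
int thread_local_is_per_thread(void)
{
    static int main_marker;
    pthread_t t;
    void *seen = &main_marker;

    handler_top = &main_marker;
    if (pthread_create(&t, NULL, observe_fresh_copy, NULL) != 0)
        return -1;
    if (pthread_join(t, &seen) != 0)
        return -1;
    return seen == NULL && handler_top == (void *)&main_marker;
}
```
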