[M3devel] user threads

Mika Nystrom mika at async.caltech.edu
Wed Apr 29 20:46:03 CEST 2009


Tony, no, of course it's not surprising.  -unsafe removes the
lock on every update (in the global environment; pushed environments
are always "unsafe").  In fact, since there are hardly any updates
in the global environment, it's the locking on reads that's hurting.
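
To make the cost concrete, here is a minimal sketch in Python (not the
actual Modula-3 interpreter code; the Env class and its method names are
made up for illustration) of an environment that takes a lock on every
access versus one that does not.  With hardly any updates, nearly all of
the lock traffic in the "safe" variant comes from lookup.

```python
import threading


class Env:
    """Toy variable environment (hypothetical, for illustration only).

    With safe=True every access takes a lock; with safe=False
    (the analogue of -unsafe) no lock is taken at all.
    """

    def __init__(self, safe=True):
        self._vars = {}
        self._lock = threading.Lock() if safe else None

    def define(self, name, value):
        # Updates are rare in the global environment.
        if self._lock is None:
            self._vars[name] = value
        else:
            with self._lock:
                self._vars[name] = value

    def lookup(self, name):
        # Reads are frequent, so this is where the locking cost piles up.
        if self._lock is None:
            return self._vars[name]
        with self._lock:
            return self._vars[name]


safe_env = Env(safe=True)
unsafe_env = Env(safe=False)
for env in (safe_env, unsafe_env):
    env.define("x", 42)
```

Both variants return the same values; the only difference is the lock
acquire and release wrapped around each access.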

[Tony, I'm curious how you'd go about implementing the sort of
"non-blocking lock" you mention.]

But what Jay has pointed out is that what I labeled as "kernel
threads" in the attached table aren't kernel threads at all but
libc_r threads, which provide the same facilities (as far as I know)
as M3's "user threads".  In this case M3's user threads are simply
superior to what the system provides.  Much superior.
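
As a rough check on "much superior", the ratios can be worked out from
the table quoted below.  This is just a small Python sketch over the
samples as posted; the "safe"/"unsafe" keys are my own labels for the
rows without and with -unsafe, and the "kernel threads" rows are the
ones that were really libc_r.

```python
# Samples in milliseconds, copied from the table quoted below.
timings = {
    ("CM3 last week, kernel threads", "unsafe"): [1460, 1482, 1437],
    ("CM3 last week, kernel threads", "safe"): [2392, 2402, 2376],
    ("CM3 this week, kernel threads", "unsafe"): [1455, 1458, 1490],
    ("CM3 this week, user threads", "unsafe"): [914, 934, 914],
    ("CM3 this week, user threads", "safe"): [967, 965, 986],
    ("PM3", "unsafe"): [678, 657, 682],
    ("PM3", "safe"): [709, 714, 700],
}


def mean(samples):
    return sum(samples) / len(samples)


averages = {key: mean(samples) for key, samples in timings.items()}


def locking_overhead(config):
    """Ratio of locked to unlocked run time for one configuration."""
    return averages[(config, "safe")] / averages[(config, "unsafe")]


for config in ("CM3 last week, kernel threads",
               "CM3 this week, user threads",
               "PM3"):
    print(f"{config}: locked/unlocked = {locking_overhead(config):.2f}")

# How much faster M3 user threads are than the libc_r build, both -unsafe.
speedup = (averages[("CM3 this week, kernel threads", "unsafe")]
           / averages[("CM3 this week, user threads", "unsafe")])
print(f"user threads vs. libc_r (-unsafe): {speedup:.2f}x")
```

On these numbers, locking costs about 64% with the libc_r threads but
only about 5 to 6% with M3 user threads or PM3, and the user-thread
build is about 1.6 times as fast as the libc_r build even with -unsafe.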

I think if possible "M3 user threads" should definitely be the
default on FreeBSD4 and earlier, rather than "system user threads".

I think people who want to implement coroutines and occam-like
things with threads would also be happy if user threads were very
easy to enable.  I agree, though, that they should certainly not
normally be the default, and that it is probably unnecessary to be
able to enable them at runtime.  A compile- or link-time option would
be nice, though, rather than having to recompile the whole M3
distribution.

I know of two important classes of applications where user threads
are likely to be superior to kernel threads:

1. occam-like things, as described above.  Not uncommon in hardware
   circles!

2. applications that use RMI (e.g., Network Objects) to communicate
   between separate runtimes that are mapped exactly one runtime
   to each physical core.

For case 2, readers may be interested in the following report:

http://caltechcstr.library.caltech.edu/218/

     Mika

Tony Hosking writes:
>Mika, it is not surprising that your lock-on-every variable update  
>will cost a lot in any non-user-level threading scheme.  You should  
>consider using different mechanisms for this degree of locking in  
>Scheme (based on some of the non-blocking lock implementations for  
>Java perhaps).  I don't expect any implementation of locking for a  
>multi-core/-processor will ever perform as well as user-level threads.
>
>On 29 Apr 2009, at 16:53, Mika Nystrom wrote:
>
>> Ok, it works!
>>
>> Numbers:
>>
>> Timings in milliseconds, three samples, filesystem "warmed up" by
>> doing one dummy run before launching the real ones.
>>
>> -unsafe means that I use non-locking Scheme environments, otherwise
>> they lock for every variable update.
>> Configuration                            run1  run2  run3    ave
>> CM3 last week, kernel threads, -unsafe   1460  1482  1437   1460
>> CM3 last week, kernel threads,           2392  2402  2376   2390
>> CM3 this week, kernel threads, -unsafe   1455  1458  1490   1468 (*)
>> CM3 this week, user threads,   -unsafe    914   934   914    921
>> CM3 this week, user threads,              967   965   986    973
>> PM3                            -unsafe    678   657   682    672
>> PM3                                       709   714   700    708
>>
>> (*) not entirely sure this got linked correctly.
>>
>>    Mika
>>
>>
>> Jay writes:
>>>
>>> User threads seem to work on FreeBSD/x86 7.0.
>>> Mika can you report back the perf cm3 vs. pm3?
>>> Still, kernel threads have been around a long time and imho should  
>>> be strongly favored.
>>>
>>>
>>> Kernel threads should be a /little/ faster than they were --  
>>> PushEFrame removed from successful heap allocations. And should be  
>>> further improvable via __thread where it is supported -- probably  
>>> not FreeBSD 4.
>>> x, sometimes older is not better. :)
>>>
>>>
>>> I've temporarily switched FreeBSD/x86 to userthreads by default but  
>>> I think that's just an experiment and should be undone shortly,  
>>> maybe work out some other story for easily switching between them,  
>>> or just keep the existing story of "you get to rebuild everything".
>>>
>>>
>>> Tony, can you look into GetGCRatio? I removed the call to it. The  
>>> "fatal" pragma invokes PushEFrame apparently.
>>>
>>>
>>> We should now "fix" Win32 and pthreads to not have GetActivation  
>>> initialize on-demand, just leave Init to initialize always. This  
>>> should shave a few more cycles from PushEFrame.
>>>
>>>
>>> - Jay


