[M3devel] [M3commit] how to switch userthreads on/off

Wed Apr 29 20:33:01 CEST 2009

Mika,

With the current implementation of M3 MUTEX mapped 1-1 onto a pthread
mutex, you are bound to see significant overhead for any locking code,
even in single-threaded apps.  We need to move towards a thin-lock
implementation for mutex (as used in modern Java implementations) to
avoid overhead for uncontended locks.  It's not too hard to
implement.  The idea is to represent a mutex as a tagged word.  The
word contains either NIL, the thread holding the lock, or a pointer to
a full-blown (inflated) pthread mutex.  We can use GC and other
opportunities to deflate locks as needed.  Taking the lock by checking
and updating the tag requires a CAS.  There are other techniques that
further eliminate the CAS for the uncontended case.  But, generally,
you should consider LOCK to be a fairly high-overhead operation for now.
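
To make the tagged-word idea concrete, here is a minimal sketch in C11
with pthreads.  It is an illustration only, not the M3 runtime code: the
names (thin_mutex, thin_lock, thin_unlock, inflate), the use of the low
bit as the tag, and the caller-supplied nonzero thread id are all
assumptions.  A contender spins until it wins the thin lock and then
inflates it while holding it; deflation (by the GC or otherwise) is
left out.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Tagged word (hypothetical layout):
     0             unlocked, plays the role of NIL
     tid << 1      thin lock held by thread `tid` (tag bit clear)
     ptr | 1       pointer to an inflated pthread mutex (tag bit set) */
#define INFLATED ((uintptr_t)1)

typedef struct { _Atomic uintptr_t word; } thin_mutex;

static uintptr_t thin_owner(uintptr_t tid) {
    assert(tid != 0);              /* 0 is reserved for "unlocked" */
    return tid << 1;
}

/* Called only by the thread that currently holds the thin lock, so no
   other thread can change the word while we swap in the fat mutex. */
static void inflate(thin_mutex *m) {
    pthread_mutex_t *fat = malloc(sizeof *fat);
    if (fat == NULL) return;       /* stay thin; still correct */
    pthread_mutex_init(fat, NULL);
    pthread_mutex_lock(fat);       /* we keep ownership across the swap */
    atomic_store(&m->word, (uintptr_t)fat | INFLATED);
}

static void thin_lock(thin_mutex *m, uintptr_t tid) {
    int contended = 0;
    for (;;) {
        uintptr_t w = atomic_load(&m->word);
        if (w & INFLATED) {        /* fat path: block in pthreads */
            pthread_mutex_lock((pthread_mutex_t *)(w & ~INFLATED));
            return;
        }
        if (w == 0) {              /* fast path: one CAS, no pthread call */
            uintptr_t expected = 0;
            if (atomic_compare_exchange_strong(&m->word, &expected,
                                               thin_owner(tid))) {
                if (contended) inflate(m);  /* later waiters will block */
                return;
            }
            continue;              /* lost the race; look again */
        }
        contended = 1;             /* thin lock held by someone else */
        sched_yield();
    }
}

static void thin_unlock(thin_mutex *m) {
    uintptr_t w = atomic_load(&m->word);
    if (w & INFLATED)
        pthread_mutex_unlock((pthread_mutex_t *)(w & ~INFLATED));
    else
        atomic_store(&m->word, 0); /* thin release: a plain atomic store */
}
```

In this sketch an uncontended LOCK/UNLOCK pair costs one CAS and one
atomic store and never enters the pthreads library; only a lock that
has actually seen contention pays for a real pthread mutex.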

Antony Hosking | Associate Professor | Computer Science | Purdue  
University
305 N. University Street | West Lafayette | IN 47907 | USA
Office +1 765 494 6001 | Mobile +1 765 427 5484

On 29 Apr 2009, at 16:22, Mika Nystrom wrote:

> Jay writes:
> ...
>>
>> Maybe just leave it as an option in m3core's m3makefile and people  
>> can twiddle it if they want and rebuild the entire system like it  
>> is today?
>> That is a bit onerous, but maybe it's all userthreads deserve?
>> ?
>>
>>
>> Anyone who actually wanted to switch back and forth (Mika) would  
>> just have two installs and two source trees?
>>
>>
>> - Jay
>
> I just want to clarify.  I'm not really that interested in switching
> back and forth.  I'm just a little disturbed by the sometimes huge
> performance loss due to the introduction of kernel threads.  I knew
> that this would happen in certain highly multithreaded applications,
> but I'm surprised it happens in a more or less single-threaded
> application.
>
> I think I've just been spoiled by 10 years of using SRCM3 and PM3
> for FreeBSD w/o kernel threads in the sense that I've learned that
> using LOCK has essentially no cost.  On a shared-memory multiprocessor,
> I really don't expect that to remain the case... physics won't allow
> it.  So now I just have to go through my code and find all the places
> where I lock too much and remove them.
>
> But the memory allocator and garbage collector do it too, no?
>
> I also think that this idea of being able to use either is great.
> Mainly single-threaded programs should definitely not use kernel
> threads!
>
> As for reaching the "thread locals", there is one slightly crazy
> idea that one could borrow from Sussman and Steele: add another
> implicit argument to every Modula-3 routine.  In that argument,
> pass a pointer to the thread locals.  For EXTERNAL calls (in or
> out), make it NIL (somehow, maybe involving pragmas), and in that
> case (only), use the pthreads routines to access the thread locals.
> Ok so it sounds kind of nuts, but with this approach you could avoid
> locking or even calling into the pthreads libs almost entirely for
> a single-threaded program.  You could even have a thread-local
> memory allocator that would only lock when it needs to request
> memory from the "global allocator"...   in fact there are lots of
> things you can do with this sort of thing.  Dynamically scoped
> variables in Scheme (a la MacLisp?) is what they originally proposed
> it for but then they suggested all kinds of tricks related to
> continuations with it.
>
>    Mika
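
For reference, the implicit-argument idea quoted above might look
roughly like this if written out by hand in C.  This is only a sketch
under assumptions: thread_locals, tl_lookup and count_alloc are
hypothetical names, and in the real proposal the compiler would add the
extra parameter to every Modula-3 routine automatically.  A NULL
argument stands for the NIL passed across an EXTERNAL boundary, and
only in that case does the code fall back to the pthreads routines.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-thread state; a real runtime would keep allocator
   pools, handler stacks and so on here. */
typedef struct { long allocations; } thread_locals;

static pthread_key_t  tl_key;
static pthread_once_t tl_once = PTHREAD_ONCE_INIT;

static void tl_make_key(void) {
    pthread_key_create(&tl_key, free);   /* free the block at thread exit */
}

/* Slow path: find (or create) the block through the pthreads library. */
static thread_locals *tl_lookup(void) {
    pthread_once(&tl_once, tl_make_key);
    thread_locals *tl = pthread_getspecific(tl_key);
    if (tl == NULL) {
        tl = calloc(1, sizeof *tl);
        assert(tl != NULL);
        pthread_setspecific(tl_key, tl);
    }
    return tl;
}

/* Every routine carries the extra argument.  Calls between ordinary
   routines just pass it along; NULL means "entered from external code",
   and only then do we touch pthreads. */
static long count_alloc(thread_locals *tl) {
    if (tl == NULL) tl = tl_lookup();
    return ++tl->allocations;            /* thread-private: no lock needed */
}
```

A routine that already has the pointer passes it straight through, so
the pthread_getspecific call is paid only at the boundary with external
code.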
