[M3devel] pthread in cm3...

Sat May 19 22:00:36 CEST 2007

I just checked in a patch that is closer to your code, and that lets  
me run 3200 threads on LINUXLIBC6 (I can't do 3500 because thread  
creation fails for that many).
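
For anyone who wants to probe the same kind of ceiling outside the Modula-3 runtime, here is a minimal C sketch. It is not taken from the CM3 sources; `spawn_threads`, `worker`, and the gate variables are hypothetical names made up for illustration. It creates threads until `pthread_create` reports an error, keeps every created thread alive at the same time, and returns how many it managed to start. On Linux, error value 11 is EAGAIN, which `pthread_create` returns when it lacks the resources for another thread; whether that is the same "result code 11" mentioned in the quoted message below is only an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Shared gate: workers block here until the creating thread opens it,
   so all successfully created threads exist at the same time. */
static pthread_mutex_t gate_mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gate_cv = PTHREAD_COND_INITIALIZER;
static int gate_open = 0;

static void *worker(void *arg) {
    (void)arg;
    pthread_mutex_lock(&gate_mu);
    while (!gate_open)
        pthread_cond_wait(&gate_cv, &gate_mu);
    pthread_mutex_unlock(&gate_mu);
    return NULL;
}

/* Try to create `want` threads (hypothetical helper, not CM3 code).
   Returns the number actually created, or -1 if malloc fails. */
int spawn_threads(int want) {
    if (want <= 0)
        return 0;

    pthread_t *tids = malloc(sizeof *tids * (size_t)want);
    if (tids == NULL)
        return -1;

    /* Close the gate before starting, so repeated calls behave the same. */
    pthread_mutex_lock(&gate_mu);
    gate_open = 0;
    pthread_mutex_unlock(&gate_mu);

    int created = 0;
    for (; created < want; created++) {
        int rc = pthread_create(&tids[created], NULL, worker, NULL);
        if (rc != 0) {
            /* pthread_create returns the error number directly. */
            fprintf(stderr, "pthread_create failed at thread %d: %s%s\n",
                    created, strerror(rc),
                    rc == EAGAIN ? " (EAGAIN)" : "");
            break;
        }
    }

    /* Open the gate and let every worker finish. */
    pthread_mutex_lock(&gate_mu);
    gate_open = 1;
    pthread_cond_broadcast(&gate_cv);
    pthread_mutex_unlock(&gate_mu);

    for (int i = 0; i < created; i++) {
        int rc = pthread_join(tids[i], NULL);
        assert(rc == 0);
        (void)rc;
    }

    free(tids);
    return created;
}
```

Calling `spawn_threads(3500)` and comparing the result with 3500 shows whether the limit is hit on a given machine. The number depends on per-thread stack size, available address space, and process limits, so it will differ from one system to another.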

On Apr 25, 2007, at 12:35 PM, Dragiša Durić wrote:

> After implementing that workaround for result code 11 in that SIGSUSPEND
> loop, it gets stuck every time during the first or at most the second
> pass (of 4000). Not in the same place every time, though... I think there
> are race conditions in the LockHeap/SuspendAll logic.
>
> dd
>
> On Wed, 2007-04-25 at 09:51 -0400, Tony Hosking wrote:
>> Yes, good thinking.  Tuning the threads systems is a good plan all
>> round.
>>
>> On Apr 25, 2007, at 2:55 AM, Dragiša Durić wrote:
>>
>>> Just a random thought...
>>>
>>> I don't think my TestThreads is anything special, but it combines a few
>>> thread usage patterns... And I had a bright :) idea yesterday - it's
>>> also a decent benchmark for the whole threading system... I think it
>>> would be nice to test it with 10000 rounds, 4000 threads each (once cm3
>>> cvs-head is fixed), with PTHREAD and without PTHREAD. I will run the
>>> tests for mine.
>>>
>>> I think these extra data structures and global locks in PTHREAD are big
>>> efficiency killers. The benchmark will show.
>>>
>>> dd
>>>
>>> On Mon, 2007-04-23 at 08:40 -0400, Tony Hosking wrote:
>>>> I take all of your points seriously.  One option would be to offer
>>>> your threads implementation as another build option for CM3.  I'm
>>>> going to track down the bug I introduced recently and then we can
>>>> consider how to move forward.
>>>>
>>>> On Apr 23, 2007, at 4:21 AM, Dragiša Durić wrote:
>>>>
>>>>> On Sun, 2007-04-22 at 15:59 -0400, Tony Hosking wrote:
>>>>>> On Apr 22, 2007, at 10:47 AM, Dragiša Durić wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I have been skimming through the source of the PTHREAD code... And I
>>>>>>> think the job can be done without relying so much on
>>>>>>> how-they-did-it-before, especially with regard to the list of waiters
>>>>>>> and similar internal and global structures. Also, I see a number of
>>>>>>> global locks, and I am sure they generate congestion every now and
>>>>>>> then, especially in heavily threaded situations.
>>>>>>>
>>>>>>> Of course, there are a number of approaches to these multi-thread
>>>>>>> situations. Since mine is a very nonconservative use of threads, I
>>>>>>> think it is important to remain open to a possibly very large number
>>>>>>> of threads running in a single process - meaning scalability is one
>>>>>>> of the primary objectives... As global locks don't do well with
>>>>>>> scalability, I don't like "cm" and similar global congestion points.
>>>>>>
>>>>>> Yes, there are tensions in having a thin/absent veneer between language
>>>>>> threads and system threads.  Most important are issues of preserving
>>>>>> a reasonable memory model for programmers (see Hans Boehm's paper
>>>>>> http://portal.acm.org/citation.cfm?doid=1065010.1065042).
>>>>>
>>>>> I know that paper, and being in the Modula-3 camp, I am - by definition -
>>>>> in agreement with the no-library-approach-for-threads :).
>>>>>
>>>>>>  There are
>>>>>> also questions of portability and debuggability.
>>>>>
>>>>>   Of course. That's why I am using only POSIX-defined features, and
>>>>> when in doubt - the ones used by Boehm in his famous GC :).
>>>>>
>>>>>> I agree that global
>>>>>> locks are to be avoided where they cause known contention, but there
>>>>>> are tradeoffs there too.
>>>>>
>>>>>   A global lock is bad, whatever the reasons.
>>>>>
>>>>>> For large numbers of threads (as you appear
>>>>>> to need) I think we would need to adopt some other implementation
>>>>>> approach, possibly by multiplexing multiple lightweight user-level
>>>>>> threads on some smaller number of heavyweight system-level threads,
>>>>>> but then you run into scheduling and load-balancing problems.
>>>>>
>>>>>   I've argued this before... With O(1) process scheduling available
>>>>> in Linux for four years, and by now surely in everyone else's
>>>>> systems... It would be a bigger problem to maintain scheduling for
>>>>> special "BIG" cases inside our support libraries than to rely on the
>>>>> operating system. It's good that mainstream OS people recognized
>>>>> threads as a Need, and it is time now for us to accept that they did
>>>>> it well - AT LAST.
>>>>>
>>>>>   And, very important... I can't see what is heavyweight about a system
>>>>> which does 10,000 context switches per second for 1500 threads with 2%
>>>>> CPU load. And all this in 2004 on some 1.something GHz CPU. Threads
>>>>> WERE heavyweight four years ago, and they have not been for a long
>>>>> time. Even Windows has lightweight kernel-space threads :).
>>>>>
>>>>>>
>>>>>>> We talked about this at least once before, and I think I remember
>>>>>>> you insisted on more compatibility than can be read from SPwM3. Do
>>>>>>> you think the best idea would be to integrate my NPTL code into CM3
>>>>>>> for people to trash and test, and let everyone select what is best
>>>>>>> for their situation?
>>>>>>
>>>>>> What I wanted was a situation where programs would be able to run
>>>>>> with the same tools (e.g., showheap, showthread) under both user-level
>>>>>> and system threading.  This goal has been achieved with the
>>>>>> current pthread-based implementation.
>>>>>
>>>>>   It is good reasoning, and it's one of the reasons I did not suggest
>>>>> a replacement... I think my version is less bloated, and I know it's
>>>>> very good for the long and stable process uptime we all expect from
>>>>> Modula-3 programs. But also, implied compatibilities outside of SPwM3
>>>>> and direct demands from other parts of the runtime were not on my
>>>>> list. These I've respected, and it looks like these are good
>>>>> production-time criteria, as opposed to the excellent development-time
>>>>> criteria you based yours on.
>>>>>
>>>>>> Moreover, I wanted something
>>>>>> where variations in thread support from one system to another could
>>>>>> be exploited most easily (such as for systems where thread
>>>>>> suspend/resume is provided as a primitive).  Again, the current
>>>>>> implementation achieves this, and runs with minimal target-specific
>>>>>> code on Darwin, Solaris, and Linux.  Ports to other targets should be
>>>>>> relatively straightforward.
>>>>>
>>>>>   I've not ported mine outside of LINUXLIBC6, but as it's extremely
>>>>> POSIX, I see no problem.
>>>>>
>>>>>>
>>>>>>> The problems I had with my pthread implementation were all related to
>>>>>>> the VM hell of the earlier GC implementation... After you did that
>>>>>>> piece of art with the new approach to GC, I expect infinite uptimes
>>>>>>> from my servers and bots :). Big thanks for that!
>>>>>>
>>>>>> Any native threading implementation is going to have problems with
>>>>>> VM-based memory management.  I'm surprised you were able to get
>>>>>> anything going with the VM-based GC.
>>>>>
>>>>>   "Anything" is quite a lot - I have heavily multithreaded servers
>>>>> running literally for years... One of them has been up since January
>>>>> of 2004; it serves a few hundred connected users (and up to 1500
>>>>> threads) at almost every moment and breaks only when the system
>>>>> reboots :). All that with heavy integration of various C libraries.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> dd
>>>>>>> -- 
>>>>>>> Dragiša Durić <dragisha at m3w.org>
>>>>>>
>>>>> -- 
>>>>> Dragiša Durić <dragisha at m3w.org>
>>>>
>>> -- 
>>> Dragiša Durić <dragisha at m3w.org>
>>
> -- 
> Dragiša Durić <dragisha at m3w.org>