[M3devel] fork/cvsup

Jay K jay.krell at cornell.edu
Wed Mar 17 19:13:45 CET 2010


---

bad news:

It doesn't completely work. It works a bunch of times in a row, like 9, then hangs.

Restart manually. Works again. Around 9 times. Then hangs again.

That is on Linux/x86 and Solaris/sparc.

Doesn't work at all on Mac/amd64, just hangs.

 

---

sketch:

m3core uses pthread_atfork to selectively reinitialize

  Mainly to only have one thread.

 

 

common Thread.PThreadAtFork is provided for others to do the same

  It is deliberately in a portable interface.

 

 

Thread.ReforkThreadAfterProcessFork

  Is provided for users to restart threads from their child AtFork handler.

  This is used by the allocator/collector.


 

Thread.ForkProcessAndAllThreads()

  Is used by "lazy" clients who want to restart all their threads

  but didn't keep track of them. The runtime can do it for them.

 

 

This allows "fork + do work" folks to call ForkProcessAndAllThreads

or not, depending on whether they need their threads restarted.

The runtime takes care of its threads either way.
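
A rough C sketch of the "lazy client" idea behind ForkProcessAndAllThreads: the runtime remembers how each thread was started, so it can start them all again in the child. This is not the attached patch. The names registry, create_tracked, fork_process_and_all_threads, and restart_demo are hypothetical, and the demo joins its threads before forking only to keep the result deterministic.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_THREADS 8

/* Hypothetical registry; the real runtime tracks threads its own way. */
struct entry { void *(*start)(void *); void *arg; pthread_t tid; };
static struct entry registry[MAX_THREADS];
static int n_registered;

/* Start a thread and remember how to start it again. 0 on success. */
int create_tracked(void *(*start)(void *), void *arg)
{
    if (n_registered == MAX_THREADS) return -1;
    struct entry *e = &registry[n_registered];
    e->start = start;
    e->arg = arg;
    if (pthread_create(&e->tid, NULL, start, arg) != 0) return -1;
    n_registered++;
    return 0;
}

/* fork(), then recreate every tracked thread in the child. */
pid_t fork_process_and_all_threads(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        for (int i = 0; i < n_registered; i++) {
            struct entry *e = &registry[i];
            if (pthread_create(&e->tid, NULL, e->start, e->arg) != 0)
                _exit(127);
        }
    }
    return pid;
}

static void *mark(void *arg) { *(int *)arg = 1; return NULL; }

/* Returns how many restarted threads ran in the child, or -1 on error. */
int restart_demo(void)
{
    static int ran[2];
    n_registered = 0;
    for (int i = 0; i < 2; i++)
        if (create_tracked(mark, &ran[i]) != 0) return -1;
    for (int i = 0; i < 2; i++)
        pthread_join(registry[i].tid, NULL);   /* quiesce before forking */
    ran[0] = ran[1] = 0;

    pid_t pid = fork_process_and_all_threads();
    if (pid < 0) return -1;
    if (pid == 0) {
        for (int i = 0; i < 2; i++)
            pthread_join(registry[i].tid, NULL);
        _exit(ran[0] + ran[1]);                /* report to the parent */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A client that tracks its own threads would call plain fork() and restart only what it needs. A client that does not would call fork_process_and_all_threads() and let the registry do it.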

 

 

---

What I'd written up:

 

The attached typically works 9 times on Linux and Solaris
before the server hangs again.


 

No improvement on Darwin, just hangs.
Can't see much in debuggers for some reason.

 


There is extra allowance in the m3core change such
 that users of fork + do work (as opposed to fork + exec)
 may or may not call ForkAll, depending on if they
 feel a need for their own threads to be recreated,
 and if they've kept track of how to recreate them,
 or just rely on the runtime to know all the threads.

 


There are three runtime threads that are sometimes
created in the parent, and if so, recreated in the child.

background collector, foreground collector, weak ref thread

 

 

I'll try to poke at it some more.

 


I'm not sure what is the best way to suspend all threads.
I tried a few different ways.
  SuspendOthers
  LockHeap
  pthread_mutex_lock
  various combinations


 

It is deliberate that pthread specific code is in common/Thread.i3.
That way code can be portable, at least among the two Posix thread implementations.

 


 - Jay


 


From: hosking at cs.purdue.edu
Date: Wed, 17 Mar 2010 14:01:31 -0400
To: jay.krell at cornell.edu
CC: m3devel at elegosoft.com
Subject: Re: [M3devel] fork/cvsup

Can you sketch the approach you've taken?




On 17 Mar 2010, at 11:39, Jay K wrote:

I have something working on Solaris now.
More details after testing on Linux and Darwin.
 
 - Jay
 


From: jay.krell at cornell.edu
To: hosking at cs.purdue.edu
Date: Wed, 17 Mar 2010 14:01:15 +0000
CC: m3devel at elegosoft.com
Subject: Re: [M3devel] fork/cvsup

Exec what?
You'd have to change the code to carefully reach the same place.
 
 - Jay
 


Subject: Re: [M3devel] fork/cvsup
From: hosking at cs.purdue.edu
Date: Wed, 17 Mar 2010 09:28:14 -0400
CC: m3devel at elegosoft.com
To: jay.krell at cornell.edu




Why not just exec in the child?


On 17 Mar 2010, at 03:47, Jay K wrote:

http://developer.apple.com/mac/library/documentation/Darwin/Reference/ManPages/man2/fork.2.html
 
 
There are limits to what you can do in the child process.  To be totally safe you should restrict
     yourself to only executing async-signal safe operations until such time as one of the exec functions is
     called.  All APIs, including global data symbols, in any framework or library should be assumed to be
     unsafe after a fork() unless explicitly documented to be safe or async-signal safe.  If you need to use
     these frameworks in the child process, you must exec.  In this situation it is reasonable to exec
     yourself.

 
http://www.opengroup.org/onlinepubs/000095399/functions/fork.html
 
Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called. [THR]   Fork handlers may be established by means of the pthread_atfork() function in order to maintain application invariants across fork() calls.
 
 
I've run through a few theories so far.
Current thinking is related to what Tony said:
 use pthread_atfork: 
   in prepare, stopworld 
   in parent, resumeworld 
   You don't want the child to be mid-gc for example, on another thread. Or mid-anything.
   in child, reinitialize -- current thread is the only thread
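
The prepare/parent/child split above can be sketched in C roughly as follows. This is only an illustration, not the m3core change: world_lock, known_threads, and fork_demo are hypothetical names, and a single mutex stands in for whatever the runtime really does to stop and resume the world.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins for runtime state; not m3core's actual data. */
static pthread_mutex_t world_lock = PTHREAD_MUTEX_INITIALIZER;
static int known_threads = 1;
static int handlers_installed;

/* prepare: runs in the parent before fork(); stop the world. */
static void prepare(void) { pthread_mutex_lock(&world_lock); }

/* parent: runs in the parent after fork(); resume the world. */
static void parent(void) { pthread_mutex_unlock(&world_lock); }

/* child: runs in the child after fork(); only the forking thread exists. */
static void child(void)
{
    known_threads = 1;                  /* forget the parent's threads */
    pthread_mutex_unlock(&world_lock);  /* held by the forking thread */
}

/* Returns the thread count the child sees, or -1 on error. */
int fork_demo(void)
{
    if (!handlers_installed) {          /* register once; not thread-safe */
        if (pthread_atfork(prepare, parent, child) != 0) return -1;
        handlers_installed = 1;
    }
    pthread_mutex_lock(&world_lock);
    known_threads = 3;                  /* pretend two more threads exist */
    pthread_mutex_unlock(&world_lock);

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) _exit(known_threads); /* child reports what it sees */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    assert(known_threads == 3);         /* parent's view is unchanged */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because prepare holds the lock across fork(), no other thread can be in the middle of an update guarded by it when the child's copy of memory is taken.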
 
 
Also in the cvsup code, ShutDown should just call DoShutDown immediately.
I did that, without m3core changes, and it hits an error in the pthread code, signaling a nonexistent thread.
pthread_atfork/child should address that -- child shouldn't retain a record of all the threads in the parent.
 
 
I don't have a theory as to why user threads work.
 
 
I experimented with malloc vs. static alloc vs. sbrk vs. mmap(private) vs. mmap(shared).
I was expecting more cases to act like mmap(shared), but none did; only mmap(shared) itself.
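
A minimal C sketch of that kind of experiment, under the assumption that "act like mmap(shared)" means a write made by the child is visible to the parent. The function name visible_after_fork and the bitmask encoding are made up for illustration; static allocation and sbrk are left out for brevity.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Returns a bitmask of regions where the parent saw the child's write:
   bit 0 = malloc, bit 1 = mmap(MAP_PRIVATE), bit 2 = mmap(MAP_SHARED).
   Returns -1 on error. */
int visible_after_fork(void)
{
    int *heap = malloc(sizeof *heap);
    int *priv = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int *shared = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (heap == NULL || priv == MAP_FAILED || shared == MAP_FAILED)
        return -1;
    *heap = *priv = *shared = 0;

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {                     /* child writes to all three */
        *heap = *priv = *shared = 1;
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) < 0) return -1;

    /* Parent checks which writes crossed the process boundary. */
    int mask = (*heap ? 1 : 0) | (*priv ? 2 : 0) | (*shared ? 4 : 0);
    free(heap);
    munmap(priv, sizeof(int));
    munmap(shared, sizeof(int));
    return mask;
}
```

Only the MAP_SHARED bit should come back set, since the other two regions are private copies in the child.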
 
 
I experimented with having mutexes and condition variables be initialized up front instead of on-demand,
by changing cvsup to lock/unlock or broadcast immediately upon creating them,
on the theory that this might let them work across processes.
That didn't make a difference.
 
 
 - Jay


 		 	   		  
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: m3core_atfork.txt
URL: <http://m3lists.elegosoft.com/pipermail/m3devel/attachments/20100317/917f8b6a/attachment-0004.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cvsup_forkall.txt
URL: <http://m3lists.elegosoft.com/pipermail/m3devel/attachments/20100317/917f8b6a/attachment-0005.txt>

