[M3devel] cvsup

Tony Hosking hosking at cs.purdue.edu
Thu Mar 18 22:33:30 CET 2010


First off, AtFork probably does not belong in the Thread interface.  That is part of the language definition and we shouldn't monkey with it.

I guess I don't understand the specs of what you are proposing.

I can't say that I can immediately imagine your diff.

On 18 Mar 2010, at 16:55, Jay K wrote:

>   > Name should probably be something more like Thread.AtFork. 
>  
> 
> I disagree.
> It should be PThreadAtFork, because it only does anything when
> using pthreads. If you are on a non-pthread system (user threads or Win32),
> I expect, I think, the function won't have any affect.
> Granted, we could be using user threads on a system that has pthreads,
> so there is ambiguity, there is ambiguity either way.
>  
> 
> Does it mean, hey, thin wrapper over pthread_atfork, and I'll call it
> even if using user threads? Or does it mean, only call it if
> using pthreads?
>  
>  
>  > I wonder if we should only fork when the collector is turned off?
>  
> 
> I don't quite follow.
> You mean, anyone calling fork() must GC.Disable()?
> Or LockHeap()
> Or, take all the thread locks?
> That last point is part of what I'm saying, since the "prepare" callback
> would do that.
> 
>  
> pthread_atfork lets you establish preconditions for fork() "distributed"
> out around the code, including with access to internals.
> Without having to "coordinate" with the caller of fork().
>  
> 
>  > fork + exec is probably relatively safe already.
>  
> 
> Right, but this change "unnecessarily" alters it.
> The "prepare" callback would still run.
> You can't discern fork + noexec vs. fork + exec
> They start the same.
>  
>  
> I think we are converging on agreement.
> Let me do some more tuning (stylistic, diff reduction, not perf) and testing.
> Diffs later, maybe tonight, maybe in a week, depending on life.
>  
>  
>  > Perhaps a way to disable that -- no way in Posix, we could just
>  > provide a boolean to ourselves.
>  > Not sure I follow.
>  
> 
> I meant, something like, in m3core/libm3 where we have a fork+exec
> sequence, set a boolean somewhere to tell all of our "prepare" handlers
> not to waste their time doing anything. fork + exec need not
> run the prepare handlers.
> And I worry that all their lock taking might deadlock..but I suspect
> that would indicate a bug. ?
>  
> 
>  > recreate the threads
>  > That's tough to do portably!
> 
>  
> Easy with user threads -- automatically happens.
> Easy with pthreads -- systems with fork().
> Hard/impossible on Win32.
>  
> 
>  > Really we only need to be concerned with internal threads
>  > to the run-time, right?  It's up to apps to restart threads
>  > in the children as necessary.
>  
> 
> I can agree with that.
> It is kind of just a lazy app that says "hey, please recreate
> all my threads, I didn't keep track of them, so I don't
> know what to restart, I know you kept track of all of them,
> so please do it for me".
>  
>  
> You should be able to "imagine" what my diff looks like.
> And maybe ok it without seeing it?
>  
>  
> Anyway, I definitely want to see Darwin not hang first.
>  
>  
>  - Jay
> 
> 
> 
> 
> 
> 
>  
> Subject: Re: [M3devel] cvsup
> From: hosking at cs.purdue.edu
> Date: Thu, 18 Mar 2010 13:16:03 -0400
> CC: dragisha at m3w.org; m3devel at elegosoft.com
> To: jay.krell at cornell.edu
> 
> On 18 Mar 2010, at 11:45, Jay K wrote:
> 
> To repeat myself:
>  
>  >  and basically *any* library that uses pthreads is obligated to use
>  >  pthread_atfork, be it a C library or the Modula-3 library, etc..
> 
>  
> This is the crux of the matter.
> We are being "bad pthread citizens" by not using pthread_atfork.
> Granted, we are probably in large company.
>  
> 
> There is a smaller fix and a larger fix, but it seems very clear we
> should take one of them.
>  
>  
> The small fix is:
>  common/Thread.i3: add PThreadAtFork
> 
> Name should probably be something more like Thread.AtFork.
> 
>   It does nothing on posix/nt.
>   It calls pthread_atfork on pthread.
>  
>  
> In RTCollector.m3 give a child handler that stores FALSE in the
> three booleans that indicate the three threads have started.
> No prepare or parent handler I believe.
>  Though I should check for global locks there.
>   Probably the locks in ThreadPThread suffice?
> 
> I wonder if we should only fork when the collector is turned off?
> 
>  In ThreadPThread.m3 provide three handlers:
>   prepare: lock "everything", at least the 5 static locks.
>     I might try walking all the threads and locking them too.
>   parent: unlock everything
>   child: unlock everything and reinitialize the globals
>  
>  
> Note that the atfork handlers run in many more programs than cvsup.
> They would be used also with fork + exec.
> 
> fork + exec is probably relatively safe already.
> 
> Perhaps a way to disable that -- no way in Posix, we could just
> provide a boolean to ourselves.
> 
> Not sure I follow.
> 
>  The larger fix would be to provide a common/Thread.i3 function
> to call fork() *and* recreate all Modula-3 threads in the child process.
> That is what I sent out.
> 
> That's tough to do portably!
> 
>  In the second case, you also then need to allow that the caller
> may or may not use that function, so you still need atfork handlers,
> but they have to be slightly different, as the diffs I sent show.
>  
>  
> I believe strongly this is the way to go.
> It is how one is a "good pthread citizen".
> Now, granted, I don't think most libraries are.
> Witness the caveat about fork() in the Darwin manpage and
> I think even the Posix spec.
> But pthread_atfork is meant to solve this.
>  
>  
> I don't suggest a full audit and add atfork handlers everywhere.
> But m3core we can at least do, and that should satisfy cvsup.
> Should check libm3 too.
> 
> Really we only need to be concerned with internal threads to the run-time, right?  It's up to apps to restart threads in the children as necessary.
> 
>  Plus, as long as each module takes reponsibility for itself,
> it shouldn't be so hard.
> That is, I don't think the case is all that strong for providing the "forkall"
> function that is in the diffs I sent.
> 
> Yes, I think that is overkill.
> 
> I'll do more testing and send out diffs "later".
> It could be as long as a week, I have a bunch of things to do.
> Or maybe just a day. :)
>  
>  
> I tested a form of this fix + cvsup + user threads and it appears
> that cvsup can do one small thing and thereby work either way,
> without knowledge as to the way.
>  
>  
> That is, it can shutdown the dispatcher "directly", instead of queuing to it.
> I'm not even sure the dispatcher thread is really needed.
>  
>  
> As I said, I don't believe the application can handle this all itself.
> Any library that uses pthreads is obligated to play into the general
> mechanism. Generally the internals are not exposed for the application
> to do it.
>  
> Treating Modula-3 as a closed system in which nobody uses
> anything outside of or "below" m3core/libm3 I don't think is practical.
>  
>  
> "applications", arguably defines as "processes", are often composed
> of fairly disparate bodies of code that don't necessarily expose much
> to each other. For example, they might each create their own worker
> threads and not expose them.
> This is what pthread_atfork is for.
> Actually Tony, you gave me the lead here. :)
> 
> ;-)
> 
>  
>  
> There is a problem that m3posixthreads (user) and pthreads semantics
> diverage, and..I haven't been able to think of an way to fix that.
>  
>  
> Well, it is actually easy to make "forkall" the default and only behavior actually,
> on user and pthreads (but not NT).
>  
>  
> But I don't see a way to make "fork1" the behavior for user threads.
> You can associate a pid with each thread, and notice when the pid changes,
> but I don't know which one thread you would keep, and you wouldn't
> necessarily be able to register pthread_atfork handlers, unless maybe
> user threads were used only on platforms that actually have pthreads..
>  
>  
> And still there's no good way to it on NT I expect.
>  
>  
>  - Jay
>  
> From: jay.krell at cornell.edu
> To: dragisha at m3w.org
> Date: Thu, 18 Mar 2010 10:09:38 +0000
> CC: m3devel at elegosoft.com
> Subject: Re: [M3devel] cvsup
> 
> Dragisha, Really? The server? With the -C flag and/or -f?
>  
> Evidence is very very very very very good as to what is going on here.
>   It is all related to "fork1" vs. "forkall".
>   fork1 is the overwhelming usual behavior (Solaris has an option),
>   and basically *any* library that uses pthreads is obligated to use
>    pthread_atfork, be it a C library or the Modula-3 library, etc..
>  
>    The application cannot do things, unless
>    the library exposes its internal locks. The Posix spec for
>    pthread_atfork explains the problem.
>  
>   Consider C code that links in Modula-3 code.
>   C code is fairly free to use the fork + do work model. It isn't very common,
>    but it does exist, and people have gone to extra length to keep it viable.
>    bash and ccache I believe both use this model.
>  
>  
> Granted, if you had your own kernel thread implementation, it might
> have addressed this.
>  
>  
> I have a version now that has served over 1000 requests, similar to what I sent, but a bit reduced. The main change was I just called pthread_mutex_lock/unlock over the locks in ThreadPThread.m3, instead of using SuspendOthers. Haven't tested on Solaris/Darwin yet, just Linux. This version also doesn't expose the ForkProcessAndAllThreads functions.
> The client has to restart its threads, or in the cvsupd case, don't depend on the dispatcher thread to survive fork.
> I want to still try out the "bigger" change, but with the simpler lock/unlock.
> And test on Solaris and Darwin.
>  
>  
> I think for SuspendOthers to not work is not surprising.
> The threads can still hold a few locks even if suspended, I'm pretty sure.
> The point is to fork in a "controlled" fashion -- all locks held, and in
> the right order, so that both parent and child can release them.
>  
>  
> Taking "all" the locks in Prepare is more reliable.
>  
>  
> I do wonder if really "all" locks need to be taken.
> I only take the static ones declared in PThreadThread.c.
>  
>  - Jay
> 
>  
> > Subject: Re: [M3devel] cvsup
> > From: dragisha at m3w.org
> > To: jay.krell at cornell.edu
> > CC: hosking at cs.purdue.edu; m3devel at elegosoft.com
> > Date: Wed, 17 Mar 2010 13:29:43 +0100
> > 
> > Yes, I can. I am building it regularly, from version to version, at
> > least few times a year. And it also worked with my ancient kernel thread
> > implementation. Was one of stress tests.
> > 
> > On Tue, 2010-03-16 at 16:10 +0000, Jay K wrote:
> > > 
> > > (Can anyone claim to have seen cvsup server work with kernel threads?)
> > -- 
> > Dragiša Durić <dragisha at m3w.org>
> > 
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://m3lists.elegosoft.com/pipermail/m3devel/attachments/20100318/8931caae/attachment-0002.html>


More information about the M3devel mailing list