[M3devel] cvsup

Jay K jay.krell at cornell.edu
Thu Mar 18 16:45:11 CET 2010


To repeat myself:

 

 >  and basically *any* library that uses pthreads is obligated to use
 >  pthread_atfork, be it a C library or the Modula-3 library, etc..


 

This is the crux of the matter.

We are being "bad pthread citizens" by not using pthread_atfork.
Granted, we are probably in large company.

 


There is a smaller fix and a larger fix, but it seems very clear we

should take one of them.

 

 

The small fix is:

 common/Thread.i3: add PThreadAtFork

  It does nothing on posix/nt.

  It calls pthread_atfork on pthread.

 

 

In RTCollector.m3 give a child handler that stores FALSE in the

three booleans that indicate the three threads have started.

No prepare or parent handler I believe.

 Though I should check for global locks there.

  Probably the locks in ThreadPThread suffice?

 

 

In ThreadPThread.m3 provide three handlers:

  prepare: lock "everything", at least the 5 static locks.

    I might try walking all the threads and locking them too.

  parent: unlock everything

  child: unlock everything and reinitialize the globals

 

 

Note that the atfork handlers run in many more programs than cvsup.

They would be used also with fork + exec.

Perhaps a way to disable that -- no way in Posix, we could just

provide a boolean to ourselves.

 

 

The larger fix would be to provide a common/Thread.i3 function

to call fork() *and* recreate all Modula-3 threads in the child process.

That is what I sent out.

 

 

In the second case, you also then need to allow that the caller

may or may not use that function, so you still need atfork handlers,

but they have to be slightly different, as the diffs I sent show.

 

 

I believe strongly this is the way to go.

It is how one is a "good pthread citizen".

Now, granted, I don't think most libraries are.

Witness the caveat about fork() in the Darwin manpage and

I think even the Posix spec.

But pthread_atfork is meant to solve this.

 

 

I don't suggest a full audit and add atfork handlers everywhere.

But m3core we can at least do, and that should satisfy cvsup.

Should check libm3 too.

 

 

Plus, as long as each module takes reponsibility for itself,

it shouldn't be so hard.

That is, I don't think the case is all that strong for providing the "forkall"

function that is in the diffs I sent.

 

 

I'll do more testing and send out diffs "later".

It could be as long as a week, I have a bunch of things to do.

Or maybe just a day. :)

 

 

I tested a form of this fix + cvsup + user threads and it appears

that cvsup can do one small thing and thereby work either way,

without knowledge as to the way.

 

 

That is, it can shutdown the dispatcher "directly", instead of queuing to it.

I'm not even sure the dispatcher thread is really needed.

 

 

As I said, I don't believe the application can handle this all itself.

Any library that uses pthreads is obligated to play into the general

mechanism. Generally the internals are not exposed for the application

to do it.

 

Treating Modula-3 as a closed system in which nobody uses

anything outside of or "below" m3core/libm3 I don't think is practical.

 

 

"applications", arguably defines as "processes", are often composed

of fairly disparate bodies of code that don't necessarily expose much

to each other. For example, they might each create their own worker

threads and not expose them.

This is what pthread_atfork is for.

Actually Tony, you gave me the lead here. :)

 

 

There is a problem that m3posixthreads (user) and pthreads semantics

diverage, and..I haven't been able to think of an way to fix that.

 

 

Well, it is actually easy to make "forkall" the default and only behavior actually,

on user and pthreads (but not NT).

 

 

But I don't see a way to make "fork1" the behavior for user threads.

You can associate a pid with each thread, and notice when the pid changes,

but I don't know which one thread you would keep, and you wouldn't

necessarily be able to register pthread_atfork handlers, unless maybe

user threads were used only on platforms that actually have pthreads..

 

 

And still there's no good way to it on NT I expect.

 

 

 - Jay
 


From: jay.krell at cornell.edu
To: dragisha at m3w.org
Date: Thu, 18 Mar 2010 10:09:38 +0000
CC: m3devel at elegosoft.com
Subject: Re: [M3devel] cvsup



Dragisha, Really? The server? With the -C flag and/or -f?
 
Evidence is very very very very very good as to what is going on here.
  It is all related to "fork1" vs. "forkall".
  fork1 is the overwhelming usual behavior (Solaris has an option),
  and basically *any* library that uses pthreads is obligated to use
   pthread_atfork, be it a C library or the Modula-3 library, etc..
 
   The application cannot do things, unless
   the library exposes its internal locks. The Posix spec for
   pthread_atfork explains the problem.
 
  Consider C code that links in Modula-3 code.
  C code is fairly free to use the fork + do work model. It isn't very common,
   but it does exist, and people have gone to extra length to keep it viable.
   bash and ccache I believe both use this model.
 
 
Granted, if you had your own kernel thread implementation, it might
have addressed this.
 
 
I have a version now that has served over 1000 requests, similar to what I sent, but a bit reduced. The main change was I just called pthread_mutex_lock/unlock over the locks in ThreadPThread.m3, instead of using SuspendOthers. Haven't tested on Solaris/Darwin yet, just Linux. This version also doesn't expose the ForkProcessAndAllThreads functions.
The client has to restart its threads, or in the cvsupd case, don't depend on the dispatcher thread to survive fork.
I want to still try out the "bigger" change, but with the simpler lock/unlock.
And test on Solaris and Darwin.
 
 
I think for SuspendOthers to not work is not surprising.
The threads can still hold a few locks even if suspended, I'm pretty sure.
The point is to fork in a "controlled" fashion -- all locks held, and in
the right order, so that both parent and child can release them.
 
 
Taking "all" the locks in Prepare is more reliable.
 
 
I do wonder if really "all" locks need to be taken.
I only take the static ones declared in PThreadThread.c.
 
 - Jay

 
> Subject: Re: [M3devel] cvsup
> From: dragisha at m3w.org
> To: jay.krell at cornell.edu
> CC: hosking at cs.purdue.edu; m3devel at elegosoft.com
> Date: Wed, 17 Mar 2010 13:29:43 +0100
> 
> Yes, I can. I am building it regularly, from version to version, at
> least few times a year. And it also worked with my ancient kernel thread
> implementation. Was one of stress tests.
> 
> On Tue, 2010-03-16 at 16:10 +0000, Jay K wrote:
> > 
> > (Can anyone claim to have seen cvsup server work with kernel threads?)
> -- 
> Dragiša Durić <dragisha at m3w.org>
> 
 		 	   		  
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://m3lists.elegosoft.com/pipermail/m3devel/attachments/20100318/df5f00a8/attachment-0002.html>


More information about the M3devel mailing list