[M3devel] SIGCHLD diff

Mika Nystrom mika at async.caltech.edu
Sun Feb 13 23:08:04 CET 2011


Hi Tony,

I found a problem with the change.  I would appreciate it if you and
Jay could think about what I found and my solution (and anybody else
is welcome to as well, of course).

When fiddling with cm3 to get it to run in parallel, I pushed it up to 
fork up to 100 processes at once, just to see what happens.  When I did
that (with my SIGCHLD patch) I got a few errors such as illegal instruction
and a bus error or two.  I believe what was happening was that I was
taking too many signals at once, because of the following code in 
ThreadPosix.m3:

PROCEDURE switch_thread (sig: int) RAISES {Alerted} =
  BEGIN
    allow_sigvtalrm ();  (* here *)

    INC(inCritical);
    (* mark signal as being delivered *)
    IF sig >= 0 AND sig < MaxSigs THEN
      gotSigs := gotSigs + SET OF Sig { sig }
    END;
    DEC(inCritical);

    IF inCritical = 0 AND heapState.inCritical = 0 THEN
      InternalYield ()
    END;
  END switch_thread;

I found it a bit odd that the first action of the signal handler is to 
enable signals.  With VTALRM this is innocuous, but in my case, imagine
what happens if you have a lot of exited child processes queued up.
What happens is that your signal stack gets wound deeper and deeper, and
each time you catch one SIGCHLD you ask for another.  Eventually you
run out of stack space.

My change, which appears to fix this problem, is to split up the "allow"
function as follows:

PROCEDURE switch_thread (sig: int) RAISES {Alerted} =
  BEGIN
    allow_sigvtalrm ();                               (* VTALRM only *)

    INC(inCritical);
    IF inCritical = 1 THEN allow_othersigs ();  END;  (* SIGCHLD *)

    (* mark signal as being delivered *)
    IF sig >= 0 AND sig < MaxSigs THEN
      gotSigs := gotSigs + SET OF Sig { sig }
    END;
    DEC(inCritical);

    IF inCritical = 0 AND heapState.inCritical = 0 THEN
      InternalYield ()
    END;
  END switch_thread;

This way, SIGCHLD is not taken in nested signal handlers.  But is there a
problem here?  Well, the fallback is that it takes 0.1 seconds to wake
up on Process.Wait .

Is there a chance that SIGCHLD signals are left pending?  My argument is
no, since in the "base" environment (i.e., not executing in a signal handler)
SIGCHLD is enabled as a postcondition of allow_othersigs.

All *other* occurrences of allow_sigvtalrm are of course replaced with 

allow_sigvtalrm(); allow_othersigs();

The goal is that SIGCHLD be enabled precisely when SIGVTALRM is except inside
the signal handler itself.

     Mika


Tony Hosking writes:
>Sorry, I missed you e-mail describing this change.  Looks good.
>
>Antony Hosking | Associate Professor | Computer Science | Purdue =
>University
>305 N. University Street | West Lafayette | IN 47907 | USA
>Office +1 765 494 6001 | Mobile +1 765 427 5484
>
>
>
>
>On Feb 13, 2011, at 3:37 PM, Mika Nystrom wrote:
>
>> Tony,
>>=20
>> The old user threads has an old problem with Process.Wait.  It has a=20=
>
>> Thread.Pause(0.1d0) in it.  This causes the compiler to run three =
>times
>> slower than it should.  The m3build I've been using for years has
>> that Thread.Pause removed (so it uses 100% CPU instead, =
>busy-waiting...)
>>=20
>> But yeah user threads works the same as it used to (except for=20
>> now calling pthread_atfork, which I'll return to in another email...)
>>=20
>>     Mika
>>=20
>> Tony Hosking writes:
>>> Hi Mika, Why was this change needed (I haven't looked closely)?  I =
>don't =3D
>>> think we diverged much from previous implementations of user threads, =
>=3D
>>> which presumably used to work for you.
>>>=20
>>> On Feb 12, 2011, at 3:44 PM, Mika Nystrom wrote:
>>>=20
>>>> =3D20
>>>> Index: ThreadPosix.i3
>>>> =
>=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3=
>D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D
>>> =
>=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3=
>D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D
>>> =3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D=
>3D=3D3D=3D3D=3D3D
>>>> RCS file: =3D
>>> /usr/cvs/cm3/m3-libs/m3core/src/thread/POSIX/ThreadPosix.i3,v
>>>> retrieving revision 1.16
>>>> diff -c -r1.16 ThreadPosix.i3
>>>> *** ThreadPosix.i3	14 Apr 2010 09:53:34 -0000	1.16
>>>> --- ThreadPosix.i3	12 Feb 2011 20:43:26 -0000
>>>> ***************
>>>> *** 39,42 ****
>>>> --- 39,45 ----
>>>> =3D20
>>>> =3D
>>> =
>(*------------------------------------------------------------------------=
>=3D
>>> ---*)
>>>> =3D20
>>>> + <*EXTERNAL ThreadPosix__value_of_SIGCHLD*>
>>>> + PROCEDURE ValueOfSIGCHLD(): int;
>>>> +=3D20
>>>> END ThreadPosix.
>>>> Index: ThreadPosix.m3
>>>> =
>=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3=
>D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D
>>> =
>=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3=
>D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D
>>> =3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D=
>3D=3D3D=3D3D=3D3D
>>>> RCS file: =3D
>>> /usr/cvs/cm3/m3-libs/m3core/src/thread/POSIX/ThreadPosix.m3,v
>>>> retrieving revision 1.73
>>>> diff -c -r1.73 ThreadPosix.m3
>>>> *** ThreadPosix.m3	28 Dec 2010 10:13:46 -0000	1.73
>>>> --- ThreadPosix.m3	12 Feb 2011 20:43:26 -0000
>>>> ***************
>>>> *** 97,102 ****
>>>> --- 97,105 ----
>>>>     (* if state =3D3D pausing, the time at which we can restart *)
>>>>     waitingForTime: Time.T;
>>>> =3D20
>>>> +     (* if state =3D3D pausing, the signal that truncates the pause =
>*)
>>>> +     waitingForSig: int :=3D3D -1;
>>>> +=3D20
>>>>     (* true if we are waiting during an AlertWait or AlertJoin=3D20
>>>>        or AlertPause *)
>>>>     alertable: BOOLEAN :=3D3D FALSE;
>>>> ***************
>>>> *** 147,152 ****
>>>> --- 150,160 ----
>>>> =3D20
>>>>   defaultStackSize :=3D3D 3000;
>>>> =3D20
>>>> +   (* note that even though the "heavy machinery" is only used =
>for=3D20
>>>> +      multipleThreads, we still need to set up the signal handler =
>so
>>>> +      that we can catch signals from other sources than thread =3D
>>> switching.
>>>> +      e.g., we use SIGCHLD to speed up Process.Wait *)
>>>> +=3D20
>>>> VAR
>>>>   stats: RECORD
>>>>            n_forks :=3D3D 0;
>>>> ***************
>>>> *** 511,527 ****
>>>>     XPause(until, FALSE);
>>>>   END Pause;
>>>> =3D20
>>>> PROCEDURE AlertPause(n: LONGREAL) RAISES {Alerted}=3D3D
>>>>   VAR until :=3D3D n + Time.Now ();
>>>>   BEGIN
>>>>     XPause(until, TRUE);
>>>>   END AlertPause;
>>>> =3D20
>>>> ! PROCEDURE XPause (READONLY until: Time.T; alertable :=3D3D FALSE) =
>=3D
>>> RAISES {Alerted} =3D3D
>>>>   BEGIN
>>>>     INC (inCritical);
>>>>       self.waitingForTime :=3D3D until;
>>>>       self.alertable :=3D3D alertable;
>>>>       ICannotRun (State.pausing);
>>>>     DEC (inCritical);
>>>>     InternalYield ();
>>>> --- 519,546 ----
>>>>     XPause(until, FALSE);
>>>>   END Pause;
>>>> =3D20
>>>> + PROCEDURE SigPause(n: LONGREAL; sig: int)=3D3D
>>>> +   <*FATAL Alerted*>
>>>> +   VAR until :=3D3D n + Time.Now ();
>>>> +   BEGIN
>>>> +     XPause(until, FALSE, sig);
>>>> +   END SigPause;
>>>> +=3D20
>>>> PROCEDURE AlertPause(n: LONGREAL) RAISES {Alerted}=3D3D
>>>>   VAR until :=3D3D n + Time.Now ();
>>>>   BEGIN
>>>>     XPause(until, TRUE);
>>>>   END AlertPause;
>>>> =3D20
>>>> ! PROCEDURE XPause (READONLY until: Time.T; alertable :=3D3D FALSE; =
>=3D
>>> sig:int :=3D3D -1)=3D20
>>>> !   RAISES {Alerted} =3D3D
>>>>   BEGIN
>>>>     INC (inCritical);
>>>>       self.waitingForTime :=3D3D until;
>>>>       self.alertable :=3D3D alertable;
>>>> +       IF sig # -1 THEN
>>>> +         self.waitingForSig :=3D3D sig
>>>> +       END;
>>>>       ICannotRun (State.pausing);
>>>>     DEC (inCritical);
>>>>     InternalYield ();
>>>> ***************
>>>> *** 703,712 ****
>>>>     END;
>>>>   END StartSwitching;
>>>> =3D20
>>>> ! PROCEDURE switch_thread (<*UNUSED*> sig: int) RAISES {Alerted} =3D3D=
>
>>>>   BEGIN
>>>>     allow_sigvtalrm ();
>>>> !     IF inCritical =3D3D 0 AND heapState.inCritical =3D3D 0 THEN =3D
>>> InternalYield () END;
>>>>   END switch_thread;
>>>> =3D20
>>>> (*------------------------------------------------------------- =3D
>>> scheduler ---*)
>>>> --- 722,750 ----
>>>>     END;
>>>>   END StartSwitching;
>>>> =3D20
>>>> ! CONST MaxSigs =3D3D 64;
>>>> ! TYPE Sig =3D3D [ 0..MaxSigs-1 ];
>>>> !=3D20
>>>> ! (* in order to listen to other signals, they have to be enabled in
>>>> !    allow_sigvtalrm as well *)
>>>> ! VAR (*CONST*) SIGCHLD :=3D3D ValueOfSIGCHLD();
>>>> !              =3D20
>>>> !     gotSigs :=3D3D SET OF Sig { };
>>>> !=3D20
>>>> ! PROCEDURE switch_thread (sig: int) RAISES {Alerted} =3D3D
>>>>   BEGIN
>>>>     allow_sigvtalrm ();
>>>> !=3D20
>>>> !     INC(inCritical);
>>>> !     (* mark signal as being delivered *)
>>>> !     IF sig >=3D3D 0 AND sig < MaxSigs THEN
>>>> !       gotSigs :=3D3D gotSigs + SET OF Sig { sig }
>>>> !     END;
>>>> !     DEC(inCritical);
>>>> !=3D20
>>>> !     IF inCritical =3D3D 0 AND heapState.inCritical =3D3D 0 THEN=3D20=
>
>>>> !       InternalYield ()=3D20
>>>> !     END;
>>>>   END switch_thread;
>>>> =3D20
>>>> (*------------------------------------------------------------- =3D
>>> scheduler ---*)
>>>> ***************
>>>> *** 782,792 ****
>>>>             IF t.alertable AND t.alertPending THEN
>>>>               CanRun (t);
>>>>               EXIT;
>>>> =3D20
>>>>             ELSIF t.waitingForTime <=3D3D now THEN
>>>>               CanRun (t);
>>>>               EXIT;
>>>> !  =3D20
>>>>             ELSIF NOT somePausing THEN
>>>>               earliest :=3D3D t.waitingForTime;
>>>>               somePausing :=3D3D TRUE;
>>>> --- 820,835 ----
>>>>             IF t.alertable AND t.alertPending THEN
>>>>               CanRun (t);
>>>>               EXIT;
>>>> +              =3D20
>>>> +             ELSIF t.waitingForSig IN gotSigs THEN
>>>> +               t.waitingForSig :=3D3D -1;
>>>> +               CanRun(t);
>>>> +               EXIT;
>>>> =3D20
>>>>             ELSIF t.waitingForTime <=3D3D now THEN
>>>>               CanRun (t);
>>>>               EXIT;
>>>> !=3D20
>>>>             ELSIF NOT somePausing THEN
>>>>               earliest :=3D3D t.waitingForTime;
>>>>               somePausing :=3D3D TRUE;
>>>> ***************
>>>> *** 886,891 ****
>>>> --- 929,936 ----
>>>>       END;
>>>>     END;
>>>> =3D20
>>>> +     gotSigs :=3D3D SET OF Sig {};
>>>> +=3D20
>>>>     IF t.state =3D3D State.alive AND (scanned OR NOT someBlocking) =
>THEN
>>>>       IF perfOn THEN PerfRunning (t.id); END;
>>>>       (* At least one thread wants to run; transfer to it *)
>>>> ***************
>>>> *** 948,960 ****
>>>> =3D20
>>>> PROCEDURE WaitProcess (pid: int; VAR status: int): int =3D3D
>>>>   (* ThreadPThread.m3 and ThreadPosix.m3 are very similar. *)
>>>> !   CONST Delay =3D3D 0.1D0;
>>>>   BEGIN
>>>>     LOOP
>>>>       WITH r =3D3D Uexec.waitpid(pid, ADR(status), Uexec.WNOHANG) DO
>>>>         IF r # 0 THEN RETURN r END;
>>>>       END;
>>>> !       Pause(Delay);
>>>>     END;
>>>>   END WaitProcess;
>>>> =3D20
>>>> --- 993,1005 ----
>>>> =3D20
>>>> PROCEDURE WaitProcess (pid: int; VAR status: int): int =3D3D
>>>>   (* ThreadPThread.m3 and ThreadPosix.m3 are very similar. *)
>>>> !   CONST Delay =3D3D 10.0D0;
>>>>   BEGIN
>>>>     LOOP
>>>>       WITH r =3D3D Uexec.waitpid(pid, ADR(status), Uexec.WNOHANG) DO
>>>>         IF r # 0 THEN RETURN r END;
>>>>       END;
>>>> !       SigPause(Delay,SIGCHLD);
>>>>     END;
>>>>   END WaitProcess;
>>>> =3D20
>>>> ***************
>>>> *** 1361,1364 ****
>>>> --- 1406,1412 ----
>>>> VAR debug :=3D3D RTParams.IsPresent ("debugthreads");
>>>> =3D20
>>>> BEGIN
>>>> +   (* we need to call set up the signal handler for other reasons =3D=
>
>>> than
>>>> +      just thread switching now *)
>>>> +   setup_sigvtalrm (switch_thread);
>>>> END ThreadPosix.
>>>> Index: ThreadPosixC.c
>>>> =
>=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3=
>D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D
>>> =
>=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3=
>D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D
>>> =3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D=
>3D=3D3D=3D3D=3D3D
>>>> RCS file: =3D
>>> /usr/cvs/cm3/m3-libs/m3core/src/thread/POSIX/ThreadPosixC.c,v
>>>> retrieving revision 1.59
>>>> diff -c -r1.59 ThreadPosixC.c
>>>> *** ThreadPosixC.c	12 Feb 2011 00:56:32 -0000	1.59
>>>> --- ThreadPosixC.c	12 Feb 2011 20:43:26 -0000
>>>> ***************
>>>> *** 78,88 ****
>>>> --- 78,97 ----
>>>> =3D20
>>>>   sigemptyset(&ThreadSwitchSignal);
>>>>   sigaddset(&ThreadSwitchSignal, SIG_TIMESLICE);
>>>> +   sigaddset(&ThreadSwitchSignal, SIGCHLD);
>>>> =3D20
>>>>   act.sa_handler =3D3D handler;
>>>>   act.sa_flags =3D3D SA_RESTART;
>>>>   sigemptyset(&(act.sa_mask));
>>>>   if (sigaction (SIG_TIMESLICE, &act, NULL)) abort();
>>>> +   if (sigaction (SIGCHLD, &act, NULL)) abort();
>>>> + }
>>>> +=3D20
>>>> + int
>>>> + __cdecl
>>>> + ThreadPosix__value_of_SIGCHLD(void)
>>>> + {
>>>> + 	return SIGCHLD;
>>>> }
>>>> =3D20
>>>> void



More information about the M3devel mailing list