[M3devel] Performance issues with Process.Wait under userthreads
Mika Nystrom
mika at async.caltech.edu
Sun Feb 13 07:29:55 CET 2011
Jay K writes:
>I don't believe there is a good solution here, other than using pthreads.
>Is there? Maybe tune the wait down smaller?
>I understand it stinks either way.
Well as I keep saying, I'm looking for a reliable runtime. I really
need this---and soon. I can't keep futzing around with PM3 forever.
I'd hope pthreads is it but for now the only thing that passes my tests
is user threading.
We (group at Caltech) determined many years ago that tuning the wait does
nothing, unless you set it to zero (busy waiting). Possibly this has
to do with the time quantum of the user threads scheduler, I don't know.
>
>I don't understand the SIGCHLD stuff, at a quick glance.
The idea is this.
We have SIGVTALRM coming in already to do thread switching. So I
am enabling SIGCHLD and then taking SIGCHLD the way SIGVTALRM is
taken, that is, in the scheduler. The other change is that a paused
thread can wait for a condition such as "either after X delay OR
after Y signal" and then we allow the paused thread to become
runnable on the Y signal rather than after the X delay.
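
To make that wake-up rule concrete, here is a minimal C model of it. It is
only a sketch: the struct, the function names and the bitmask representation
are illustrative assumptions, not code from ThreadPosix.m3, where the real
logic lives in the Modula-3 scheduler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SIGS  64    /* same bound as MaxSigs in the patch */
#define NO_SIGNAL (-1)  /* "not waiting for any signal" */

/* Hypothetical stand-in for the pause-related fields of a thread record. */
struct paused_thread {
    double waiting_for_time; /* absolute time at which the pause expires */
    int    waiting_for_sig;  /* signal that may cut the pause short, or NO_SIGNAL */
};

/* Record that signal `sig` arrived; out-of-range numbers are ignored. */
static uint64_t mark_signal(uint64_t got_sigs, int sig)
{
    if (sig >= 0 && sig < MAX_SIGS)
        got_sigs |= (uint64_t)1 << sig;
    return got_sigs;
}

/* True if the paused thread may run again: either the signal it named has
   been recorded (the "Y signal" case) or its deadline has passed (the
   "X delay" case). */
static bool can_wake(const struct paused_thread *t, uint64_t got_sigs, double now)
{
    assert(t != NULL);
    bool sig_arrived = t->waiting_for_sig >= 0
                    && t->waiting_for_sig < MAX_SIGS
                    && ((got_sigs >> t->waiting_for_sig) & 1u) != 0;
    return sig_arrived || t->waiting_for_time <= now;
}
```

A thread that names no signal (NO_SIGNAL) falls back to the plain timed pause,
which is how the existing Pause callers would keep their current behaviour.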
>Nor your changes.
>However you might be the only user of user threads.
>Is there any downside to your change?
> (Other than more code, ok.)
>Any loss of performance in any situation?
>Any loss of portability?
I am not sure. Is SIGCHLD completely standard?
I am worried about one thing. The thread switcher turns signals back
on as its first action. If a large number of children exit at the
same time, I fear that we get a very deeply nested signal stack, which
doesn't sound good. Perhaps signals should not be turned back on
immediately. Or SIGCHLD should be set to SIG_IGN and then re-enabled?
(Simulating old-fashioned "unreliable" SysV signals!) I'm not sure.
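
One way to explore the "do not turn signals back on immediately" option is to
let sigaction() keep both signals blocked while the handler runs, by listing
them in sa_mask. The sketch below shows only that mechanism. The handler and
helper names are made up for illustration, and it assumes a handler that
returns normally; the real thread switcher may not, which is presumably why it
re-enables signals itself, so this is not a drop-in change.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t chld_seen = 0;

/* Illustrative handler: only counts SIGCHLD deliveries. */
static void on_signal(int sig)
{
    if (sig == SIGCHLD)
        chld_seen++;
}

/* Install one handler for SIGVTALRM and SIGCHLD. Both signals are listed in
   sa_mask, so neither can interrupt the handler while it is running; they
   stay pending until the handler returns. Returns 0 on success, -1 on error. */
static int install_non_nesting_handlers(void)
{
    struct sigaction act;
    memset(&act, 0, sizeof act);
    act.sa_handler = on_signal;
    act.sa_flags = SA_RESTART;
    sigemptyset(&act.sa_mask);
    sigaddset(&act.sa_mask, SIGVTALRM);
    sigaddset(&act.sa_mask, SIGCHLD);
    if (sigaction(SIGVTALRM, &act, NULL) != 0) return -1;
    if (sigaction(SIGCHLD, &act, NULL) != 0) return -1;
    return 0;
}

/* Debug helper: abort if the installed SIGCHLD action lacks the mask. */
static void check_mask_installed(void)
{
    struct sigaction cur;
    int rc = sigaction(SIGCHLD, NULL, &cur);
    assert(rc == 0);
    assert(sigismember(&cur.sa_mask, SIGVTALRM) == 1);
    assert(sigismember(&cur.sa_mask, SIGCHLD) == 1);
    (void)rc;
}
```

Standard (non-realtime) signals are also not queued: several SIGCHLDs arriving
while the signal is blocked collapse into one pending instance, so a burst of
exiting children cannot by itself stack up one handler frame per child in this
arrangement.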
In any case I was just trying to parallelize the compiler and ran into
some very odd things related to having lots of child processes running
around...
The SIGCHLD change speeds up the CM3 compiler by about a factor of 3
when using user threads.
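
For readers who think in C rather than in the Modula-3 runtime, the loop
behind that speedup can be sketched as follows. This is a swapped-in
technique, not the patch: sigtimedwait() plays the role of SigPause, the
function names are hypothetical, and sigtimedwait() is not available on every
Unix.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static void noop_handler(int sig) { (void)sig; }

/* Call once before forking. Installs a do-nothing SIGCHLD handler (so the
   signal is not discarded as "ignored by default") and blocks SIGCHLD so it
   stays pending until sigtimedwait() collects it. Returns 0 on success. */
static int prepare_sigchld(void)
{
    struct sigaction act;
    sigset_t set;
    act.sa_handler = noop_handler;
    act.sa_flags = 0;
    sigemptyset(&act.sa_mask);
    if (sigaction(SIGCHLD, &act, NULL) != 0) return -1;
    sigemptyset(&set);
    sigaddset(&set, SIGCHLD);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/* Poll for the child; between polls, sleep until SIGCHLD arrives or
   max_delay_sec elapses, whichever comes first. Returns what waitpid()
   returned (the pid, or -1 on error). */
static pid_t wait_process(pid_t pid, int *status, time_t max_delay_sec)
{
    sigset_t set;
    struct timespec delay = { max_delay_sec, 0 };
    assert(status != NULL);
    sigemptyset(&set);
    sigaddset(&set, SIGCHLD);
    for (;;) {
        pid_t r = waitpid(pid, status, WNOHANG);
        if (r != 0)
            return r;
        /* Signal or timeout: either way, go round and poll again. */
        (void)sigtimedwait(&set, NULL, &delay);
    }
}
```

Because SIGCHLD is blocked before the child is created, a child that exits
between the waitpid() poll and the sigtimedwait() call leaves the signal
pending, so the wait returns at once instead of sleeping out the full delay.
That is why the delay can be made long (10 seconds in the patch) without
hurting latency.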
Mika
>
> - Jay
>
>> To: m3devel at elegosoft.com
>> Date: Sat, 12 Feb 2011 12:43:02 -0800
>> From: mika at async.caltech.edu
>> Subject: [M3devel] Performance issues with Process.Wait under userthreads
>>
>> Hi again m3devel (especially Tony),
>>
>> I have finally taken a bite of a problem that has been annoying me
>> for a very, very long time.
>>
>> Under user threads, the code for Process.Wait is as follows (see
>> ThreadPosix.m3):
>>
>> PROCEDURE WaitProcess (pid: int; VAR status: int): int =
>>   (* ThreadPThread.m3 and ThreadPosix.m3 are very similar. *)
>>   CONST Delay = 0.1D0;
>>   BEGIN
>>     LOOP
>>       WITH r = Uexec.waitpid(pid, ADR(status), Uexec.WNOHANG) DO
>>         IF r # 0 THEN RETURN r END;
>>       END;
>>       Pause(Delay);
>>     END;
>>   END WaitProcess;
>>
>> It inserts a 0.1 second delay after each failed waitpid. This is extremely
>> annoying for programs that start a long sequence of child processes and
>> wait for them in sequence. Namely, the compiler itself. As a result
>> the cm3 compiler (and PM3's m3build) are normally very very slow when
>> using user threads. For about the last ten years, I've had a hacked
>> up m3build (for my PM3 installation) that skips the Pause and busy-waits
>> instead.
>>
>> Note there is another problem here. Since the Modula-3 runtime ignores
>> SIGCHLD, no zombie processes are created since the Unix system automatically
>> reaps the child processes. I can see this would be a problem since PIDs
>> are eventually reused and ... couldn't Uexec.waitpid wind up referring to
>> the wrong process??
>>
>> I will further note that the comment in Process.i3 reads as follows:
>>
>> PROCEDURE Wait(p: T): ExitCode;
>>
>> Wait until the process with handle p terminates, then free the operating
>> system resources associated with the process and return an exit code
>> indicating the reason for its termination. It is a checked runtime error
>> to call Wait twice on the same process handle.
>>
>> I am going to take this as fair warning that Process.Create *may* use
>> resources that are not going to be released until Process.Wait has been
>> called.
>>
>> I have modified (in my local copy of CM3) the system as follows.
>> I have come up with a semi-general mechanism for immediately unblocking
>> a thread on the receipt of a unix signal.
>>
>> 1. The system relies on changing
>> ThreadPosix.XPause such that if a signal is allowed to wake up a thread,
>> that fact is recorded in a new field in the thread's descriptor
>> record (of type ThreadPosix.T).
>>
>> 2. On receipt of a waited-for unix signal, a mask is set and control
>> is passed to the thread scheduler which maintains the non-zero mask for
>> exactly one iteration through the thread ring.
>>
>> 3. If a thread is paused and waiting for EITHER a signal or some time,
>> the thread is released for running and the thread's waiting state is
>> cleared.
>>
>> The changes are more or less as follows:
>>
>> 1. I have added a new field of type "int" to ThreadPosix.T:
>>
>>     (* if state = pausing, the time at which we can restart *)
>>     waitingForTime: Time.T;
>>
>> +   (* if state = pausing, the signal that truncates the pause *)
>> +   waitingForSig: int := -1;
>> +
>>     (* true if we are waiting during an AlertWait or AlertJoin
>>        or AlertPause *)
>>     alertable: BOOLEAN := FALSE;
>>
>>
>> 2. Modifications to pause:
>>
>> + PROCEDURE SigPause(n: LONGREAL; sig: int)=
>> +   <*FATAL Alerted*>
>> +   VAR until := n + Time.Now ();
>> +   BEGIN
>> +     XPause(until, FALSE, sig);
>> +   END SigPause;
>> +
>>   PROCEDURE AlertPause(n: LONGREAL) RAISES {Alerted}=
>>     VAR until := n + Time.Now ();
>>     BEGIN
>>       XPause(until, TRUE);
>>     END AlertPause;
>>
>> ! PROCEDURE XPause (READONLY until: Time.T; alertable := FALSE; sig:int := -1)
>> !   RAISES {Alerted} =
>>     BEGIN
>>       INC (inCritical);
>>       self.waitingForTime := until;
>>       self.alertable := alertable;
>> +     IF sig # -1 THEN
>> +       self.waitingForSig := sig
>> +     END;
>>       ICannotRun (State.pausing);
>>       DEC (inCritical);
>>       InternalYield ();
>>
>> 3. The received-signals mask:
>>
>> ! CONST MaxSigs = 64;
>> ! TYPE Sig = [ 0..MaxSigs-1 ];
>> !
>> ! (* in order to listen to other signals, they have to be enabled in
>> !    allow_sigvtalrm as well *)
>> ! VAR (*CONST*) SIGCHLD := ValueOfSIGCHLD();
>> !
>> ! gotSigs := SET OF Sig { };
>> !
>>
>> ValueOfSIGCHLD() is a C function used to get the value of the SIGCHLD
>> constant without guessing at it (in ThreadPosixC.c).
>>
>> 4. changes to the signal handler:
>>
>> ! PROCEDURE switch_thread (sig: int) RAISES {Alerted} =
>>     BEGIN
>>       allow_sigvtalrm ();
>> !
>> !     INC(inCritical);
>> !     (* mark signal as being delivered *)
>> !     IF sig >= 0 AND sig < MaxSigs THEN
>> !       gotSigs := gotSigs + SET OF Sig { sig }
>> !     END;
>> !     DEC(inCritical);
>> !
>> !     IF inCritical = 0 AND heapState.inCritical = 0 THEN
>> !       InternalYield ()
>> !     END;
>>     END switch_thread;
>>
>> Note that I don't know if INC/DEC(inCritical) does exactly the right
>> thing here.
>>
>> 5. changes to the scheduler:
>>
>> a. thread wakeup
>>     IF t.alertable AND t.alertPending THEN
>>       CanRun (t);
>>       EXIT;
>> +
>> +   ELSIF t.waitingForSig IN gotSigs THEN
>> +     t.waitingForSig := -1;
>> +     CanRun(t);
>> +     EXIT;
>>
>>     ELSIF t.waitingForTime <= now THEN
>>       CanRun (t);
>>       EXIT;
>> !
>>
>> b. clearing the mask
>>
>>       END;
>>     END;
>>
>> +   gotSigs := SET OF Sig {};
>> +
>>     IF t.state = State.alive AND (scanned OR NOT someBlocking) THEN
>>       IF perfOn THEN PerfRunning (t.id); END;
>>       (* At least one thread wants to run; transfer to it *)
>>
>> 6. changes to WaitProcess (Process.Wait):
>>
>>   PROCEDURE WaitProcess (pid: int; VAR status: int): int =
>>     (* ThreadPThread.m3 and ThreadPosix.m3 are very similar. *)
>> !   CONST Delay = 10.0D0;
>>     BEGIN
>>       LOOP
>>         WITH r = Uexec.waitpid(pid, ADR(status), Uexec.WNOHANG) DO
>>           IF r # 0 THEN RETURN r END;
>>         END;
>> !       SigPause(Delay,SIGCHLD);
>>       END;
>>     END WaitProcess;
>>
>> 7. install signal handler even if program is single-threaded:
>>
>>   BEGIN
>> +   (* we need to set up the signal handler for other reasons than
>> +      just thread switching now *)
>> +   setup_sigvtalrm (switch_thread);
>>   END ThreadPosix.
>>
>> 8. modify signal handler in ThreadPosixC.c to catch SIGCHLD:
>>
>>   sigemptyset(&ThreadSwitchSignal);
>>   sigaddset(&ThreadSwitchSignal, SIG_TIMESLICE);
>> + sigaddset(&ThreadSwitchSignal, SIGCHLD);
>>
>>   act.sa_handler = handler;
>>   act.sa_flags = SA_RESTART;
>>   sigemptyset(&(act.sa_mask));
>>   if (sigaction (SIG_TIMESLICE, &act, NULL)) abort();
>> + if (sigaction (SIGCHLD, &act, NULL)) abort();
>>
>> I'll send the complete diff in a separate message for those who want
>> to study it more closely.
>>
>> I propose the above changes for inclusion in the current CM3 repository.
>>
>> Mika
>>
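
On the zombie question raised in the quoted message: the small C experiment
below shows what waitpid() reports when SIGCHLD is explicitly ignored. It
assumes a system that follows POSIX.1-2001 here (Linux does) and assumes the
runtime's "ignoring" means a SIG_IGN disposition; with the default disposition
children do become zombies until they are waited for. The function name is
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits at once with `code`, then try to collect it while
   SIGCHLD is set to SIG_IGN. Returns the errno reported by waitpid(), 0 if
   waitpid() unexpectedly succeeded, or -1 if the setup itself failed. */
static int wait_with_sigchld_ignored(int code)
{
    struct sigaction ign, old;
    int status = 0;
    int err;
    pid_t pid, r;

    assert(code >= 0 && code <= 255);
    ign.sa_handler = SIG_IGN;
    ign.sa_flags = 0;
    sigemptyset(&ign.sa_mask);
    if (sigaction(SIGCHLD, &ign, &old) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        sigaction(SIGCHLD, &old, NULL);
        return -1;
    }
    if (pid == 0)
        _exit(code);            /* child: terminate immediately */

    /* The kernel reaps the child by itself, so its exit status is gone:
       waitpid() waits for the child to finish and then fails with ECHILD. */
    errno = 0;
    r = waitpid(pid, &status, 0);
    err = (r == -1) ? errno : 0;

    sigaction(SIGCHLD, &old, NULL);  /* restore the previous disposition */
    return err;
}
```

In this setup the exit code never reaches the parent, and once the PID has
been released nothing ties that number to the original child any more, which
is the reuse hazard the message worries about.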