[M3devel] variations of waitpid..?

Mika Nystrom mika at async.caltech.edu
Mon Jan 5 12:23:19 CET 2009


This is all really inside Process.Wait, right?

On user threads: I can report (from having tried it) that changing
the Wait to a small number doesn't really work.  Once you have
yielded, as far as the OS is concerned, you've already lost (at
least on FreeBSD).  The only thing that really works is busy waiting.

The "correct" thing to do on unix is to catch SIGCHLD (SIGCLD?)...
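
A minimal sketch of the SIGCHLD idea, in Python rather than Modula-3 and not the code mentioned in this message. It shows only the signal mechanics: the caller sleeps until the kernel reports a child state change, so no polling interval is involved. The names reap_on_sigchld and run_child are hypothetical; a user-thread scheduler would presumably have its handler wake the waiting thread instead of blocking in sigwait as done here.

```python
import os
import signal


def reap_on_sigchld(pid):
    """Sleep until SIGCHLD arrives, then reap `pid` and return its exit code.

    Assumes SIGCHLD is blocked in the calling thread, so a signal that
    arrives before sigwait() stays pending instead of being lost.
    """
    while True:
        reaped, status = os.waitpid(pid, os.WNOHANG)
        if reaped == pid:
            if os.WIFEXITED(status):
                return os.WEXITSTATUS(status)
            return -os.WTERMSIG(status)  # child was killed by a signal
        signal.sigwait({signal.SIGCHLD})  # wakes when a child changes state


def run_child(exit_code):
    """Hypothetical demo: fork a child that exits at once, then reap it.

    Must run in the main thread, because signal.signal() requires that.
    """
    # A no-op handler keeps SIGCHLD away from its default "ignore" disposition.
    old_handler = signal.signal(signal.SIGCHLD, lambda signum, frame: None)
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(exit_code)  # child: exit without running parent cleanup
        return reap_on_sigchld(pid)
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        if old_handler is not None:
            signal.signal(signal.SIGCHLD, old_handler)
```

Blocking the signal before the fork is what closes the race between "check whether the child is done" and "go to sleep".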

I started coding this up once, but I'm not sure I shared my code
with anyone?  I think it was Tony that pointed out that it was
unnecessary with pthreads...

     Mika

P.S. Happy New Year, everyone.



Jay writes:
>I didn't read that well -- the deadlock risk in sysutils is not there.
>The bad perf probably is.
>
>To be clear (esp. for folks that might not already know about this,
>I didn't know about it until fairly recently), there are two options:
>
>waitpid(pid, flags = 0)
>move along
>
>or
>while (waitpid(pid, flags = nohang) == 0)
>   sleep(some value)
>
>The second is what sysutils was doing, and it works, doesn't deadlock,
>but is deoptimized.
>m3core/libm3 did this for a long time as well. When I complained about
>perf, it was pointed out to me.
>
>The first has deadlock potential with userthreads, but is ok and faster
>with kernel threads.
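
The two options above can be sketched as follows. This is an illustration in Python, not the m3core or sysutils code; wait_blocking, wait_polling, spawn and the 0.1 s default interval are names and values assumed here. With the nohang flag, waitpid returns 0 while the child is still running and the child's pid once it has been reaped.

```python
import os
import time


def exit_code(status):
    """Decode a waitpid status into an exit code (negative if signalled)."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


def wait_blocking(pid):
    """Option 1: flags = 0. The caller blocks until the child ends."""
    _, status = os.waitpid(pid, 0)
    return exit_code(status)


def wait_polling(pid, interval=0.1):
    """Option 2: flags = nohang, sleeping between polls."""
    while True:
        reaped, status = os.waitpid(pid, os.WNOHANG)
        if reaped != 0:  # 0 means "still running"
            return exit_code(status)
        time.sleep(interval)  # may overshoot the child's actual exit


def spawn(code):
    """Hypothetical helper: fork a child that exits at once with `code`."""
    pid = os.fork()
    if pid == 0:
        os._exit(code)
    return pid
```

For example, wait_blocking(spawn(3)) and wait_polling(spawn(3)) return the same exit code; they differ only in how the caller spends the time until the child is done.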
>
>waitpid(flags = nohang = don't actually wait, just get the exit code,
>if there is one) is a way to quickly poll if a process has ended, and
>if so, get its exit code. In Win32 there are two separate functions,
>GetExitCodeProcess and WaitForSingleObject or WaitForMultipleObjects
>("waiting" is generalized across files, processes, sockets, threads,
>semaphores, mutexes, events and more.. but not critical sections.. only
>kernel objects). These have a bug too. One particular exit code is
>reserved to mean "the process is still running", but that is easily
>avoided by using Wait first. I have seen code get confused by this
>though. Wait also accepts a timeout, 32-bit unsigned milliseconds,
>including 0 and infinity, so it also can be used to poll. Win32 also
>defines the exit code to be 32 bits, whereas Posix only allows for 8
>bits, which can be an interop problem. Perl on Win32 truncates exit
>codes to 8 bits, very bad. Unhandled exceptions end up as "large" exit
>codes.
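
The 8-bit limit on the Posix side is easy to observe. A small sketch (observed_exit_code is a hypothetical name): a child that exits with a value above 255 is seen by waitpid with only the low 8 bits. The value 259 is used because it is the Win32 STILL_ACTIVE code, the reserved "still running" value mentioned above.

```python
import os


def observed_exit_code(requested):
    """Fork a child that exits with `requested`; return what waitpid reports."""
    pid = os.fork()
    if pid == 0:
        os._exit(requested)  # child: only the low 8 bits reach the parent
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

Here observed_exit_code(259) yields 3 and observed_exit_code(256) yields 0, so a "large" Win32-style exit code cannot survive a round trip through waitpid.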
>
>Anyway...
>
>The problem with the polling approach, at least part of the problem, is
>that if the child process isn't done when waitpid is first called, but
>finishes before sleep(whatever value) ends, we will still sleep for the
>full "whatever value". You only really want to sleep until the child
>process is done, and no longer.
>
>Making just the first sleep shorter might be a good idea.
>You know, to handle processes that are short-lived, but not "zero"
>lived.
>("zero" being the amount of time it takes for the code to proceed from
>fork/exec to waitpid, surely much smaller than a small sleep() but
>longer than no sleep).
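
One way to sketch the "shorter first sleep" idea is a polling loop whose delay starts small and grows up to a cap. The doubling schedule, the 1 ms and 100 ms values, and the names wait_with_backoff and spawn_child are assumptions for illustration; the message above only suggests shortening the first sleep.

```python
import os
import time


def wait_with_backoff(pid, first=0.001, maximum=0.1, sleep=time.sleep):
    """Poll with nohang; sleep briefly at first, then progressively longer."""
    delay = first
    while True:
        reaped, status = os.waitpid(pid, os.WNOHANG)
        if reaped != 0:
            if os.WIFEXITED(status):
                return os.WEXITSTATUS(status)
            return -os.WTERMSIG(status)
        sleep(delay)
        delay = min(delay * 2, maximum)  # overshoot is bounded by `maximum`


def spawn_child(code, lifetime):
    """Hypothetical helper: fork a child that lives `lifetime` seconds."""
    pid = os.fork()
    if pid == 0:
        time.sleep(lifetime)
        os._exit(code)
    return pid
```

A short-lived child is then noticed within a few milliseconds, while a long-running one costs no more polls per second than the fixed-interval loop with the same cap.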
>
>Calling just waitpid(flags = 0) could deadlock if, for example, a
>parent thread is writing to a child's stdin, and the child won't finish
>until the parent has written all that it needs to. The parent and child
>process, er, other threads in the parent process, need to be allowed to
>run concurrently, for the sake of at least some reasonable scenarios.
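
The scenario above can be sketched with kernel threads, where it does not deadlock. In this illustration a plain pipe stands in for the child's stdin, and feed_child_and_wait is a hypothetical name. The main thread sits in a blocking waitpid while a second thread feeds the child; under user threads, the blocking waitpid would suspend the writer too, and a payload larger than the pipe buffer would never be fully written.

```python
import os
import threading


def feed_child_and_wait(payload):
    """Fork a child that drains a pipe; feed it from a thread while waiting."""
    read_end, write_end = os.pipe()
    pid = os.fork()  # fork before any extra thread exists
    if pid == 0:
        try:
            os.close(write_end)
            total = 0
            while True:
                chunk = os.read(read_end, 65536)
                if not chunk:  # EOF: the parent closed its end
                    break
                total += len(chunk)
            os._exit(0 if total == len(payload) else 1)
        except BaseException:
            os._exit(2)
    os.close(read_end)

    def writer():
        # Closing the pipe on exit delivers EOF to the child.
        with os.fdopen(write_end, "wb") as pipe:
            pipe.write(payload)

    thread = threading.Thread(target=writer)
    thread.start()
    _, status = os.waitpid(pid, 0)  # blocks this thread only
    thread.join()
    return os.WEXITSTATUS(status)
```

Calling it with about 1 MiB of data, well above a typical pipe buffer, forces the writer to block partway through, which is exactly the situation where the waiter must not stop the other threads.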
>
>With kernelthreads, the implementation of waitpid knows about threads
>and will itself, in a sense, do the poll/sleep, but not exactly that --
>it won't sleep beyond the child process finishing.
>
>Hopefully this makes sense and lets more folks understand the problem.
>
>What you can do, of course, is like:
>
>if kernelthreads
>  waitpid(flags = 0)
>else
>  while (waitpid(flags = nohang) == 0)
>    sleep
>
>and that is basically what the code looks like now.
>
>The part "if kernelthreads" I propose be
>"if SchedulerPosix.DoesWaitPidYield()",
>though a really direct "if Thread or Scheduler.KernelThreads" might be
>reasonable.
>Up to folks then to decide what that implies..
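
A sketch of that dispatch, with the strategy chosen by a flag. The parameter waitpid_yields stands in for the proposed SchedulerPosix.DoesWaitPidYield(); it, wait_for_child and the 0.1 s interval are assumptions here, and this is not the actual m3core code.

```python
import os
import time


def wait_for_child(pid, waitpid_yields, interval=0.1):
    """Block in waitpid if that lets other threads run; otherwise poll."""
    if waitpid_yields:
        # Kernel threads: only the calling thread blocks.
        _, status = os.waitpid(pid, 0)
    else:
        # User threads: never block the whole process; poll and sleep.
        while True:
            reaped, status = os.waitpid(pid, os.WNOHANG)
            if reaped != 0:
                break
            time.sleep(interval)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

Keeping the choice behind a single predicate means callers such as sysutils would not need to know which thread implementation is in use.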
>
> - Jay
>
>> Date: Fri, 2 Jan 2009 11:27:24 +0100
>> From: wagner at elegosoft.com
>> To: m3devel at elegosoft.com
>> Subject: Re: [M3devel] variations of waitpid..?
>>
>> Quoting Tony Hosking <hosking at cs.purdue.edu>:
>>
>> > If someone uses waitpid they get what they paid for.
>> It is so long ago that we wrote those sysutils routines...
>> They have only ever be used in simple command line utilities (like cm3)
>> without much concurrency, I think. If there is potential for deadlocks
>> and bad performance, we should at least document that in the interfaces.
>>
>> I am not up-to-date wrt. the M3 system interfaces and threads
>> implementation: is there a way for a thread to wait for the exit code
>> of another process without blocking other threads? If so, I'll adapt
>> the sysutils code... If not, can we introduce such an interface in
>> m3core/libm3?
>>
>> Olaf
>>
>> > On 1 Jan 2009, at 06:24, Jay wrote:
>> >
>> >>
>> >> You mean, this function is easy to misuse?
>> >>> People who declare their own <*EXTERNAL*>
>> >> Like waitpid exposed from m3core?
>> >>
>> >> waitpid is already easy to misuse, on a userthread system, leading
>> >> to possible (though I think rare) deadlock.
>> >> It is easy to misuse on pthreads, lead "just" to bad performance,
>> >> and in fact I believe cm3 is doing this, via sysutils.
>> >> This at least guides you between two patterns of use, and fix the
>> >> perf of cm3/sysutils.
>> >>
>> >> On a userthread system, waitpid(pid, flags = 0) waits for the child
>> >> process, with all parent threads suspended.
>> >> Generally I doubt the child depends on parent threads progressing,
>> >> but, yeah, that could deadlock, like if a parent thread is waiting
>> >> to a file or stdin of the child, or reading a child's stdout.
>> >>
>> >> The various uses do waitpid(pid, flags = nohang) and then sleep and
>> >> try again.
>> >>
>> >> pthreads just uses waitpid(pid, flags = 0) and all threads keep running
>>
>> --
>> Olaf Wagner -- elego Software Solutions GmbH
>> Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
>> phone: +49 30 23 45 86 96 mobile: +49 177 2345 869 fax: +49 30 23 45 86 95
>> http://www.elegosoft.com | Geschäftsführer: Olaf Wagner | Sitz: Berlin
>> Handelregister: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194


