[M3devel] possible cygwin createprocess/fork cost measurements..

Jay jayk123 at hotmail.com
Mon Mar 17 15:21:05 CET 2008


I believe cm3 is affected by this. I don't have numbers yet.
 
I propose some fairly obvious, small, simple, and safe changes that will likely give a large speedup on NT386GNU.
I am skeptical that existing functions can be changed in a compatible enough way.
 
So I propose, roughly:
 
Add this to Process.i3:
 
PROCEDURE Spawn(cmd: Pathname.T; READONLY params: ARRAY OF TEXT): T RAISES {OSError.E};
(* A restricted form of Create that is much faster on Cygwin. *)
 
The name is very iffy.
It could in fact stay out of the public interface; Create could merely notice when wd = stdin = stdout = stderr = nil.
It could probably be less limited than shown.
Probably all of the parameters are settable, by altering the parent's globals, within a critical section.
Environment certainly is settable.
It is tempting to leave it this limited, though, so that it could perhaps be implemented with system.
(It turns out Cygwin system is slower than spawnve, which is surprising since system is the most limited of the exec/spawn variants. I think this is because system has an implied sh wrapper and the others do not.)
The intent is simple and obvious: some path to spawnve or spawnvpe.
The "p" variant has path search.
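
To illustrate the "alter the parent's globals within a critical section" idea, here is a minimal Python sketch, not Modula-3 and not the proposed implementation. It assumes a spawn primitive with no working-directory parameter (modelled here by subprocess.Popen called without cwd); the names spawn_in_dir and _spawn_lock are hypothetical.

```python
import os
import subprocess
import threading

# Hypothetical lock guarding process-wide state (the parent's "globals").
_spawn_lock = threading.Lock()

def spawn_in_dir(argv, wd, stdout=None):
    """Start argv with working directory wd, using a spawn call that has
    no working-directory parameter of its own."""
    with _spawn_lock:              # critical section: the cwd is process-wide
        saved = os.getcwd()
        os.chdir(wd)               # the child inherits the cwd at creation
        try:
            return subprocess.Popen(argv, stdout=stdout)
        finally:
            os.chdir(saved)        # restore before releasing the lock
```

The catch is that any other thread depending on the current directory has to take the same lock, which is one reason to keep the fast path limited.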
 
On all but Cygwin, this limited Create/Spawn will just call the normal Create. (Even on Win32).
On Cygwin it will call spawnvpe (or spawnve if people really want, but "p" seems "better").
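
The dispatch rule described above can be modelled in a few lines of Python. This is only a sketch of the decision, it starts no processes; the function name choose_backend and the use of Python's sys.platform strings ("cygwin", "win32", "linux") as stand-ins for the Modula-3 target are assumptions.

```python
def choose_backend(platform, wd=None, stdin=None, stdout=None, stderr=None):
    """Return the name of the primitive a restricted Create/Spawn would use."""
    # The fast path applies only when nothing about the child is customized.
    restricted = all(x is None for x in (wd, stdin, stdout, stderr))
    if platform == "cygwin" and restricted:
        return "spawnvpe"          # Cygwin fast path, with path search
    return "Process.Create"        # everywhere else, including Win32
```

For example, choose_backend("cygwin") picks "spawnvpe", while choose_backend("cygwin", wd="/tmp") or choose_backend("win32") falls back to "Process.Create".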
 
Now, in Quake, all the existing exec variants wrap the command line in either sh or cmd or command.com.
Changing that is probably very dangerous, even with an attempt to discern if the wrapper buys anything, on a command line by command line basis.
For example, one might think that if none of the characters * ? | < > % $ & ; ( ) appear in the command, the shell wrapper buys nothing and could be removed from existing paths. However, that's not true: for example, system("echo foo") depends on a shell wrapper to run the builtin "echo" (at least on Windows, there is no echo.exe).
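
A minimal Python sketch of such a heuristic, including the builtin problem. The metacharacter set is the one listed above; the builtin list is a hypothetical, incomplete sample, which is exactly why the check cannot be trusted on existing paths.

```python
# Characters from the list above that only a shell would interpret.
SHELL_METACHARS = set("*?|<>%$&;()")

# Hypothetical and incomplete: commands that exist only inside the shell.
SHELL_BUILTINS = {"echo", "cd", "set", "dir", "copy", "del"}

def shell_wrapper_probably_needed(command):
    """Guess whether a command line depends on being run through a shell."""
    if any(ch in SHELL_METACHARS for ch in command):
        return True
    words = command.split()
    # No metacharacters, but the program itself may be a shell builtin.
    return bool(words) and words[0].lower() in SHELL_BUILTINS
```

Here "cm3cg -quiet foo.ic -o foo.s" is classified as not needing a shell, while "as foo.s > log" and "echo foo" both are. Any builtin missing from the sample list would be misclassified.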
 
I think there's no choice but to add a new Quake function, spawn, or limited_exec, or fast_exec, or process_runfast, exec_noshell, or something.
Again I'm not sure what to call it, but it'd simply call Process.Spawn, or Process.Create with the right parameters to get into the Cygwin fast path.
For now I'm going with Process.Spawn and fast_exec.
I hope to have numbers "soon" as to the perf change.
 
Another good option here, which I tried in the past and failed at, is to implement Quake exec using Win32 CreateProcess instead of Cygwin spawn/exec. Part of it is not difficult to solve, part of it is. There are at least two sets of problems here. One is that the existing code returns a File.T, for which there are Posix and Win32 types, and Cygwin uses the Posix one. You'd have to warp the code somehow here. I gave up at that point without much trying. It's not that much code, though. Cygwin itself uses CreateProcess underneath, of course.
The other problem is on the input side, the interpretation of the command line. Again, this is the value that Cygwin presently provides (albeit sometimes at great cost).
 
Of course another angle is to work on Cygwin to make vfork efficient. It is presently implemented by calling fork.
There is #ifdef'ed code for another path, but it appears not to be enabled.
 
I know polluting the system just for the sake of Cygwin isn't great, however:
 - I expect the win is quite large 
 - "spawn*" is a pretty old thing, nothing new/controversial here, long known as an often viable replacement for fork+exec at least on Windows.
    It's in msvc*.dll for example.
 
There may even be wins to be had on other Posix systems by avoiding the sh wrapper?
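
One way to check this on another Posix system is to time the same command run directly and run through sh -c. Below is a minimal Python sketch; the helper names and the choice of `true` as the command are assumptions for illustration and have nothing to do with how cm3 or Quake actually launch processes.

```python
import shlex
import subprocess
import time

def time_runs(argv, count):
    """Run argv count times and return the total wall-clock seconds."""
    start = time.perf_counter()
    for _ in range(count):
        subprocess.run(argv, check=True)
    return time.perf_counter() - start

def compare(command, count=20):
    """Time command run directly and run through an sh -c wrapper."""
    direct = time_runs(command, count)
    wrapped = time_runs(["sh", "-c", shlex.join(command)], count)
    return direct, wrapped

if __name__ == "__main__":
    direct, wrapped = compare(["true"])
    print(f"direct: {direct:.3f} s, via sh: {wrapped:.3f} s")
```

The numbers will vary a lot by system and by which sh is installed, so this only shows whether the wrapper is worth worrying about on a given machine.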
 
"batching" where cm3cg is run once per directory seems like a very good idea and worth trying; the problem is, that still leaves the assembler.
Perhaps the assembler could be linked in statically to cm3cg? Probably, but not particularly easily and probably unpopular upstream...
Unless maybe some nice gcc perf gains could be demonstrated?
 
 - Jay


From: jayk123 at hotmail.com
To: m3devel at elegosoft.com
Subject: possible cygwin createprocess/fork cost measurements..
Date: Mon, 17 Mar 2008 10:14:50 +0000


I ran some mostly scientific measures of Cygwin.
On one machine, no reboots, one OS, one set of files.
x86, single proc, Windows 2000 (I'll go back to XP soon).

It shows that..well, at least that wrapping Cygwin processes with sh is VERY expensive.
Like, the data isn't yet complete, but this could cut building Cygwin libm3 from around 100 seconds to around 20 seconds.
Not counting the Modula-3 front end time. Just cm3cg+as.

cd libm3\NT386GNU
 having already built successfully, all the *.ic *.mc files are present

cm3cg not wrapped with sh (F1)
 Repeated runs.
  28 seconds (other stuff running on machine)
  16 seconds
  13 seconds (13.?)
  13.8 seconds
  14.01 seconds
  13.3 seconds
 now add the -o flag
  13.64 seconds
  14.07 seconds
 now without echoing
  13.22 seconds
  13.18 seconds

cm3cg wrapped with sh (F2)
  51 seconds
  51.35 seconds
  51.19 seconds
  50.88 seconds
 now add the -o flag
  51.76 seconds
 now without echoing
  51.05 seconds

These runs did NOT have -o flags, but subsequent runs with -o were about the same.
I added -o so I could run the as variations.

now the same with .s
note that due to the way the above worked, I just have *.s files, and not the usual *.is and *.ms

as not wrapped with sh (F3)
  5.6 seconds
  5.28 seconds
 now remove echo
  5.08 seconds
  5.08 seconds
  5.04 seconds
 forgot -o flag, oh well, enough data

as wrapped with sh (F4)
  43 seconds
  43.56 seconds
 forgot -o flag, oh well, enough data

What is not yet confirmed is:
 1) Does cm3 wrap everything with sh?
 2) Does calling m3cg/as from cm3 have these costs?

Very clear:
 Wrapping stuff with sh on Cygwin is expensive!

Actions:
 Confirm this cost is being paid by cm3.
 Either:
  1) implement some "batch modes" in cm3 and/or cm3cg
  2) or maybe, um, just make sure that cm3 does not wrap with sh, and
     if cm3 itself causes this slowdown, because of how Cygwin works, try
     interposing a small Win32 helper app. I think Cygwin handles running
     Win32 apps and being run from Win32 apps differently than Cygwin running
     Cygwin, i.e. not slowly. I'll see. Could be that creating twice the number
     of processes, in order to avoid Cygwin running Cygwin, could be faster.
     Not yet known.

Maybe use system() instead of vfork() + exec()? Odd, though, vfork instead of fork is supposed to help.

Here is the test code, you edit it to run one case or another:

  @if not "%1" == "" goto :%1
  @rem \cygwin\bin\time cmd /c %~f0 F1
  @rem \cygwin\bin\time cmd /c %~f0 F2
  @rem \cygwin\bin\time cmd /c %~f0 F3
  @\cygwin\bin\time cmd /c %~f0 F4
  @goto :eof

  :F1
  @echo off
  @del *s *o
  for %%a in (*.ic *.mc) do cm3cg -quiet %%a -o %%a.s
  goto :eof

  :F2
  @echo off
  @del *s *o
  for %%a in (*.ic *.mc) do sh -c "cm3cg -quiet %%a -o %%a.s"
  goto :eof

  :F3
  @del *o
  @echo off
  for %%a in (*.s) do as %%a
  goto :eof

  :F4
  @del *o
  for %%a in (*.s) do sh -c "as %%a"
  goto :eof

 - Jay
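
As a rough check of the "around 100 seconds to around 20 seconds" estimate, the timings quoted above can be summarized with a median per case. This is just arithmetic on the numbers in the message; using the median (to damp the noisy first runs) and entering "13 seconds (13.?)" as 13.0 are assumptions of this sketch.

```python
from statistics import median

# Wall-clock seconds as reported in the message, all runs per case.
cm3cg_direct = [28, 16, 13.0, 13.8, 14.01, 13.3, 13.64, 14.07, 13.22, 13.18]  # F1
cm3cg_sh = [51, 51.35, 51.19, 50.88, 51.76, 51.05]                            # F2
as_direct = [5.6, 5.28, 5.08, 5.08, 5.04]                                     # F3
as_sh = [43, 43.56]                                                           # F4

# Total for cm3cg + as, with and without the sh wrapper.
direct_total = median(cm3cg_direct) + median(as_direct)
sh_total = median(cm3cg_sh) + median(as_sh)

print(f"direct:  {direct_total:.1f} s")
print(f"with sh: {sh_total:.1f} s")
print(f"ratio:   {sh_total / direct_total:.1f}x")
```

This gives about 18.8 seconds without sh against about 94.4 seconds with it, roughly a fivefold difference, which is in line with the estimate.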
