[M3devel] possible cygwin createprocess/fork cost measurements..
Jay
jayk123 at hotmail.com
Mon Mar 17 15:21:05 CET 2008
I believe cm3 is affected by this. I don't have numbers yet.
I propose some fairly obvious/small/simple/safe changes in order to likely achieve a large speed up in NT386GNU.
I am skeptical that existing functions can be changed in a compatible enough way.
So I propose, roughly:
Add this to Process.i3:
    PROCEDURE Spawn(cmd: Pathname.T; READONLY params: ARRAY OF TEXT): T RAISES {OSError.E};
    (* A restricted form of Create that is much faster on Cygwin. *)
The name is very iffy.
It could in fact not be in the public interface at all; Create could merely notice if wd = stdin = stdout = stderr = NIL.
It could probably be less limited than shown.
Probably all of the parameters are settable, by altering the parent's globals, within a critical section.
Environment certainly is settable.
It is tempting to leave it limited like this though, so that it could perhaps be implemented with system.
(It turns out Cygwin system is slower than spawnve; surprising since system is the most limited of the exec/spawn variants -- I think this is related to system having an implied sh wrapper while the others do not.)
The intent is simple and obvious -- some path to spawnve or spawnvpe.
The "p" variant does a path search.
On all but Cygwin, this limited Create/Spawn will just call the normal Create. (Even on Win32).
On Cygwin it will call spawnvpe (or spawnve if people really want, but "p" seems "better").
Now, in Quake, all the existing exec variants wrap the command line in either sh or cmd or command.com.
Changing that is probably very dangerous, even with an attempt to discern if the wrapper buys anything, on a command line by command line basis.
For example, if all of the characters * ? | < > % $ & ; ( ) are absent from the command, the shell wrapper probably doesn't buy anything and could be removed from existing paths. However that's not true -- for example system("echo foo") depends on a shell wrapper to run the builtin "echo" (at least on Windows, there is no echo.exe).
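The heuristic above, and the way it breaks on builtins, can be sketched in a few lines of Python. This is only an illustration, not Quake code: the function names are hypothetical, and the builtin list is an assumed, incomplete sample (a real shell has many more).

```python
# Characters named in the message whose presence suggests the shell is needed.
SHELL_METACHARS = set("*?|<>%$&;()")

# ASSUMPTION: a small illustrative sample of shell builtins, not a complete list.
ASSUMED_BUILTINS = {"echo", "cd", "set", "dir", "del", "copy"}


def has_shell_metachars(command: str) -> bool:
    """Return True if the command contains any of the listed metacharacters."""
    return any(ch in SHELL_METACHARS for ch in command)


def might_skip_shell(command: str) -> bool:
    """Guess whether the command could be run without a shell wrapper.

    The metacharacter test alone is not sufficient: a builtin such as
    "echo" has no executable of its own (at least on Windows), so the
    first word is also checked against the assumed builtin list.
    """
    words = command.split()
    if not words or has_shell_metachars(command):
        return False
    return words[0].lower() not in ASSUMED_BUILTINS


if __name__ == "__main__":
    for cmd in ("cm3cg -quiet a.ic -o a.s", "as a.s > log.txt", "echo foo"):
        print(cmd, "->", might_skip_shell(cmd))
```

Even with the builtin check, any list like this is bound to miss cases, which is the reason for proposing a separate, explicitly shell-free function instead of guessing.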
I think there's no choice but to add a new Quake function, spawn, or limited_exec, or fast_exec, or process_runfast, exec_noshell, or something.
Again I'm not sure what to call it, but it'd simply call Process.Spawn, or Process.Create with the right parameters to get into the Cygwin fast path.
For now I'm going with Process.Spawn and fast_exec.
I hope to have numbers "soon" as to the perf change.
Another good option here, which I tried in the past without success, and which is partly easy and partly not, is to implement Quake exec using Win32 CreateProcess instead of Cygwin spawn/exec. There are at least two sets of problems here. One is that the existing code returns a File.T, and for that there are the Posix and Win32 types; Cygwin uses Posix. You'd have to warp the code somehow here. I gave up at that point without much trying. It's not that much code though. Cygwin is using CreateProcess underneath, of course.
The other problem is on the input, the interpretation of the command line. Again this is the value that presently Cygwin provides (albeit sometimes with great cost).
Of course another angle is to work on Cygwin to make vfork efficient. It is presently implemented by calling fork.
There is #ifdef'ed code for another path but it appears not enabled.
I know polluting the system just for the sake of Cygwin isn't great, however:
- I expect the win is quite large
- "spawn*" is a pretty old thing, nothing new/controversial here, long known as an often viable replacement for fork+exec at least on Windows.
It's in msvc*.dll for example.
There may even be wins to be had on other Posix systems by avoiding the sh wrapper?
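Whether avoiding the sh wrapper helps on a given system can be checked with a small harness along these lines. It is a rough sketch, not cm3 or Quake code: it assumes sh is on the PATH, uses the Python interpreter itself as a stand-in for cm3cg/as, and the function names are made up for illustration. It says nothing Cygwin-specific unless it is actually run under Cygwin.

```python
import shlex
import subprocess
import sys
import time


def time_spawns(argv, count):
    """Run argv `count` times, waiting for each child; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(count):
        subprocess.run(argv, check=True)
    return time.perf_counter() - start


def compare(count=20):
    """Time a trivial child run directly and run through `sh -c`."""
    # Stand-in child process: a Python interpreter that exits immediately.
    child = [sys.executable, "-c", "pass"]
    direct = time_spawns(child, count)
    # Same child, but wrapped in a shell the way Quake's exec does.
    wrapped = time_spawns(["sh", "-c", shlex.join(child)], count)
    return direct, wrapped


if __name__ == "__main__":
    direct, wrapped = compare()
    print(f"direct: {direct:.2f}s  via sh: {wrapped:.2f}s")
```

The absolute numbers depend heavily on the machine and on how expensive the stand-in child is to start, so only the difference between the two figures is of interest.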
"batching" where cm3cg is run once per directory seems like a very good idea and worth trying; the problem is, that still leaves the assembler.
Perhaps the assembler could be linked in statically to cm3cg? Probably, but not particularly easily and probably unpopular upstream...
Unless maybe some nice gcc perf gains would be demonstrated?
- Jay
From: jayk123 at hotmail.com
To: m3devel at elegosoft.com
Subject: possible cygwin createprocess/fork cost measurements..
Date: Mon, 17 Mar 2008 10:14:50 +0000
I ran some mostly scientific measurements of Cygwin. On one machine, no reboots, one OS, one set of files. x86, single proc, Windows 2000 (I'll go back to XP soon).

It shows that... well, at least that wrapping Cygwin processes with sh is VERY expensive. The data isn't yet complete, but this could cut building Cygwin libm3 from around 100 seconds to around 20 seconds. Not counting the Modula-3 front end time. Just cm3cg+as.

cd libm3\NT386GNU
having already built successfully, all the *.ic *.mc files are present

cm3cg not wrapped with sh (F1)
  Repeated runs.
  28 seconds (other stuff running on machine)
  16 seconds
  13 seconds (13.?)
  13.8 seconds
  14.01 seconds
  13.3 seconds
  now add the -o flag
  13.64 seconds
  14.07 seconds
  now without echoing
  13.22 seconds
  13.18 seconds

cm3cg wrapped with sh (F2)
  51 seconds
  51.35 seconds
  51.19 seconds
  50.88 seconds
  now add the -o flag
  51.76 seconds
  now without echoing
  51.05 seconds

These runs did NOT have -o flags, but subsequent runs with -o were about the same. I added -o so I could run the as variations.

Now the same with .s. Note that due to the way the above worked, I just have *.s files, and not the usual *.is and *.ms.

as not wrapped with sh (F3)
  5.6 seconds
  5.28 seconds
  now remove echo
  5.08 seconds
  5.08 seconds
  5.04 seconds
  forgot -o flag, oh well, enough data

as wrapped with sh (F4)
  43 seconds
  43.56 seconds
  forgot -o flag, oh well, enough data

What is not yet confirmed is:
  1) Does cm3 wrap everything with sh?
  2) Does calling m3cg/as from cm3 have these costs?

Very clear: Wrapping stuff with sh on Cygwin is expensive!

Actions: Confirm this cost is being paid by cm3. Either:
  1) implement some "batch modes" in cm3 and/or cm3cg
  2) or maybe, um, just make sure that cm3 does not wrap with sh, and if cm3 itself causes this slowdown, because of how Cygwin works, try interposing a small Win32 helper app.

I think Cygwin handles running Win32 apps and being run from Win32 apps differently than Cygwin running Cygwin -- i.e. not slowly. I'll see. Could be that creating twice the number of processes, in order to avoid Cygwin running Cygwin, could be faster. Not yet known.

Maybe use system() instead of vfork() + exec()? Odd, though, vfork instead of fork is supposed to help.

Here is the test code, you edit it to run one case or another:

    @if not "%1" == "" goto :%1
    @rem \cygwin\bin\time cmd /c %~f0 F1
    @rem \cygwin\bin\time cmd /c %~f0 F2
    @rem \cygwin\bin\time cmd /c %~f0 F3
    @\cygwin\bin\time cmd /c %~f0 F4
    @goto :eof

    :F1
    @echo off
    @del *s *o
    for %%a in (*.ic *.mc) do cm3cg -quiet %%a -o %%a.s
    goto :eof

    :F2
    @echo off
    @del *s *o
    for %%a in (*.ic *.mc) do sh -c "cm3cg -quiet %%a -o %%a.s"
    goto :eof

    :F3
    @del *o
    @echo off
    for %%a in (*.s) do as %%a
    goto :eof

    :F4
    @del *o
    for %%a in (*.s) do sh -c "as %%a"
    goto :eof

 - Jay
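As a quick check on the "around 100 seconds to around 20 seconds" estimate, the timings listed in this message can be summarized as follows. The numbers are copied from the runs reported here; using the median of each set (to damp the first noisy runs) is my own choice, not something the message specifies, and the function name is hypothetical.

```python
from statistics import median

# Wall-clock seconds as reported in the message, one list per test case.
RUNS = {
    "F1": [28, 16, 13, 13.8, 14.01, 13.3, 13.64, 14.07, 13.22, 13.18],  # cm3cg direct
    "F2": [51, 51.35, 51.19, 50.88, 51.76, 51.05],                      # cm3cg via sh
    "F3": [5.6, 5.28, 5.08, 5.08, 5.04],                                # as direct
    "F4": [43, 43.56],                                                  # as via sh
}


def summarize(runs):
    """Reduce each case to its median and derive totals and sh overheads."""
    m = {name: median(times) for name, times in runs.items()}
    return {
        "direct_total": m["F1"] + m["F3"],    # cm3cg + as without sh
        "wrapped_total": m["F2"] + m["F4"],   # cm3cg + as, each wrapped in sh
        "cm3cg_overhead": m["F2"] - m["F1"],  # extra seconds attributable to sh
        "as_overhead": m["F4"] - m["F3"],
    }


if __name__ == "__main__":
    for key, value in summarize(RUNS).items():
        print(f"{key}: {value:.2f} s")
```

With these medians the totals come out at about 19 seconds direct versus about 94 seconds wrapped. The extra cost of sh is about 37 to 38 seconds in both the cm3cg and the as case, which would be consistent with a roughly fixed cost per spawned process, though the data here is too thin to establish that.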