<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
</head>
<body dir="auto">
<div>Jay,</div>
<div><br>
</div>
<div>If I'm following you correctly, you are saying that this bug happens when running on 64-bit Windows but not on 32-bit Windows, since it applies only to SysWOW64. </div>
<div><br>
</div>
<div>Since the thread test program also misbehaves on 32-bit Windows, there must be some other bug in the underlying implementation that affects both 32-bit and 64-bit platforms. </div>
<div><br>
</div>
<div>Thus, the SysWOW64 issue is a second, additional problem specific to 64-bit Windows. </div>
<div><br>
</div>
<div>Has anyone made any progress on the first, underlying problem that affects both 32-bit and 64-bit Windows?</div>
<div><br>
</div>
<div>--Randy</div>
<div><br>
On Jan 23, 2014, at 2:58 AM, "Jay K" &lt;<a href="mailto:jay.krell@cornell.edu">jay.krell@cornell.edu</a>&gt; wrote:<br>
<br>
</div>
<blockquote type="cite">
<div>
<div dir="ltr">There is a bug (or at least a surprising behavior) in WOW64. <br>
Bing for "wow64 GetThreadContext", "wow64 stack pointer", etc.<br>
<br>
<br>
<br>
SuspendThread / GetThreadContext work like this: <br>
<br>
<br>
On 64-bit Windows, a 32-bit process consists almost entirely of 32-bit code. <br>
It also contains a small amount of 64-bit code. <br>
<br>
<br>
If you suspend a thread while it is running 32-bit code, GetThreadContext works. <br>
If you suspend it while it is running 64-bit code, GetThreadContext usually, but not always, works. <br>
<br>
<br>
The 64-bit code is run en route to syscalls. <br>
For example, you call: <br>
1 kernel32!Sleep <br>
2 it calls the 32-bit NtDelayExecution <br>
3 that calls wow64NtDelayExecution (via a cross-segment "far" jmp or call) <br>
4 which calls the native NtDelayExecution <br>
<br>
<br>
Between steps 2 and 3, within the 64-bit code, the 32-bit context is saved <br>
to a place where GetThreadContext knows to find it. <br>
You can step through it very easily in a debugger. Really. <br>
The problem is that saving the context is not atomic. <br>
A thread can be suspended partway through the save, in which case GetThreadContext can return an inconsistent context (for example, a wrong stack pointer). <br>
<br>
<br>
What to do? <br>
<br>
<br>
scratch/wow64stack contains a program that detects the bug.<br>
I believe it is the basis of a workaround for the bug. <br>
<br>
<br>
The proposal is that in the compiler, for the I386_NT/NT386/I386_MINGWIN/I386_CYGWIN/I386_INTERIX platforms, <br>
not only functions that use exception handling but also functions that call "extern" functions <br>
would call GetActivation / SetActivation and therein save/set/restore the stack pointer. <br>
The garbage collector would use that saved value if it isn't zero. <br>
Normally it will be zero. <br>
Syscalls do nest: I can call SendMessage and, in my window proc, call CreateFile. <br>
That is why it is a save/set/restore pattern and not a set/set-zero pattern (like in the test program). <br>
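<br>
A minimal sketch of the save/set/restore pattern described above, in portable C rather than compiler-generated code. <br>
The names saved_sp, call_extern, gc_stack_bound, probe, inner and outer are hypothetical, not taken from the compiler or runtime, <br>
and the address of a local variable stands in for the real stack pointer. <br>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread slot. Zero means "not inside an extern call". */
static _Thread_local uintptr_t saved_sp = 0;

/* What a collector might do after suspending a thread: prefer the saved
   value when it is nonzero, otherwise fall back to the value reported by
   GetThreadContext (passed in here as context_sp). */
static uintptr_t gc_stack_bound(uintptr_t context_sp)
{
    return saved_sp != 0 ? saved_sp : context_sp;
}

typedef void (*extern_fn)(void *arg);

/* Wrap a call to an "extern" function: save, set, call, restore. */
static void call_extern(extern_fn fn, void *arg)
{
    volatile char marker = 0;            /* lives in this stack frame */
    uintptr_t previous = saved_sp;       /* save the outer value */

    assert(fn != NULL);
    saved_sp = (uintptr_t)&marker;       /* set: approximates the stack pointer */
    fn(arg);
    saved_sp = previous;                 /* restore the outer value, not zero */
}

/* Demonstration of nesting: outer makes a second wrapped call. */
struct probe {
    uintptr_t during_inner;
    uintptr_t after_inner;
};

static void inner(void *arg)
{
    struct probe *p = arg;
    p->during_inner = gc_stack_bound(0);
}

static void outer(void *arg)
{
    struct probe *p = arg;
    call_extern(inner, p);
    /* Still inside the outer extern call, so the slot must be nonzero. */
    p->after_inner = gc_stack_bound(0);
}
```

With a set/set-zero pattern, the slot would already be zero when the nested call returns into outer, <br>
so a collection at that point would fall back to the unreliable GetThreadContext value. <br>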
<br>
<br>
Downside: we would like to get rid of GetActivation / SetActivation eventually and instead reuse efficient C++ exception handling, whereas this proposal leans on them more. <br>
<br>
<br>
Rejected counter proposal: <br>
Don't suspend/gc when running a syscall. <br>
No -- you can Sleep() a while. You can/should-be-able-to suspend and GC during that.
<br>
<br>
<br>
Possible augmentation: if running natively (on 32-bit Windows rather than under WOW64), short-circuit most of this. <br>
<br>
<br>
Rejected counter proposal: Have I386_NT_NATIVE that doesn't do this stuff. Relegate
<br>
compatibility to I386_NT_COMPATIBLE. I don't like having more platforms/targets.
<br>
<br>
<br>
AMD64_NT wouldn't have this stuff. Nor would other hypothetical platforms like ARM32_NT, <br>
until/unless there is another 32-bit platform runnable on some similar 64-bit platform. <br>
<br>
<br>
Performance impact: hypothetically large but probably not noticeable. <br>
<br>
<br>
Further refinements: It isn't extern/native code per se, it is syscalls.<br>
We could further augment pragmas to discern them. <br>
<br>
<br>
We could leave it to writing "syscall wrappers" and informally (with no enforcement) ban direct calls <br>
to any functions that make syscalls. This is likely too hard to maintain and too unfriendly. <br>
It is pretty viable in m3core, but then it would also be needed in libm3, Trestle, the Qt wrappers, etc. <br>
<br>
<br>
<br>
Agreed? I'll make the compiler change? <br>
<br>
<br>
Oh, also, not just stack pointer, but other registers, at least non-volatiles?<br>
<br>
<br>
Eventually cooperative suspend will cause this to fall away as a problem.<br>
<br>
<br>
- Jay<br>
<br>
</div>
</div>
</blockquote>
</body>
</html>