<html>
<head>
<style><!--
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 10pt;
font-family:Verdana
}
--></style>
</head>
<body class='hmmessage'>
Plus, I think I narrowed the problem down to a 30-minute window, not just a day.<BR>
I built snapshots from around 2:00 and 2:30 on the day of the heap/lock change.<BR>
But granted, it might only be revealing some other problem that was always there, or that was introduced recently or long ago...<BR>
<BR>
- Jay<BR><BR> <BR>
<HR id=stopSpelling>
From: jay.krell@cornell.edu<BR>To: hosking@cs.purdue.edu<BR>CC: m3devel@elegosoft.com<BR>Subject: RE: [M3devel] CM3 5.8 Release Engineering, was Re: back again -- cm3 status worse?<BR>Date: Wed, 23 Sep 2009 03:48:59 +0000<BR><BR>
I'm "certain" these are ok but I can try without them.<BR>One just changes the command line parameters to rc to a form that works with more toolsets. Rc probably isn't even used with Juno at all. Just put error() in the file to test it.<BR> <BR> <BR>The other passes a struct by pointer instead of by value, through a C translation layer, because if you use the gcc backend, which nobody does, it names the functions wrong for the struct by value case. (gcc gets it right when compiling C).<BR> <BR> <BR>You still aren't understanding me.<BR> <BR>We have a consistent failure before Feb 20, but it is deemed maybe ok.<BR> It was maybe always that way. It is maybe unfinished code. Not heap corruption.<BR> Though we don't know 100% and it does merit some investigation.<BR> <BR>After Feb 20 without @M3nogc we have a "more severe" and actually fairly consistent but not completely consistent failure -- heap corruption.<BR> <BR>After Feb 20 with @M3nogc acts the same as before Feb 20 without @M3nogc.<BR> <BR> <BR> - Jay<BR> <BR>> From: hosking@cs.purdue.edu<BR>> To: hosking@cs.purdue.edu<BR>> Date: Tue, 22 Sep 2009 22:46:30 -0400<BR>> CC: m3devel@elegosoft.com; jay.krell@cornell.edu<BR>> Subject: Re: [M3devel] CM3 5.8 Release Engineering, was Re: back again -- cm3 status worse?<BR>> <BR>> What about these?<BR>> They appear to be Trestle and icon-related...<BR>> 2009-02-18 11:14 jkrell<BR>> <BR>> * m3-libs/m3core/src/win32/WinUser.i3,<BR>> m3-libs/m3core/src/win32/WinUserC.c,<BR>> m3-libs/m3core/src/win32/m3makefile, m3-ui/ui/src/winvbt/ <BR>> WinTrestle.m3:<BR>> <BR>> workaround gcc backend bug that names<BR>> <BR>> <*EXTERNAL WindowFromPoint:WINAPI*><BR>> PROCEDURE WindowFromPoint (Point: POINT): HWND;<BR>> <BR>> WindowFromPoint@4 instead of WindowFromPoint@8<BR>> <BR>> by adding<BR>> <BR>> <*EXTERNAL WinUser__WindowFromPointWorkaround:WINAPI*><BR>> PROCEDURE WindowFromPointWorkaround (VAR Point: POINT): HWND;<BR>> <BR>> HWND __stdcall WinUser__WindowFromPointWorkaround (POINT* 
Point)<BR>> {<BR>> return WindowFromPoint(*Point);<BR>> }<BR>> <BR>> This lets I386_MINGW (NT386MINGNU) get further.<BR>> <BR>> 2009-02-18 10:51 jkrell<BR>> <BR>> * m3-sys/windowsResources/src/winRes.tmpl:<BR>> <BR>> adapt to MinGW which has windres instead of rc with different <BR>> command line usage; detect MinGW by checking if backend mode is <BR>> integrated backend or not, not great..it should really be informed by <BR>> a variable in the toplevel configuration -- CONFIG_HAS_RC and <BR>> CONFIG_HAS_WINDRES?<BR>> <BR>> <BR>> On 22 Sep 2009, at 22:25, Tony Hosking wrote:<BR>> <BR>> > On 22 Sep 2009, at 21:51, Jay K wrote:<BR>> ><BR>> >> Tony there is something a bit gray that you are missing.<BR>> ><BR>> > Yes, clearly I am missing something.<BR>> ><BR>> >> The behavior with @M3nogc we don't necessarily consider bad/wrong/ <BR>> >> buggy.<BR>> ><BR>> > Right, it just takes GC out of the equation for what might be wrong.<BR>> ><BR>> >> It is a consistent assertion failure. Not an access violation.<BR>> ><BR>> > Good. We can debug that.<BR>> ><BR>> >> It could just be Trestle not being fully supported on Windows.<BR>> >> Olaf says Trestle was never fully ported.<BR>> ><BR>> > I don't know enough about this to say either way.<BR>> ><BR>> >> I'm not sure anyone knows what is missing, and if Juno really <BR>> >> demonstrates that or not.<BR>> >><BR>> >> However, versions before Feb 20 consistently act like current <BR>> >> versions act with @M3nogc.<BR>> >> Before Feb 20 without @M3nogc.<BR>> >> Current with @M3nogc.<BR>> ><BR>> > What does this mean? That pre-2009-02 is just the same as <BR>> > post-2009-02? How does that narrow anything down to that specific <BR>> > time-frame?<BR>> ><BR>> >> What I'd like to see is current without @M3nogc to act just as bad <BR>> >> but no worse than before Feb 20. I think the current behavior <BR>> >> without @M3nogc is worse. It's just "fail vs. 
no fail".<BR>> ><BR>> > I still don't understand what this says about that particular time- <BR>> > frame.<BR>> ><BR>> >> Now, that is apples and oranges. For example, I relatively <BR>> >> recently changed the default initial allocation size and maybe <BR>> >> incremental allocation sizes. In particular..I forget the exact <BR>> >> details but I think changed from malloc to VirtualAlloc, and <BR>> >> VirtualAlloc allocates in 64K chunks. I guess I should review <BR>> >> that..but that was more recent I think, after Feb 20. I have to <BR>> >> check.<BR>> >> The code was a bit flawed somehow and I improved it somehow. I <BR>> >> forget. Almost everything is subject to rerererereview when there <BR>> >> is a bug, granted.<BR>> >><BR>> >><BR>> >> I agree as well that Feb 20 might have just uncovered a preexisting <BR>> >> problem.<BR>> >><BR>> >><BR>> >> But much is unclear and figuring this out I don't think will be <BR>> >> easy. :(<BR>> ><BR>> > If we have a deterministic failure then it should be easy enough to <BR>> > track down.<BR>> ><BR>> >><BR>> >><BR>> >> - Jay<BR>> >><BR>> >><BR>> >><BR>> >> From: hosking@cs.purdue.edu<BR>> >> To: jay.krell@cornell.edu<BR>> >> Date: Tue, 22 Sep 2009 21:40:27 -0400<BR>> >> CC: m3devel@elegosoft.com<BR>> >> Subject: Re: [M3devel] CM3 5.8 Release Engineering, was Re: back <BR>> >> again -- cm3 status worse?<BR>> >><BR>> >><BR>> >> On 22 Sep 2009, at 08:16, Jay K wrote:<BR>> >><BR>> >> Yes there is fairly definitely a problem on Windows and it dates, I <BR>> >> think, to this change:<BR>> >><BR>> >><BR>> >> 2009-02-16 02:20 hosking<BR>> >> * m3-libs/m3core/src/: Csupport/VAX/dtoa.c, Csupport/big-endian/ <BR>> >> dtoa.c,<BR>> >> Csupport/little-endian/dtoa.c, convert/CConvert.i3,<BR>> >> convert/CConvert.m3, runtime/I386_DARWIN/RTThread.m3,<BR>> >> runtime/common/RTCollector.m3, runtime/common/RTHeapRep.i3,<BR>> >> runtime/common/RTOS.i3, thread/POSIX/ThreadPosix.m3,<BR>> >> thread/PTHREAD/ThreadF.i3, 
thread/PTHREAD/ThreadPThread.m3,<BR>> >> thread/PTHREAD/ThreadPThreadC.c, thread/PTHREAD/ <BR>> >> ThreadPThreadC.i3,<BR>> >> thread/WIN32/ThreadWin32.m3:<BR>> >> Clean up RTOS.LockHeap/RTOS.UnlockHeap implementations to better <BR>> >> match underlying pthread semantics.<BR>> >> This means that RTOS.WaitHeap must be called while RTOS.LockHeap <BR>> >> is held.<BR>> >> RTOS.BroadcastHeap can be called whether RTOS.LockHeap is held or <BR>> >> not.<BR>> >><BR>> >> I'm not convinced that this change itself broke things, but perhaps <BR>> >> instead exposed the brokenness. In any case, debugging this in the <BR>> >> head will probably be easiest. If we have an example that <BR>> >> deterministically breaks then I think we have a place to start. My <BR>> >> suggestion for now, since it appears to trigger the problem, is to <BR>> >> use @M3nogc.<BR>> >><BR>> >><BR>> ><BR>> <BR> </body>
</html>