<html>
<head>
<style><!--
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 10pt;
font-family:Tahoma
}
--></style>
</head>
<body class='hmmessage'>
We should look at what C does on the various targets.<BR>
NT already has a specific, simple mechanism/policy here that Modula-3<BR>
should follow but doesn't.<BR>
<BR>
<BR>
Specifically, C code on NT can overflow its stack, but the overflow is caught "right away",<BR>
rather than showing up later as random corruption of neighboring memory.<BR>
Hopefully C code on other targets is this good.<BR>
<BR>
<BR>
If not, well, the NT mechanism makes a lot of sense for all targets:<BR>
If a function has less than a page of locals, do nothing special.<BR>
If a function has more than a page of locals, touch each page, in order,<BR>
at the start of the function.<BR>
"Page" is hardware specific; however, I believe all of our targets have 4K or 8K pages,<BR>
and if you just hardcode 4K, that works, with a slight pessimization<BR>
on targets with 8K pages (e.g. I think Sparc and IA64, but I'd have to dig around).<BR>
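<BR>
To make the rule above concrete, here is a rough sketch in C. It is not NT's actual code.<BR>
The names probe_count and touch_pages are made up for illustration, the 4K page size is<BR>
hardcoded as suggested, and treating a frame of exactly one page as needing a touch is an assumption.<BR>
<pre>
```c
/* Sketch only: hypothetical helper names, 4K page hardcoded as an assumption. */
#define PROBE_PAGE_SIZE 4096UL

/* How many page touches a prologue needs for frame_bytes of locals:
   none below one page, otherwise one per (partial) page. */
unsigned long probe_count(unsigned long frame_bytes)
{
    if (frame_bytes < PROBE_PAGE_SIZE)
        return 0;
    return (frame_bytes + PROBE_PAGE_SIZE - 1) / PROBE_PAGE_SIZE;
}

/* Touch the region [buf, buf+len) one page at a time, from the high end
   down to the low end, the order in which a downward-growing stack
   approaches its guard page. Assumes the memory just above buf+len
   (the caller's frame) has already been touched.
   Returns the number of touches performed. */
unsigned long touch_pages(volatile char *buf, unsigned long len)
{
    unsigned long touched = 0;
    unsigned long off = len;

    while (off > PROBE_PAGE_SIZE) {
        off -= PROBE_PAGE_SIZE;
        buf[off] = 0;          /* no two touches are more than a page apart */
        touched++;
    }
    if (len > 0) {
        buf[0] = 0;            /* finally reach the lowest address */
        touched++;
    }
    return touched;
}
```
</pre>
A compiler would emit the equivalent of touch_pages in the function prologue, and only when<BR>
probe_count is nonzero, so small functions pay nothing.<BR>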
<BR>
<BR>
Actually, NT reserves about two pages at the end of the stack:<BR>
one to trigger the stack overflow exception, and one to give an exception handler<BR>
a little bit of room to deal with it, such as by capturing a dump of some sort<BR>
and exiting. If the exception handler uses more than its page, the process<BR>
is, I think, terminated.<BR>
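<BR>
As a toy model of the two-page layout described above (and only that, it does not claim to<BR>
match NT's real bookkeeping), the end of a downward-growing stack could be classified like this.<BR>
The zone names and the functions classify and usable_bytes are hypothetical.<BR>
<pre>
```c
/* Toy model only; zone names and layout are assumptions for illustration. */
enum stack_zone {
    ZONE_NORMAL,            /* ordinary stack, free to use */
    ZONE_OVERFLOW_TRIGGER,  /* touching this raises the overflow exception */
    ZONE_HANDLER_ROOM,      /* left over for the exception handler */
    ZONE_OUTSIDE            /* not part of this stack at all */
};

/* Classify a byte by its offset above the lowest address of a
   downward-growing stack of stack_bytes bytes with pages of page_bytes. */
enum stack_zone classify(unsigned long offset, unsigned long stack_bytes,
                         unsigned long page_bytes)
{
    if (offset >= stack_bytes)
        return ZONE_OUTSIDE;
    if (offset < page_bytes)
        return ZONE_HANDLER_ROOM;
    if (offset < 2 * page_bytes)
        return ZONE_OVERFLOW_TRIGGER;
    return ZONE_NORMAL;
}

/* Bytes available to ordinary code before the trigger page is reached. */
unsigned long usable_bytes(unsigned long stack_bytes, unsigned long page_bytes)
{
    if (stack_bytes < 2 * page_bytes)
        return 0;
    return stack_bytes - 2 * page_bytes;
}
```
</pre>
With a 1 MB stack and 4K pages, this model leaves 1040384 bytes for ordinary code.<BR>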
<BR>
<BR>
FURTHERMORE, this is a detail that SHOULD be handled by the existing gcc backend.<BR>
I don't know if it does, but it ought to.<BR>
<BR>
<BR>
But the NT backend does not, and should.<BR>
It's possible this problem has gotten worse with the 1K jmpbuf I put in.<BR>
(I'm very inclined to shrink that down, either to the precise size or to an approximate<BR>
but usually smaller size. m3core did have an assert at startup that the size was large enough.<BR>
Even so, we can't get the 128-bit alignment some targets prefer (powerpc) or require (hppa64).<BR>
Either way, the alloca solution is better, if done right.)<BR>
<BR>
<BR>
- Jay<BR> <BR>> Date: Sun, 20 Feb 2011 20:21:27 -0600<BR>> From: rodney_bates@lcwb.coop<BR>> To: m3devel@elegosoft.com<BR>> Subject: Re: [M3devel] SEGV mapping to RuntimeError<BR>> <BR>> <BR>> <BR>> On 02/20/2011 12:37 PM, Mika Nystrom wrote:<BR>> > On a 64-bit machine, at least, there ought to be enough virtual<BR>> > memory that you could just have a gap between thread stacks big<BR>> > enough to allow for a protection area larger than the largest possible<BR>> > (implementation-defined) activation record, no? I know I've run into<BR>> > trouble with very large activation records in the past (and not because<BR>> > I was running out of stack space, either).<BR>> ><BR>> > Or at least a procedure with a very large activation record (or<BR>> > a procedure calling it) could be required to call some sort of check<BR>> > routine "EnoughStackSpaceRemaining()" before starting to scribble<BR>> > on the activation record?<BR>> <BR>> Hmm, I like this idea. It would introduce normal-case runtime overhead<BR>> only for such procedures, and these are likely rare. Also, assuming the procedure<BR>> actually uses very much of its large AR, it should also have enough computation<BR>> time to wash out the stack check overhead.<BR>> <BR>> ><BR>> > Also the end of the activation record must be written to at least once,<BR>> > or else the memory protection won't be triggered.<BR>> ><BR>> <BR>> I was thinking (as an alternative mechanism) of having the compiler intentionally<BR>> add enough artificial write(s) as necessary to ensure storing within the<BR>> red zone, and not just beyond it. This seems trickier to get right and<BR>> harder to distinguish after the fact from a NIL dereference.<BR>> <BR>> > In any case if this is done properly the same mechanism I proposed for<BR>> > SIGSEGV ought to be able to catch stack overflow, no? Well, as long as<BR>> > signals are delivered on a separate stack. 
If signals are delivered on<BR>> > the same stack, the signal handler would get nastier, it would have to<BR>> > make space through some manipulations (maybe temporarily unporotecting<BR>> > the redzone page?) for its own purposes... but I don't see why it<BR>> > couldn't be done.<BR>> ><BR>> > Not sure why I'm getting SIGILL... maybe I am getting my signal handler<BR>> > activated inside the redzone page because of a difference in signal<BR>> > handling..? I remember reading something about sigaltstack...<BR>> ><BR>> > I would of course love to be able to recover from stack overflow, too.<BR>> > In some sense, since it's a generally unknown limit, it's even less of<BR>> > a fatal error than a NIL dereference (hence makes even more sense to<BR>> > catch it).<BR>> <BR>> I think this would be a nice mechanism to have available. It would have to<BR>> be used with some care. In any case, it would be really nice and more<BR>> frequently so, to at least have runtime error messages that distinguished<BR>> stack overflow from NIL deref.<BR>> <BR>> ><BR>> > Mika<BR>> ><BR>> > "Rodney M. Bates" writes:<BR>> >> I am pretty sure the cases I've seen are SIGSEGV on LINUXLIBC6 and AMD64_LINUX.<BR>> >> Probably a fully protected guard page at the end of the stack. This technique<BR>> >> always worries me a bit because a procedure with a really big activation record<BR>> >> could jump right past it. Probably it would almost always access the first page<BR>> >> of the big area before storing anything into later pages.<BR>> >><BR>> >> On 02/19/2011 05:27 PM, Mika Nystrom wrote:<BR>> >>> Ah, yes, stack protection.<BR>> >>><BR>> >>> Do you know if it's a SIGSEGV, not a SIGBUS? 
I know I have seen SIGILL on Macs.<BR>> >>><BR>> >>> Hmm, I get SIGILL on AMD64_FREEBSD as well:<BR>> >>><BR>> >>> time ../AMD64_FREEBSD/stubexample<BR>> >>> M-Scheme Experimental<BR>> >>> LITHP ITH LITHENING.<BR>> >>>> (define (f a) (+ (f (+ a 1)) (f (+ a 2))))<BR>> >>> f<BR>> >>>> (f 0)<BR>> >>> Illegal instruction<BR>> >>> 3.847u 0.368s 0:13.32 31.5% 2160+284478k 0+0io 0pf+0w<BR>> >>><BR>> >>> What absolutely must not happen, of course, is that the runtime hangs<BR>> >>> while executing only safe code...<BR>> >>><BR>> >>> Mika<BR>> >>><BR>> >>> "Rodney M. Bates" writes:<BR>> >>>> I know of one other place the compilers rely on hardware memory protection<BR>> >>>> to detect a checked runtime error, and that is stack overflow. This won't<BR>> >>>> corrupt anything, but is hard to distinguish from dereferencing NIL.<BR>> >>>> This could probably be distinguished after the fact by some low-level,<BR>> >>>> target-dependent code. I have found it by looking at assembly code at<BR>> >>>> the point of failure--usually right after a stack pointer push.<BR>> >>>><BR>> >>>> Detecting this via compiler-generated checks would probably be more<BR>> >>>> extravagant than many other checks, as it is so frequent. I am not<BR>> >>>> aware of any really good solution to this in any implementation of any<BR>> >>>> language.<BR>> >>>><BR>> >>>> On 02/19/2011 02:38 PM, Mika Nystrom wrote:<BR>> >>>>> Jay, sometimes I wonder about you: this is a Modula-3 mailing list,<BR>> >>>>> you know!<BR>> >>>>><BR>> >>>>> "Corrupting the heap" is something that can only happen as a result of<BR>> >>>>> an unchecked runtime error. Unchecked runtime errors cannot happen in<BR>> >>>>> modules not marked UNSAFE.<BR>> >>>>><BR>> >>>>> SEGV is, however, used by the CM3 implementation (and its predecessors)<BR>> >>>>> to signal a certain kind of *checked* runtime error, namely, the<BR>> >>>>> dereferencing of a NIL reference. 
Correct me if I am wrong, but an<BR>> >>>>> attempt to dereference NIL is not going to leave the heap corrupted?<BR>> >>>>><BR>> >>>>> And if you stick to safe code, the only SEGVs I think you get in the<BR>> >>>>> current CM3 are ones from NIL dereferences.<BR>> >>>>><BR>> >>>>> Hence, as long as you stick with safe code, the only time the code I<BR>> >>>>> checked in earlier gets triggered is for NIL dereferences, which should<BR>> >>>>> never corrupt the heap. So SEGV is not sometimes, but in fact always<BR>> >>>>> recoverable.<BR>> >>>>><BR>> >>>>> :-)<BR>> >>>>><BR>> >>>>> Mika<BR>> >>>>><BR>> >>>>> P.S. the bit above "if you stick to safe code": if you actually program in<BR>> >>>>> Modula-3 you almost never use UNSAFE. I went through my repository and<BR>> >>>>> I have 40 modules using UNSAFE out of a total of 4,559. Furthermore,<BR>> >>>>> many of the UNSAFE modules are glue code to Fortran routines, which<BR>> >>>>> could relatively easily be verified to be safe in the Modula-3 sense.<BR>> >>>>> Almost all what remains is glue to some C library, which wouldn't be<BR>> >>>>> necessary if the rest of the world would wake up out of the dark ages, but<BR>> >>>>> I don't have the time to rewrite every single library from scratch myself.<BR>> >>>>><BR>> >>>>><BR>> >>>>> Jay K writes:<BR>> >>>>>> --_a2a24b92-3b4c-456e-ab1b-c3f5e912854f_<BR>> >>>>>> Content-Type: text/plain; charset="iso-8859-1"<BR>> >>>>>> Content-Transfer-Encoding: quoted-printable<BR>> >>>>>><BR>> >>>>>><BR>> >>>>>> Letting any code run after a SIGSEGV is dubious.<BR>> >>>>>> Imagine the heap is corrupted.<BR>> >>>>>> And then you run more code.<BR>> >>>>>> And the code happens to call malloc.<BR>> >>>>>> Or printf to log something.<BR>> >>>>>> =20<BR>> >>>>>> I suppose there might be an application that maps memory<BR>> >>>>>> gradually=2C as pieces of a buffer are hit. 
Might.<BR>> >>>>>><BR>> >>>>>> - Jay<BR>> >>>>>><BR>> >>>>>>> To: m3devel@elegosoft.com<BR>> >>>>>>> Date: Sat, 19 Feb 2011 10:29:30 -0800<BR>> >>>>>>> From: mika@async.caltech.edu<BR>> >>>>>>> Subject: [M3devel] SEGV mapping to RuntimeError<BR>> >>>>>>><BR>> >>>>>>><BR>> >>>>>>> Dear m3devel,<BR>> >>>>>>><BR>> >>>>>>> For a while it has annoyed me that segmentation violations cause an<BR>> >>>>>>> unconditional program abort. I've changed that now so that (under user<BR>> >>>>>>> threads at least) we instead get a RuntimeError. Here's an example of<BR>> >>>>>>> the mechanism at work in an interactive Scheme environment. Consider<BR>> >>>>>>> the unhelpful interface and module Crash:<BR>> >>>>>>><BR>> >>>>>>> INTERFACE Crash; PROCEDURE Me(); END Crash.<BR>> >>>>>>><BR>> >>>>>>> MODULE Crash;<BR>> >>>>>>><BR>> >>>>>>> PROCEDURE Me() =<BR>> >>>>>>> VAR ptr : REF INTEGER := NIL; BEGIN<BR>> >>>>>>> ptr^ := 0<BR>> >>>>>>> END Me;<BR>> >>>>>>><BR>> >>>>>>> BEGIN END Crash.<BR>> >>>>>>><BR>> >>>>>>> Here's an example of what happens if you now call this from an interactive<BR>> >>>>>>> interpreter that catches the exception RuntimeError.E:<BR>> >>>>>>><BR>> >>>>>>> M-Scheme Experimental<BR>> >>>>>>> LITHP ITH LITHENING.<BR>> >>>>>>>> (require-modules "m3")<BR>> >>>>>>> #t<BR>> >>>>>>>> (Crash.Me)<BR>> >>>>>>> EXCEPTION! RuntimeError! 
Attempt to reference an illegal memory location.<BR>> >>>>>>>> (+ 3 4)<BR>> >>>>>>> 7<BR>> >>>>>>>><BR>> >>>>>>><BR>> >>>>>>> I just realized I may have broken pthreads, let me go back and double-check it.<BR>> >>>>>>> runtime/POSIX and thread/POSIX don't refer to the same thing do they...<BR>> >>>>>>><BR>> >>>>>>> Mika<BR>> >>>>>>><BR>> >>>>>><BR>> >>>>>> --_a2a24b92-3b4c-456e-ab1b-c3f5e912854f_--<BR>> >>>>><BR>> >>><BR>> ><BR> </body>
</html>