[M3devel] SEGV mapping to RuntimeError

Jay K jay.krell at cornell.edu
Mon Feb 21 13:05:33 CET 2011


Hm. Maybe we just need to pass -fstack-check to the backend.
Needs experimentation...

 - Jay


From: jay.krell at cornell.edu
To: mika at async.caltech.edu; rodney_bates at lcwb.coop
Date: Mon, 21 Feb 2011 12:00:14 +0000
CC: m3devel at elegosoft.com
Subject: Re: [M3devel] SEGV mapping to RuntimeError

Ok, I'm disappointed to report that I did some quick checking on Darwin/x86, Linux/x86, Solaris/sparc32/x86
and none of them seem to be smart about this.
Er, unless they were being really smart.

/* Locals under 4K. */
int F1(void)
{
    volatile char a[4000];
    return a[3999];
}

/* Locals over 4K. */
int F2(void)
{
    volatile char a[6000];
    return a[5999];
}

/* Locals far over 4K. */
int F4(void)
{
    volatile char a[60000];
    return a[59999];
}

I guess I should redo all the tests but accessing [0] also/instead.

So, yes, we should definitely consider doing something.
However...however, the frontend doesn't know which locals will be optimized away.
The backend is set up to do this...but it is only used for NT systems, darn.


jbook2:gcc jay$ grep CHECK_STACK_LIMIT */*/*/*
gcc/config/i386/cygming.h:#define CHECK_STACK_LIMIT 4000
gcc/config/i386/cygwin.asm:   than CHECK_STACK_LIMIT bytes in one go.  Touching the stack at 4K
gcc/config/i386/i386-interix.h:#define CHECK_STACK_LIMIT 0x1000
gcc/config/i386/i386.c:#ifndef CHECK_STACK_LIMIT
gcc/config/i386/i386.c:#define CHECK_STACK_LIMIT (-1)
gcc/config/i386/i386.c:      && (! TARGET_STACK_PROBE || allocate < CHECK_STACK_LIMIT))
gcc/config/i386/i386.c:  else if (! TARGET_STACK_PROBE || allocate < CHECK_STACK_LIMIT)
gcc/config/i386/i386.c:         && (! TARGET_STACK_PROBE || allocate < CHECK_STACK_LIMIT)))
gcc/config/i386/i386.md:#ifndef CHECK_STACK_LIMIT
gcc/config/i386/i386.md:#define CHECK_STACK_LIMIT 0
gcc/config/i386/i386.md:  if (CHECK_STACK_LIMIT && CONST_INT_P (operands[1])
gcc/config/i386/i386.md:      && INTVAL (operands[1]) < CHECK_STACK_LIMIT)
jbook2:gcc jay$ grep CHECK_STACK_LIMIT */*/*/*/*
jbook2:gcc jay$ grep TARGET_STACK_PROBE *
jbook2:gcc jay$ grep TARGET_STACK_PROBE */*
gcc/ChangeLog:    * config/i386/i386.c (ix86_expand_prologue) [TARGET_STACK_PROBE]:
gcc/ChangeLog-2005:    (TARGET_TLS_DIRECT_SEG_REFS, TARGET_STACK_PROBE)
jbook2:gcc jay$ grep TARGET_STACK_PROBE */*/*
jbook2:gcc jay$ grep TARGET_STACK_PROBE */*/*/*
gcc/config/i386/i386.c:      && (! TARGET_STACK_PROBE || allocate < CHECK_STACK_LIMIT))
gcc/config/i386/i386.c:  else if (! TARGET_STACK_PROBE || allocate < CHECK_STACK_LIMIT)
gcc/config/i386/i386.c:         && (! TARGET_STACK_PROBE || allocate < CHECK_STACK_LIMIT)))
gcc/config/i386/i386.md:  "!TARGET_64BIT && TARGET_STACK_PROBE"
gcc/config/i386/i386.md:  "TARGET_64BIT && TARGET_STACK_PROBE"
gcc/config/i386/i386.md:  "TARGET_STACK_PROBE"

 - Jay

From: jay.krell at cornell.edu
To: mika at async.caltech.edu; rodney_bates at lcwb.coop
Date: Sun, 20 Feb 2011 22:38:19 +0000
CC: m3devel at elegosoft.com
Subject: Re: [M3devel] SEGV mapping to RuntimeError

Probably for this reason:
There is a requirement on NT that stack pages be touched in order.
Functions with locals totaling more than 4K call _chkstk (aka _alloca) to allocate
their stack, instead of the usual register subtraction. It contains a loop that touches a byte every 4K.
Otherwise, if you have lots of functions with small frames, the stack is touched
by virtue of the call instruction pushing the return address.
Modula-3 has long failed to uphold this contract, and still does.
I've never seen it clearly documented, but you can see it is what the C compiler does.
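
The in-order touching described above can be sketched in C. This is only an illustration of the order of the touches, not the real _chkstk: the helper name probe_pages_in_order and its parameters are made up here, and a real probe has to run in the function prologue while the stack pointer is being moved, which is why it lives in the backend or in a runtime helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not the real _chkstk: touch one byte per page of
   buf[0..size), starting near the highest address and walking down, the
   order in which a downward-growing stack gets extended.
   Returns the number of bytes touched. */
static size_t probe_pages_in_order(volatile char *buf, size_t size, size_t page)
{
    size_t probes = 0;
    size_t offset = size;

    assert(page > 0);
    while (offset > 0) {
        /* Step down by one page, or to the start of the buffer. */
        offset = (offset > page) ? offset - page : 0;
        buf[offset] = 0;    /* the write itself is the probe */
        probes++;
    }
    return probes;
}
```

For a 60000-byte frame and 4096-byte pages this makes 15 touches, each at most one page below the previous one, so no guard page can be stepped over.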


I had thought there were other reasons for this behavior.
I thought you actually get an exception if the stack is touched out of order ("the first time").
But I think I did an experiment long ago with Modula-3 and there was no exception.


I don't know about other platforms.

I've also seen evidence in gcc and/or Target.i3 that compiler writers know well about
such mechanisms -- there being a constant to set as to what size locals trigger
special behavior. But m3back doesn't do anything here.

Probably we could use _chkstk unconditionally as well -- need to see what it does
for numbers smaller than 4K. But that'd be a deoptimization in the common case,
and deoptimizing only as necessary should be easy enough.


 - Jay


> To: rodney_bates at lcwb.coop
> Date: Sun, 20 Feb 2011 10:37:46 -0800
> From: mika at async.caltech.edu
> CC: m3devel at elegosoft.com
> Subject: Re: [M3devel] SEGV mapping to RuntimeError
> 
> 
> On a 64-bit machine, at least, there ought to be enough virtual
> memory that you could just have a gap between thread stacks big
> enough to allow for a protection area larger than the largest possible
> (implementation-defined) activation record, no?  I know I've run into
> trouble with very large activation records in the past (and not because
> I was running out of stack space, either).
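
A minimal sketch of the "big gap" idea, assuming a POSIX system with mmap and mprotect and a stack that grows downward. The helper name alloc_stack_with_guard is hypothetical and both sizes are assumed to be multiples of the page size. This is not how the CM3 runtime allocates thread stacks.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical helper: reserve guard_bytes of inaccessible address space
   directly below stack_bytes of usable stack. Returns the lowest usable
   address, or NULL on failure. */
static void *alloc_stack_with_guard(size_t stack_bytes, size_t guard_bytes)
{
    char *base = mmap(NULL, guard_bytes + stack_bytes, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Open up only the upper part. The lower guard_bytes stay PROT_NONE,
       so any access that runs off the bottom of the stack faults as long
       as it lands within guard_bytes of it. */
    if (mprotect(base + guard_bytes, stack_bytes,
                 PROT_READ | PROT_WRITE) != 0) {
        munmap(base, guard_bytes + stack_bytes);
        return NULL;
    }
    return base + guard_bytes;
}
```

The guard only helps if guard_bytes is at least as large as the largest activation record the implementation allows, which is the condition stated above.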
> 
> Or at least a procedure with a very large activation record (or
> a procedure calling it) could be required to call some sort of check
> routine "EnoughStackSpaceRemaining()" before starting to scribble
> on the activation record?
> 
> Also the end of the activation record must be written to at least once,
> or else the memory protection won't be triggered.  
> 
> In any case if this is done properly the same mechanism I proposed for
> SIGSEGV ought to be able to catch stack overflow, no?  Well, as long as
> signals are delivered on a separate stack.  If signals are delivered on
> the same stack, the signal handler would get nastier, it would have to
> make space through some manipulations (maybe temporarily unprotecting
> the redzone page?) for its own purposes... but I don't see why it
> couldn't be done.
> 
> Not sure why I'm getting SIGILL... maybe I am getting my signal handler
> activated inside the redzone page because of a difference in signal
> handling..?  I remember reading something about sigaltstack...
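
A sketch of the separate-signal-stack approach, assuming POSIX sigaltstack and sigaction with SA_ONSTACK. All function names here are hypothetical and this is not the code that was checked in. It jumps out of the handler with siglongjmp, which is common practice for this purpose but only reasonable if the interrupted code holds no locks or half-updated state.

```c
#define _XOPEN_SOURCE 700
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recover_point;
static char alt_stack[64 * 1024];   /* fixed size chosen for the sketch */

/* Runs on alt_stack, so it works even when the normal stack is full. */
static void on_fault(int signo)
{
    (void)signo;
    siglongjmp(recover_point, 1);
}

static int install_overflow_handler(void)
{
    stack_t ss;
    struct sigaction sa;

    memset(&ss, 0, sizeof ss);
    ss.ss_sp = alt_stack;
    ss.ss_size = sizeof alt_stack;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sa.sa_flags = SA_ONSTACK;       /* deliver on the alternate stack */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGBUS, &sa, NULL) != 0)   /* some systems report SIGBUS */
        return -1;
    return 0;
}

/* Deliberately uses far more stack than any normal limit allows. */
static int consume_stack(long frames)
{
    volatile char pad[1024];

    if (frames == 0)
        return 0;
    pad[0] = (char)frames;
    return consume_stack(frames - 1) + pad[0];
}

/* Returns 1 if the overflow was caught and execution resumed here. */
static int survives_stack_overflow(void)
{
    if (install_overflow_handler() != 0)
        return 0;
    if (sigsetjmp(recover_point, 1) == 0) {
        consume_stack(1L << 30);
        return 0;                   /* the stack did not overflow */
    }
    return 1;
}
```

Without SA_ONSTACK the kernel would have to build the signal frame on the already exhausted stack, and the process is typically killed instead of entering the handler.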
> 
> I would of course love to be able to recover from stack overflow, too.
> In some sense, since it's a generally unknown limit, it's even less of
> a fatal error than a NIL dereference (hence makes even more sense to
> catch it).
> 
>      Mika
> 
> "Rodney M. Bates" writes:
> >I am pretty sure the cases I've seen are SIGSEGV on LINUXLIBC6 and AMD64_LINUX.
> >Probably a fully protected guard page at the end of the stack.  This technique
> >always worries me a bit because a procedure with a really big activation record
> >could jump right past it.  Probably it would almost always access the first page
> >of the big area before storing anything into later pages.
> >
> >On 02/19/2011 05:27 PM, Mika Nystrom wrote:
> >> Ah, yes, stack protection.
> >>
> >> Do you know if it's a SIGSEGV, not a SIGBUS?  I know I have seen SIGILL on Macs.
> >>
> >> Hmm, I get SIGILL on AMD64_FREEBSD as well:
> >>
> >> time ../AMD64_FREEBSD/stubexample
> >> M-Scheme Experimental
> >> LITHP ITH LITHENING.
> >>> (define (f a) (+ (f (+ a 1)) (f (+ a 2))))
> >> f
> >>> (f 0)
> >> Illegal instruction
> >> 3.847u 0.368s 0:13.32 31.5%     2160+284478k 0+0io 0pf+0w
> >>
> >> What absolutely must not happen, of course, is that the runtime hangs
> >> while executing only safe code...
> >>
> >>      Mika
> >>
> >> "Rodney M. Bates" writes:
> >>> I know of one other place the compilers rely on hardware memory protection
> >>> to detect a checked runtime error, and that is stack overflow.  This won't
> >>> corrupt anything, but is hard to distinguish from dereferencing NIL.
> >>> This could probably be distinguished after the fact by some low-level,
> >>> target-dependent code.  I have found it by looking at assembly code at
> >>> the point of failure--usually right after a stack pointer push.
> >>>
> >>> Detecting this via compiler-generated checks would probably be more
> >>> extravagant than many other checks, as it is so frequent.  I am not
> >>> aware of any really good solution to this in any implementation of any
> >>> language.
> >>>
> >>> On 02/19/2011 02:38 PM, Mika Nystrom wrote:
> >>>> Jay, sometimes I wonder about you: this is a Modula-3 mailing list,
> >>>> you know!
> >>>>
> >>>> "Corrupting the heap" is something that can only happen as a result of
> >>>> an unchecked runtime error.  Unchecked runtime errors cannot happen in
> >>>> modules not marked UNSAFE.
> >>>>
> >>>> SEGV is, however, used by the CM3 implementation (and its predecessors)
> >>>> to signal a certain kind of *checked* runtime error, namely, the
> >>>> dereferencing of a NIL reference.  Correct me if I am wrong, but an
> >>>> attempt to dereference NIL is not going to leave the heap corrupted?
> >>>>
> >>>> And if you stick to safe code, the only SEGVs I think you get in the
> >>>> current CM3 are ones from NIL dereferences.
> >>>>
> >>>> Hence, as long as you stick with safe code, the only time the code I
> >>>> checked in earlier gets triggered is for NIL dereferences, which should
> >>>> never corrupt the heap.  So SEGV is not sometimes, but in fact always
> >>>> recoverable.
> >>>>
> >>>> :-)
> >>>>
> >>>>       Mika
> >>>>
> >>>> P.S. the bit above "if you stick to safe code": if you actually program in
> >>>> Modula-3 you almost never use UNSAFE.  I went through my repository and
> >>>> I have 40 modules using UNSAFE out of a total of 4,559.  Furthermore,
> >>>> many of the UNSAFE modules are glue code to Fortran routines, which
> >>>> could relatively easily be verified to be safe in the Modula-3 sense.
> >>>> Almost all of what remains is glue to some C library, which wouldn't be
> >>>> necessary if the rest of the world would wake up out of the dark ages, but
> >>>> I don't have the time to rewrite every single library from scratch myself.
> >>>>
> >>>>
> >>>> Jay K writes:
> >>>>>
> >>>>> Letting any code run after a SIGSEGV is dubious.
> >>>>> Imagine the heap is corrupted.
> >>>>> And then you run more code.
> >>>>> And the code happens to call malloc.
> >>>>> Or printf to log something.
> >>>>>
> >>>>> I suppose there might be an application that maps memory
> >>>>> gradually, as pieces of a buffer are hit. Might.
> >>>>>
> >>>>> - Jay
> >>>>>
> >>>>>> To: m3devel at elegosoft.com
> >>>>>> Date: Sat, 19 Feb 2011 10:29:30 -0800
> >>>>>> From: mika at async.caltech.edu
> >>>>>> Subject: [M3devel] SEGV mapping to RuntimeError
> >>>>>>
> >>>>>>
> >>>>>> Dear m3devel,
> >>>>>>
> >>>>>> For a while it has annoyed me that segmentation violations cause an
> >>>>>> unconditional program abort. I've changed that now so that (under user
> >>>>>> threads at least) we instead get a RuntimeError. Here's an example of
> >>>>>> the mechanism at work in an interactive Scheme environment. Consider
> >>>>>> the unhelpful interface and module Crash:
> >>>>>>
> >>>>>> INTERFACE Crash; PROCEDURE Me(); END Crash.
> >>>>>>
> >>>>>> MODULE Crash;
> >>>>>>
> >>>>>> PROCEDURE Me() =
> >>>>>> VAR ptr : REF INTEGER := NIL; BEGIN
> >>>>>> ptr^ := 0
> >>>>>> END Me;
> >>>>>>
> >>>>>> BEGIN END Crash.
> >>>>>>
> >>>>>> Here's an example of what happens if you now call this from an interactive
> >>>>>> interpreter that catches the exception RuntimeError.E:
> >>>>>>
> >>>>>> M-Scheme Experimental
> >>>>>> LITHP ITH LITHENING.
> >>>>>>> (require-modules "m3")
> >>>>>> #t
> >>>>>>> (Crash.Me)
> >>>>>> EXCEPTION! RuntimeError! Attempt to reference an illegal memory location.
> >>>>>>> (+ 3 4)
> >>>>>> 7
> >>>>>>>
> >>>>>>
> >>>>>> I just realized I may have broken pthreads, let me go back and double-check it.
> >>>>>> runtime/POSIX and thread/POSIX don't refer to the same thing do they...
> >>>>>>
> >>>>>> Mika
> >>>>>>
> >>>>>
> >>>>
> >>
 		 	   		  