[M3devel] Target.First_readable_addr

Jay K jay.krell at cornell.edu
Fri Aug 28 02:40:32 CEST 2015


That is done -- it is a constant 4k now.
Hypothetically Sparc/IA64/Alpha could be 8k. Sparc used to be.
 
 
I'm still leery of the omitted checks and suggest we compare
against C# (JIT and native, Microsoft and Mono), Java, Go, Rust, Gnat, etc.
If most/all of them check aggressively, we should too, imho.
If they actually omit more aggressively, which I believe is possible, we should consider that.
 
In particular, I think the omission should be based on the offset+size of the access,
not of the containing type. i.e. the field and not the record.
However this might be a fallacy also.
 
e.g.
 
TYPE BigRecord = RECORD ...lots...; a: INTEGER; (* small field past 4K *) END;
 
PROCEDURE F1(VAR a: INTEGER);
 
PROCEDURE F2(b: REF BigRecord) =
BEGIN
  F1(b.a);
END F2;
 
Hm. I see. Maybe you have to check at b.a and a (not shown here)?
The recipient of a "small" type doesn't know its offset, if any, within a larger type,
and it can vary.
 
I can have:
TYPE SmallRecord = RECORD a: INTEGER; (* small field at start, nothing before it *) END;
 
F1(smallRecord.a);
 
So maybe you are right anyway -- check needed at the "address generation" not just the deref.
 
I need to study this more.
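
To make the two candidate rules concrete, here is a minimal sketch. It is not
the actual m3front logic. The names FirstReadableAddr, CheckByContainerSize and
CheckByAccess are hypothetical, and all quantities are assumed to be in bytes.

```modula3
MODULE Main;

IMPORT IO, Fmt;

CONST
  FirstReadableAddr = 4096; (* bytes; assumed 4k first page, per this thread *)

(* Rule A: decide from the size of the containing type (bytes). *)
PROCEDURE CheckByContainerSize (containerSize: CARDINAL): BOOLEAN =
  BEGIN
    RETURN containerSize > FirstReadableAddr;
  END CheckByContainerSize;

(* Rule B: decide from the offset and size of the access itself (bytes). *)
PROCEDURE CheckByAccess (offset, size: CARDINAL): BOOLEAN =
  BEGIN
    RETURN offset + size > FirstReadableAddr;
  END CheckByAccess;

BEGIN
  (* An 8-byte field at offset 0 of a 100000-byte record. *)
  IO.Put("container rule wants a check: "
         & Fmt.Bool(CheckByContainerSize(100000)) & "\n");
  IO.Put("access rule wants a check:    "
         & Fmt.Bool(CheckByAccess(0, 8)) & "\n");
END Main.
```

For that access, rule A asks for an explicit check and rule B does not. Neither
rule covers the VAR parameter case above, where the callee cannot know the
offset of what it was handed.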
 
Another idea, perhaps, is that if there is any dereference of a parameter within a function,
do all the checks up front, and then none in the function. You know -- in case there are loops.
 
But I also don't know the guarantees around "dynamic paths" -- turning conditional dereferences
into unconditional dereferences.
 
I need to study this more. :)
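
The "dynamic paths" concern can be sketched at source level. This is only an
illustration under assumed names (Rec, Lazy, Hoisted and NilArg are all
hypothetical), with an explicit test standing in for a compiler-generated
check:

```modula3
MODULE Main;

IMPORT IO, Fmt;

TYPE
  Rec = RECORD value: INTEGER END;

EXCEPTION
  NilArg; (* stands in for the runtime's NIL-dereference error *)

(* Check left at the dereference: p is only dereferenced when n > 0,
   so Lazy(NIL, 0) returns 0 without touching p. *)
PROCEDURE Lazy (p: REF Rec; n: CARDINAL): INTEGER =
  VAR total := 0;
  BEGIN
    FOR i := 1 TO n DO
      INC(total, p.value);
    END;
    RETURN total;
  END Lazy;

(* Check hoisted to procedure entry: the loop needs no further check,
   but Hoisted(NIL, 0) now fails even though p is never dereferenced. *)
PROCEDURE Hoisted (p: REF Rec; n: CARDINAL): INTEGER RAISES {NilArg} =
  VAR total := 0;
  BEGIN
    IF p = NIL THEN RAISE NilArg END;
    FOR i := 1 TO n DO
      INC(total, p.value);
    END;
    RETURN total;
  END Hoisted;

BEGIN
  IO.Put("Lazy(NIL, 0) = " & Fmt.Int(Lazy(NIL, 0)) & "\n");
  TRY
    IO.Put("Hoisted(NIL, 0) = " & Fmt.Int(Hoisted(NIL, 0)) & "\n");
  EXCEPT
  | NilArg => IO.Put("Hoisted(NIL, 0) raised NilArg\n");
  END;
END Main.
```

The hoisted form saves work inside the loop but turns a conditional
dereference into an unconditional failure, which is the guarantee I am unsure
about.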
 
 - Jay

 
> Date: Thu, 27 Aug 2015 17:20:58 -0500
> From: rodney_bates at lcwb.coop
> To: hosking at purdue.edu
> CC: m3devel at elegosoft.com; jay.krell at cornell.edu
> Subject: Re: [M3devel] Target.First_readable_addr
> 
> In any case, I would shed no tears if First_readable_addr were set target-independently
> to the lowest value on any target, since that would only increase the number of cases
> that give the proper error at the proper place & time.
> 
> On 08/26/2015 10:07 PM, Antony Hosking wrote:
> > The compiler does in fact leave a nil check on every deref (both implicit and explicit) except in the case that the offset falls within the range that can be caught with a trap.  In which case, yes the check is delayed until the actual memory access.  It is perhaps not so bad that the offending deref will be visible up the stack in the case of VAR params. So, pragmatically I think it is nice to have zero-cost null checks, but it does compromise Rodney’s desire for every use of ^ to have the null check at that point.
> >
> >> On Aug 27, 2015, at 12:23 PM, Rodney M. Bates <rodney_bates at lcwb.coop> wrote:
> >>
> >>
> >>
> >> On 08/26/2015 11:10 AM, Jay wrote:
> >>> Passing a ^ to a VAR imho should probably not count as a deref -- I agree with the implementation. Just like in C++ dereferencing a pointer to get a reference instead of a value. One wants a stack trace, and a runtime check may be applied after the fact.
> >>> Ie. once we fault, show all nil parameters on the stack to all functions to help me fish for it. Not easy.
> >>>
> >>
> >> No, I disagree adamantly.  The whole idea of linguistic support of parameters passed by
> >> reference, instead of the programmer manually concocting it by taking addresses and
> >> contents of pointers, is that the language ensures to the called procedure that the
> >> hidden pointer will be neither NIL, uninitialized, nor point to something of the
> >> wrong type.
> >>
> >> In fact, pass by ref is not even defined in terms of the compiler lowering into pointer
> >> twiddling.  It is defined abstractly as the formal is "bound to", i.e. "aliases" the actual
> >> (2.3.2).  This is a highly proper subset of the full semantics of what pointers can do.
> >> NIL or uninitialized have no meaning in this definition, and wrong type is a failure of the
> >> type system that should only happen in UNSAFE code.
> >>
> >> NIL deref, as a runtime error, should in concept happen when and where ^ is applied
> >> to a pointer, including the implied dereference when a subscript or field selection is
> >> applied directly without the ^ operator.  It's purely an implementation question how to
> >> do this.  If a segfault message will point to the line of code where the dereference
> >> happens, and before execution proceeds beyond that code, that would be marginally OK.
> >> (Only marginally, because the error message should be able to say with certainty that
> >> it's a NIL deref.)
> >>
> >> Fortunately, in safe Modula-3 code, there are not a lot of other ways to segfault
> >> (any?).  Unfortunately, there's plenty of unsafe code, and if its programmer fails his
> >> obligation, its effects can escape into safe code.
> >>
> >>>
> >>>   - Jay
> >>>
> >>> On Aug 26, 2015, at 8:37 AM, "Rodney M. Bates" <rodney_bates at lcwb.coop> wrote:
> >>>
> >>>> I have always thought "segfault, could it be a NIL pointer deref?", or whatever
> >>>> words to that effect, was a rather lame excuse of an error message, and not
> >>>> much help.
> >>>>
> >>>> Since this scheme doesn't fault at the point of the real deref, but afterwards, when a
> >>>> memory reference is actually made, possibly computed from that address, could the location
> >>>> given for the fault (whether line no or just code displacement) be wrong?  Very wrong?
> >>>>
> >>>> TYPE A = ARRAY [ 0 .. 121 ] OF INTEGER;
> >>>>
> >>>> PROCEDURE Q ( )
> >>>> = VAR Ptr : REF A := NIL;
> >>>>   BEGIN
> >>>>      P(Ptr^);           <---deref occurs here
> >>>>   END Q;
> >>>>
> >>>> .... lots of other code, different module, etc. ...
> >>>>
> >>>> PROCEDURE P ( VAR x : A )
> >>>> = BEGIN
> >>>>     x[119] := 17       <---memory ref & segfault don't happen until here.
> >>>>   END P;
> >>>>
> >>>> Or is this scheme only used in selective places, such as implicit deref
> >>>> when putting a subscript onto a pointer? e.g.:
> >>>>
> >>>> Ptr[117] := 43;
> >>>>
> >>>> On 08/25/2015 07:28 PM, Jay K wrote:
> >>>>> Of course, my agenda is to remove target-specificity.
> >>>>>
> >>>>>
> >>>>>
> >>>>> Target.First_readable_addr is a target-specific optimization
> >>>>> to omit null checks. Or maybe range checks..
> >>>>>
> >>>>>
> >>>>> The assumption is that dereferencing "actual" null is checked
> >>>>> by the hardware, so code doesn't have to be generated to check for null.
> >>>>>
> >>>>>
> >>>>> I'm not even sure I like this -- as the behavior of dereferencing null is not portable.
> >>>>>
> >>>>>
> >>>>> The assumption is further that accesses to the first page, at least,
> >>>>> are subject to the same hardware checks, and can also be optimized away.
> >>>>>
> >>>>>
> >>>>> At first glance, I thought this was based on offset + size of a static access.
> >>>>>
> >>>>>
> >>>>> For example:
> >>>>>
> >>>>> a: ARRAY [0..large] OF CHAR
> >>>>>
> >>>>>
> >>>>> a[0..4095]   no check
> >>>>> a[4096 and up] would check
> >>>>>
> >>>>>
> >>>>> Target.First_readable_addr is 4k on many targets; 8k on others.
> >>>>> Setting it too low would yield extra checks.
> >>>>> Setting it too high would yield missing checks and a loss of safety.
> >>>>>
> >>>>>
> >>>>> Here is what I actually found though.
> >>>>>
> >>>>>
> >>>>>   - The check is based on the size of the type being accessed.
> >>>>>
> >>>>>   - It is off by a factor of 8 -- there is confusion between m3middle and m3front
> >>>>>    as to whether it is a bit number or a byte number.
> >>>>>
> >>>>>
> >>>>> small: ARRAY[0..100] OF CHAR
> >>>>> large:ARRAY[0..100000] OF CHAR
> >>>>>
> >>>>> no access to small gets checked and every access to large gets checked.
> >>>>>
> >>>>> Should we do anything about this?
> >>>>>
> >>>>> In m3-sys/m3tests/src/p2/p263:
> >>>>> cm3 -realclean
> >>>>> cm3 -keep
> >>>>> grep fault <target>/*ms
> >>>>>
> >>>>>
> >>>>> All the accesses are to offset 0.
> >>>>> So, by some expectation, no null checks are needed.
> >>>>> Null checks are output when the size of the
> >>>>> containing type is, for x86, larger than 4096*8.
> >>>>>
> >>>>>
> >>>>> The checks have been largely but not completely wrong/missing.
> >>>>> Safety behooves us to check though.
> >>>>>
> >>>>>   - fix the factor of 8?
> >>>>>   - make it 0?? too slow?
> >>>>>   - make it 4k on all targets? until such time as a target manifests with a smaller page size?
> >>>>>   - base the checks on access offset + size, not containing size?
> >>>>>     Containing size is conservative. It checks more than I think is meant.
> >>>>>
> >>>>>
> >>>>> I couldn't actually figure out the code here, I added various:
> >>>>>
> >>>>>      IF RTParams.IsPresent("qual") THEN
> >>>>>        RTIO.PutText("NilChkExpr.Compile p.offset:" & Fmt.Int(p.offset) & "\n");
> >>>>>        RTIO.Flush();
> >>>>>      END;
> >>>>>
> >>>>> and such to m3front to figure it out.
> >>>>>
> >>>>>
> >>>>> Thanks,
> >>>>> - Jay
> >>>>>
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> M3devel mailing list
> >>>>> M3devel at elegosoft.com
> >>>>> https://mail.elegosoft.com/cgi-bin/mailman/listinfo/m3devel
> >>>>
> >>>> --
> >>>> Rodney Bates
> >>>> rodney.m.bates at acm.org
> >>>
> >>
> >> --
> >> Rodney Bates
> >> rodney.m.bates at acm.org
> >
> >
> 
> -- 
> Rodney Bates
> rodney.m.bates at acm.org