[M3devel] Target.First_readable_addr

Rodney M. Bates rodney_bates at lcwb.coop
Thu Aug 27 04:23:05 CEST 2015



On 08/26/2015 11:10 AM, Jay wrote:
> Passing a ^ to a VAR imho should probably not count as a deref -- I agree with the implementation. It is just like dereferencing a pointer in C++ to get a reference instead of a value. One wants a stack trace, and a runtime check may be applied after the fact.
> I.e., once we fault, show all nil parameters on the stack to all functions to help me fish for it. Not easy.
>

No, I disagree adamantly.  The whole idea of linguistic support of parameters passed by
reference, instead of the programmer manually concocting it by taking addresses and
contents of pointers, is that the language guarantees to the called procedure that the
hidden pointer will not be NIL, will not be uninitialized, and will not point to
something of the wrong type.

In fact, pass by ref is not even defined in terms of the compiler lowering into pointer
twiddling.  It is defined abstractly: the formal is "bound to", i.e. "aliases", the actual
(2.3.2).  This is a highly proper subset of the full semantics of what pointers can do.
NIL or uninitialized have no meaning in this definition, and wrong type is a failure of the
type system that should only happen in UNSAFE code.

NIL deref, as a runtime error, should in concept happen when and where ^ is applied
to a pointer, including the implied dereference when a subscript or field selection is
applied directly without the ^ operator.  It's purely an implementation question how to
do this.  If a segfault message points to the line of code where the dereference
happens, and is produced before execution proceeds beyond that code, that would be
marginally OK.  (Only marginally, because the error message should be able to say with
certainty that it's a NIL deref.)
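
As a rough illustration of faulting at the point of the dereference, the sketch below
spells out by hand the check that the paragraph above says belongs where ^ is applied.
The module name NilAtCall and the exception NilDeref are made up for this example, and
the explicit IF only stands in for whatever a compiler might emit; it is not a
description of what cm3 actually generates.

```modula3
MODULE NilAtCall EXPORTS Main;

IMPORT IO;

TYPE A = ARRAY [0 .. 121] OF INTEGER;

EXCEPTION NilDeref;  (* hypothetical; stands in for the runtime error *)

PROCEDURE P (VAR x: A) =
  BEGIN
    x[119] := 17;  (* never reached with a NIL actual in this sketch *)
  END P;

PROCEDURE Q () RAISES {NilDeref} =
  VAR Ptr: REF A := NIL;
  BEGIN
    (* Hand-written stand-in for a check at the point where ^ is applied. *)
    IF Ptr = NIL THEN RAISE NilDeref END;
    P(Ptr^);
  END Q;

BEGIN
  TRY
    Q();
  EXCEPT
  | NilDeref => IO.Put("NIL dereference detected in Q, before P was entered\n");
  END;
END NilAtCall.
```

With the check placed in Q, the failure is reported in the procedure that applied ^,
not in P, where the first memory reference would otherwise happen.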

Fortunately, in safe Modula-3 code, there are not a lot of other ways to segfault
(any?).  Unfortunately, there's plenty of unsafe code, and if its programmer fails his
obligation, its effects can escape into safe code.

>
>   - Jay
>
> On Aug 26, 2015, at 8:37 AM, "Rodney M. Bates" <rodney_bates at lcwb.coop> wrote:
>
>> I have always thought "segfault, could it be a NIL pointer deref?", or whatever
>> words to that effect, was a rather lame excuse of an error message, and not
>> much help.
>>
>> Since this scheme doesn't fault at the point of the real deref, but afterwards, when a
>> memory reference is actually made, possibly computed from that address, could the location
>> given for the fault (whether line no or just code displacement) be wrong?  Very wrong?
>>
>> TYPE A = ARRAY [ 0 .. 121 ] OF INTEGER;
>>
>> PROCEDURE Q ( )
>> = VAR Ptr : REF A := NIL;
>>   BEGIN
>>      P(Ptr^);           <---deref occurs here
>>   END Q;
>>
>> .... lots of other code, different module, etc. ...
>>
>> PROCEDURE P ( VAR x : A )
>> = BEGIN
>>     x[119] := 17       <---memory ref & segfault don't happen until here.
>>   END P;
>>
>> Or is this scheme only used in selective places, such as implicit deref
>> when putting a subscript onto a pointer? e.g.:
>>
>> Ptr[117] := 43;
>>
>> On 08/25/2015 07:28 PM, Jay K wrote:
>>> Of course, my agenda is to remove target-specificity.
>>>
>>>
>>>
>>> Target.First_readable_addr is a target-specific optimization
>>> to omit null checks. Or maybe range checks.
>>>
>>>
>>> The assumption is that dereferencing "actual" null is checked
>>> by the hardware, so code doesn't have to be generated to check for null.
>>>
>>>
>>> I'm not even sure I like this -- as the behavior of dereferencing null is not portable.
>>>
>>>
>>> The assumption is further that accesses to the first page, at least,
>>> are subject to the same hardware checks, and can also be optimized away.
>>>
>>>
>>> At first glance, I thought this was based on offset + size of a static access.
>>>
>>>
>>> For example:
>>>
>>> a: ARRAY [0..large] OF CHAR
>>>
>>>
>>> a[0..4095]   no check
>>> a[4096 and up] would check
>>>
>>>
>>> Target.First_readable_addr is 4k on many targets; 8k on others.
>>> Setting it too low would yield extra checks.
>>> Setting it too high would yield missing checks and a loss of safety.
>>>
>>>
>>> Here is what I actually found though.
>>>
>>>
>>>   - The check is based on the size of the type being accessed.
>>>
>>>   - It is off by a factor of 8 -- there is confusion between m3middle and m3front
>>>    as to whether it is a bit number or a byte number.
>>>
>>>
>>> small: ARRAY[0..100] OF CHAR
>>> large: ARRAY[0..100000] OF CHAR
>>>
>>> No access to small gets checked, and every access to large gets checked.
>>>
>>> Should we do anything about this?
>>>
>>> In m3-sys/m3tests/src/p2/p263:
>>> cm3 -realclean
>>> cm3 -keep
>>> grep fault <target>/*ms
>>>
>>>
>>> All the accesses are to offset 0.
>>> So, by some expectation, no null checks are needed.
>>> Null checks are output when the size of the
>>> containing type is, for x86, larger than 4096*8.
>>>
>>>
>>> The checks have been largely but not completely wrong/missing.
>>> Safety behooves us to check though.
>>>
>>>   - fix the factor of 8?
>>>   - make it 0?? too slow?
>>>   - make it 4k on all targets? until such time as a target manifests with a smaller page size?
>>>   - base the checks on access offset + size, not containing size?
>>>     Containing size is conservative. It checks more than I think is meant.
>>>
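
The alternatives in the list above can be compared with a small sketch. Everything in
it is an assumption made for illustration: the names GuardBytes, ByAccess, ByContainer
and Reported are invented, the 4096-byte guard region is the value quoted for many
targets, and Reported models the described behaviour as a threshold of 4096*8 bytes.
It is not the logic in m3front.

```modula3
MODULE NilCheckSketch EXPORTS Main;

IMPORT IO, Fmt;

CONST
  GuardBytes  = 4096;  (* assumed size of the unreadable region at address 0 *)
  BitsPerByte = 8;

(* Rule based on the access itself: a check is needed only if the
   access could extend past the guarded region. *)
PROCEDURE ByAccess (offset, size: CARDINAL): BOOLEAN =
  BEGIN
    RETURN offset + size > GuardBytes;
  END ByAccess;

(* Rule based on the size of the containing type, with consistent units. *)
PROCEDURE ByContainer (typeBytes: CARDINAL): BOOLEAN =
  BEGIN
    RETURN typeBytes > GuardBytes;
  END ByContainer;

(* Assumed model of the reported behaviour: the same rule, but with a
   threshold that is a factor of 8 too large. *)
PROCEDURE Reported (typeBytes: CARDINAL): BOOLEAN =
  BEGIN
    RETURN typeBytes > GuardBytes * BitsPerByte;
  END Reported;

PROCEDURE Show (label: TEXT; b: BOOLEAN) =
  BEGIN
    IO.Put(label & Fmt.Bool(b) & "\n");
  END Show;

BEGIN
  (* A 5000-byte type, accessed at offset 0 with a 1-byte access. *)
  Show("by access:    ", ByAccess(0, 1));
  Show("by container: ", ByContainer(5000));
  Show("as reported:  ", Reported(5000));
END NilCheckSketch.
```

For this 5000-byte type accessed at offset 0, the access-based rule asks for no check,
the container-based rule asks for one, and the modelled reported behaviour omits it.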
>>>
>>> I couldn't actually figure out the code here, so I added various:
>>>
>>>      IF RTParams.IsPresent("qual") THEN
>>>        RTIO.PutText("NilChkExpr.Compile p.offset:" & Fmt.Int(p.offset) & "\n");
>>>        RTIO.Flush();
>>>      END;
>>>
>>> and such to m3front to figure it out.
>>>
>>>
>>> Thanks,
>>> - Jay
>>>
>>>
>>>
>>
>> --
>> Rodney Bates
>> rodney.m.bates at acm.org
>

-- 
Rodney Bates
rodney.m.bates at acm.org


