[M3devel] small objects

Rodney M. Bates rodney.m.bates at cox.net
Mon Apr 6 21:11:29 CEST 2009


At first, I just envisioned implementing a small object type
entirely inside an UNSAFE module, using unsafe operations
to construct/test/separate the values.  I think if the GC were
to just to ignore words in heap objects that the type system
claims are unambiguously pointers but actually have misaligned
values, this could be done.

But it flies in the face of a fundamental principle of Modula-3,
namely that an UNSAFE module, if coded correctly, can prevent
all other code from consequences of the unsafety.  Unfortunately,
this can't be done with the small pointers, because, even if
opaque, they are still reference types, and client code can
dereference them (including the implicit dereference in accessing
a method or field of an object) and do ISTYPE, NARROW, TYPECASE,
and assignments that do an implicit NARROW.  All these operations
could be done outside the implementing module and all could
suffer from misaligned values.  

In fact,  misaligned integer "objects" violate another principle that
the bit pattern must always be a valid representation of some value
of the type.  

However, this does suggest a different and very simple linguistic
solution:  Just have a new builtin type, say VERYOPAQUE, that
supports no operations other than moving it around, i.e.,
assignment, bindings,  passing as a parameter, etc.   Only unsafe
code could do anyhing with it, by using LOOPHOLE.   The only
problem is, would somebody then want one of these in a size other
than one word?

Mika Nystrom wrote:
> Ah I should read my whole inbox before replying.
>
> I take it you would worry that in my proposal (to do this all in a
> library) the use of REFANY to store an integer could be abused
> somehow.  That the Modula-3 type system (as it exists now) explicitly
> rules out doing such things, and therefore, a language change is
> necessary....
>
>     Mika
>
> Tony Hosking writes:
>   
>> I like the notion of having a TAGGED INTEGER type that is a hybrid  
>> ordinal/reference.
>>
>> Subtyping rules for references would now be as follows:
>>
>> NULL <: REF T <: REFANY
>> TAGGED INTEGER <: REF T <: REFANY
>>
>> ROOT <: REFANY
>> NULL <: T OBJECT ... END <: T
>> TAGGED INTEGER <: T OBJECT ... END <: T (but only for T <: REFANY, not  
>> T <: ADDRESS)
>>
>> Thus, TAGGED INTEGER is a funny sort of hybrid ordinal/reference  
>> type.  Values of TAGGED INTEGER are non-pointer values similar to the  
>> value NIL.  We can then do ISTYPE(x, TAGGED INTEGER), and NARROW(x,  
>> TAGGED INTEGER), and TYPECODE(TAGGED INTEGER), similarly to ISTYPE(x,  
>> NULL), NARROW(x, NULL), and TYPECODE(NULL).
>>
>> Because TAGGED INTEGER is an ordinal we can extract the integer value  
>> it holds using ORD(x).
>> Similarly, we can construct a TAGGED INTEGER using VAL(x, TAGGED  
>> INTEGER).
>>
>> ***The only problem with this scheme is how to efficiently perform run- 
>> time tests for dereferencing NULL, and TAGGED INTEGER?***
>>
>> So, here is a slightly less elegant alternative that is probably  
>> straightforward to implement.  Introduce a separate TAGGED hierarchy.
>>
>> NULL <: REF T <: REFANY
>> NULL <: UNTRACED REF T <: ADDRESS
>> TAGGED INTEGER <: TAGGED REF T <: TAGGED REFANY
>>
>> Note that NULL is not a subtype of TAGGED REF T or TAGGED REFANY.
>>
>> ROOT <: REFANY
>> TAGGED ROOT <: TAGGED REFANY
>> NULL <: T OBJECT END <: T where T <: REFANY or T <: ADDRESS
>> TAGGED INTEGER <: T OBJECT ... END <: T where T <: TAGGED REFANY
>>
>> Note that NULL is not a subtype of T OBJECT ... END where T <: TAGGED  
>> REFANY.
>>
>> This way, tagged references only need a run-time test for the tag on  
>> dereference (we can throw an "attempt to dereference TAGGED INTEGER"  
>> at run time, just like we throw "attempt to dereference NIL" for  
>> untagged references).  This check can be a straightforward test of the  
>> low bit tag.  Of course, the weird thing here is now that we don't  
>> have a single NULL value for TAGGED references --- we have multiple  
>> null values VAL(x, TAGGED INTEGER).  Instead of asking "x == NIL" we  
>> must ask "ISTYPE(x, TAGGED INTEGER)".
>>
>> Of course, because TAGGED INTEGER is an ordinal, we can also perform  
>> the usual ordinal operations like INC(x), DEC(x), wherever x is typed  
>> TAGGED INTEGER.
>>
>>
>> On 6 Apr 2009, at 11:44, Rodney M. Bates wrote:
>>
>>     
>>> I spent quite a bit of time playing with a more general system where
>>> there is a set of "tagged" types, with (an implementation-defined
>>> subrange of) INTEGER and a reference type both being a subtype of a
>>> tagged type.  It might have been tolerable, if it weren't for the
>>> interaction with object subtypes and opaque types, which quickly
>>> gets deep into a deep linguistic tar pit.
>>>
>>> Tony's approach is much simpler and would serve what I see as the
>>> need.  It does sacrifice any static checking of what reference type
>>> is being tagged, and will also require an extra runtime ISTYPE/ 
>>> TYPECASE.
>>>
>>> Would INTEGER and REFANY be assignable to TAGGED, or would there
>>> also need to be an explicit conversion in that direction too, say
>>> VAL(x, TAGGED)?  I think I favor the explicit conversion here.  It
>>> is a bit inconsistent with the usual Modula-3 assignability  
>>> philosophy,
>>> but not requiring the explicit conversion already gets your toes  
>>> into tar.
>>>
>>> We would have to have something more like ISTYPE, though, which will
>>> tell which type the value belongs to without risking a runtime error,
>>> which VAL would do.
>>>
>>> An intermediate approach might be to have a set of tagged types
>>> constructed by, say, TAGGED T, where I is a reference type, but
>>> with no subtype relations on them at all, thus still requiring
>>> the explicit conversions and checks.
>>>
>>> I do want to say, I _really_ want this capability somehow.  I have
>>> an ADT module whose internal data structure, for full generality,
>>> needs to have two heap objects (the second because it has nonstatic
>>> size) and 11 total words counting the orginal pointer, in the case of
>>> values that would fit directly  into the "pointer" word.  Moreover,
>>> these small cases are in the great majority.
>>>
>>> Besides the 11-to-one increase in actual occupied space, there
>>> is extra time for allocation, periodic tracing, and collection, space
>>> loss due to heap fragmentation, and time/space tradeoff loss due to
>>> reduced locality of reference.  So sometimes, it would be a big
>>> performance gain if we could do it.
>>>
>>>
>>> Tony Hosking wrote:
>>>       
>>>> It is a much more pervasive hack than I would be comfortable with
>>>> since it touches on the compiler (for all the type operations) as  
>>>> well
>>>> as the run-time system in ways that are pretty gross.  I would much
>>>> rather have a new TAGGED builtin.  ISTYPE would not work on it since
>>>> that only works on references.  The only thing you need is a way to
>>>> extract the value -- we could overload VAL(x, T) where x can be a
>>>> TAGGED and T can be REFANY or INTEGER.
>>>>
>>>>         
>
>   




More information about the M3devel mailing list