[M3devel] small objects

Tony Hosking hosking at cs.purdue.edu
Tue Apr 7 04:02:10 CEST 2009


On 7 Apr 2009, at 01:29, Mika Nystrom wrote:

> Ah I should read my whole inbox before replying.

As should I... ;-)

> I take it you would worry that in my proposal (to do this all in a
> library) the use of REFANY to store an integer could be abused
> somehow.  That the Modula-3 type system (as it exists now) explicitly
> rules out doing such things, and therefore, a language change is
> necessary....

My concern is the mismatch between NIL checking and tag-checking.  I'd  
like to partition tagged references from regular references to keep  
things clean.  I never liked overloading, especially for a systems  
language like Modula-3 where operations and typing should have a  
reasonable (and efficient) operational semantics.

>
>
>    Mika
>
> Tony Hosking writes:
>> I like the notion of having a TAGGED INTEGER type that is a hybrid
>> ordinal/reference.
>>
>> Subtyping rules for references would now be as follows:
>>
>> NULL <: REF T <: REFANY
>> TAGGED INTEGER <: REF T <: REFANY
>>
>> ROOT <: REFANY
>> NULL <: T OBJECT ... END <: T
>> TAGGED INTEGER <: T OBJECT ... END <: T (but only for T <: REFANY,  
>> not
>> T <: ADDRESS)
>>
>> Thus, TAGGED INTEGER is a funny sort of hybrid ordinal/reference
>> type.  Values of TAGGED INTEGER are non-pointer values similar to the
>> value NIL.  We can then do ISTYPE(x, TAGGED INTEGER), and NARROW(x,
>> TAGGED INTEGER), and TYPECODE(TAGGED INTEGER), similarly to ISTYPE(x,
>> NULL), NARROW(x, NULL), and TYPECODE(NULL).
>>
>> Because TAGGED INTEGER is an ordinal we can extract the integer value
>> it holds using ORD(x).
>> Similarly, we can construct a TAGGED INTEGER using VAL(x, TAGGED
>> INTEGER).
>>
>> ***The only problem with this scheme is how to efficiently perform  
>> run-
>> time tests for dereferencing NULL, and TAGGED INTEGER?***
>>
>> So, here is a slightly less elegant alternative that is probably
>> straightforward to implement.  Introduce a separate TAGGED hierarchy.
>>
>> NULL <: REF T <: REFANY
>> NULL <: UNTRACED REF T <: ADDRESS
>> TAGGED INTEGER <: TAGGED REF T <: TAGGED REFANY
>>
>> Note that NULL is not a subtype of TAGGED REF T or TAGGED REFANY.
>>
>> ROOT <: REFANY
>> TAGGED ROOT <: TAGGED REFANY
>> NULL <: T OBJECT END <: T where T <: REFANY or T <: ADDRESS
>> TAGGED INTEGER <: T OBJECT ... END <: T where T <: TAGGED REFANY
>>
>> Note that NULL is not a subtype of T OBJECT ... END where T <: TAGGED
>> REFANY.
>>
>> This way, tagged references only need a run-time test for the tag on
>> dereference (we can throw an "attempt to dereference TAGGED INTEGER"
>> at run time, just like we throw "attempt to dereference NIL" for
>> untagged references).  This check can be a straightforward test of  
>> the
>> low bit tag.  Of course, the weird thing here is now that we don't
>> have a single NULL value for TAGGED references --- we have multiple
>> null values VAL(x, TAGGED INTEGER).  Instead of asking "x == NIL" we
>> must ask "ISTYPE(x, TAGGED INTEGER)".
>>
>> Of course, because TAGGED INTEGER is an ordinal, we can also perform
>> the usual ordinal operations like INC(x), DEC(x), wherever x is typed
>> TAGGED INTEGER.
>>
>>
>> On 6 Apr 2009, at 11:44, Rodney M. Bates wrote:
>>
>>> I spent quite a bit of time playing with a more general system where
>>> there is a set of "tagged" types, with (an implementation-defined
>>> subrange of) INTEGER and a reference type both being a subtype of a
>>> tagged type.  It might have been tolerable, if it weren't for the
>>> interaction with object subtypes and opaque types, which quickly
>>> gets deep into a deep linguistic tar pit.
>>>
>>> Tony's approach is much simpler and would serve what I see as the
>>> need.  It does sacrifice any static checking of what reference type
>>> is being tagged, and will also require an extra runtime ISTYPE/
>>> TYPECASE.
>>>
>>> Would INTEGER and REFANY be assignable to TAGGED, or would there
>>> also need to be an explicit conversion in that direction too, say
>>> VAL(x, TAGGED)?  I think I favor the explicit conversion here.  It
>>> is a bit inconsistent with the usual Modula-3 assignability
>>> philosophy,
>>> but not requiring the explicit conversion already gets your toes
>>> into tar.
>>>
>>> We would have to have something more like ISTYPE, though, which will
>>> tell which type the value belongs to without risking a runtime  
>>> error,
>>> which VAL would do.
>>>
>>> An intermediate approach might be to have a set of tagged types
>>> constructed by, say, TAGGED T, where I is a reference type, but
>>> with no subtype relations on them at all, thus still requiring
>>> the explicit conversions and checks.
>>>
>>> I do want to say, I _really_ want this capability somehow.  I have
>>> an ADT module whose internal data structure, for full generality,
>>> needs to have two heap objects (the second because it has nonstatic
>>> size) and 11 total words counting the orginal pointer, in the case  
>>> of
>>> values that would fit directly  into the "pointer" word.  Moreover,
>>> these small cases are in the great majority.
>>>
>>> Besides the 11-to-one increase in actual occupied space, there
>>> is extra time for allocation, periodic tracing, and collection,  
>>> space
>>> loss due to heap fragmentation, and time/space tradeoff loss due to
>>> reduced locality of reference.  So sometimes, it would be a big
>>> performance gain if we could do it.
>>>
>>>
>>> Tony Hosking wrote:
>>>> It is a much more pervasive hack than I would be comfortable with
>>>> since it touches on the compiler (for all the type operations) as
>>>> well
>>>> as the run-time system in ways that are pretty gross.  I would much
>>>> rather have a new TAGGED builtin.  ISTYPE would not work on it  
>>>> since
>>>> that only works on references.  The only thing you need is a way to
>>>> extract the value -- we could overload VAL(x, T) where x can be a
>>>> TAGGED and T can be REFANY or INTEGER.
>>>>




More information about the M3devel mailing list