[M3devel] small objects

Thu Apr 9 03:57:35 CEST 2009

Thanks Mika, nicely summarised.  I'll just add that from the language  
design perspective we have made no change.  The only thing we are  
doing is introducing a run-time *optimisation* that allows things that  
look like:

T = BRANDED "SmallInt.T" REF [FIRST(INTEGER) DIV 2.. LAST(INTEGER) DIV  
2]

to be *represented* by the run-time system as a tagged reference by  
packing the value into the non-tag bits of the reference.

The main thing to protect against is that someone bypasses the opaque  
definition of this type as <: REFANY and actually allocates a heap  
value of the above type.  The only way to do that is to use  
RTAllocator.NewTraced(tc) with TYPECODE(T).  The run-time system can  
explicitly check for that in RTAllocator and return a properly tagged  
reference.

On 9 Apr 2009, at 10:23, Mika Nystrom wrote:

> Hmm, if you're going to do something along these lines, isn't the  
> logical
> thing to do to introduce a new root of the (traced) type system, that
> is
>
>    REFANY <: REFORINT
>    SMALLINT = [MinSmallInteger .. MaxSmallInteger]
>    SMALLINT <: REFORINT
>
> a REFORINT can be a REFANY or a small integer.  It stands to reason  
> that
> it is a supertype of REFANY as well as of SMALLINT, because isn't
> that exactly what we're looking for?
>
> The problem with this proposal (as well as both of yours) is that
> you get to go through and inspect all existing code that takes
> REFANY and wonder whether maybe it should take REFORINT instead....
>
> The proposal that Tony and I seem to have agreed on, on the other
> hand, allows the implementation to hide small integers completely
> transparently in the already existing REFANY, at the cost of an LSB
> check on a number of operations, such as ISTYPE, TYPECASE, etc.,
> to any REFANY.
>
> I maintain the following:
>
> I. Since hiding integers in REFANY is intended to be entirely
>   transparent, you could have it as a compile or even run-time
>   option.  Yes, ok this is ugly.
>
> II.Assume we go with REFORINT, or OPAQUE, or TAGGED.  What existing
>   uses of REFANY, in any code of importance, is going to stay
>   REFANY, and what uses will we want to convert to the new types?
>
>   I would suggest considering the following argument:
>
>   A. The new types are strictly more useful than REFANY; they can
>      represent more values because they can represent all of REFANY
>      plus other values (after all that was the point all along).
>
>   B. Existing uses of REFANY are either
>     1. uses of REFANY because the code is intended to be very generic.
>        Programmers will want to migrate such code to the new type,
>        so that they can process the new type as well (since the code  
> is
>        generic it should work for either).
>     2. uses of REFANY because the programmer was too lazy to  
> instantiate
>        a generic for his specific type and decided to cast the values
>        to REFANY (e.g., to insert in a RefList) and back.  Such  
> programmers
>        deserve what they get.
>
>   I conclude that code that continues to want to use REFANY rather
>   than the new type is code that is not performance sensitive.  Hence,
>   I conclude that using REFANY directly to represent small integers
>   is as reasonable as introducing a new type, and does it with less  
> work,
>   both in updating the language definition and in updating all the
>   libraries in libm3 and elsewhere.
>
> What current uses of REFANY do not fit the cases in II.B.?  Some
> trickery to simulate multiple inheritance?  Even in those you should
> have created a common supertype, I think.
>
>    Mika
>
> "Rodney M. Bates" writes:
>> Here are two integrated proposals for language changes to allow small
>> objects, while preserving the principles of the language.  Neither of
>> them changes the semantics of any existing type or burdens any
>> existing code with extra runtime computation.
>>
>> The unsafe proposal:
>>
>> This is a much simpler proposal.  It could also be used in more
>> general ways than just using a tag bit to distinguish a true pointer
>> from an in-word value.  This is at the cost of requiring unsafe  
>> coding
>> techniques in a module implementing the abstraction, which also
>> inevitably means implementation-dependent too.  Using it for small
>> objects as discussed still requires the garbage collector not to be
>> confused by trying to trace a reference with a misaligned value.
>>
>> There is a new builtin type OPAQUE.  The only things you can do with
>> it in safe code are copy values around (assigment, pass by VALUE,
>> RETURN, etc.) and make reference bindings to it (pass by reference,
>> WITH bindings).  Not that this ia a _lot_ more opaque than opaque
>> types.
>>
>> OPAQUE is known to be the same size as Word.T.  In unsafe code, you
>> can use this fact to LOOPHOLE it to anything you want and bit-twiddle
>> to your heart's content from there.  So an unsafe module could
>> implement procedures that take OPAQUE parameters.
>>
>> Variation:
>>
>> Allow an opaque subtype of OPAQUE, which can be revealed to be any
>> type with statically the same size.  This would make it possible to
>> prevent client code from mixing up two different unrelated
>> abstractions that both happened to use OPAQUE.  In fact, without  
>> this,
>> the proposal really fails to preserve the existing principles of
>> safe/unsafe code.  This would require that the revelation be BRANDED,
>> which would either mean restricting the revelation to a reference  
>> type
>> (which could still be LOOPHOLEd to anything else) or allowing other
>> types to be BRANDED.
>>
>> Example:
>>
>> INTERFACE ADT;
>> TYPE T <: OPAQUE;
>> PROCEDURE P (x:T);
>> ...
>> END ADT
>>
>> UNSAFE MODULE ADT;
>> TYPE R = RECORD ... END;
>> REVEAL T = BRANDED REF R;
>>
>> PROCEDURE P (x:T)
>>   VAR I: INTEGER;
>>   BEGIN
>>     IF LOOPHOLE(x,INTEGER) MOD 2 # 0
>>     THEN
>>       I := Word.Shift(LOOPHOLE(x,INTEGER),-1);
>>       (* Use integer value I ... *)
>>     ELSE
>>       (* Use reference value x, according to its declared type.
>>          Prior to the check above, we couldn't afford to risk
>>          doing anything with it. ... *)
>>     END (* IF *);
>>   END P;
>> ...
>> END ADT;
>>
>> The safe proposal:
>>
>> This is more complicated and less general, but allows small object
>> abstractions to be done in an entirely safe and implementation-
>> independent way.  It abstracts away all the bit-level representation
>> questions.
>>
>> There is a new type TAG, which is a subrange of INTEGER, with
>> implementation-defined bounds (which would then be accessible by
>> FIRST/LAST).
>>
>> There is a new type constructor TAGGED.  If T is any traced reference
>> type or any untraced object type, TAGGED T defines a type, called a
>> _tagged_ type, whose value set is the union of the value sets of TAG
>> and T.
>>
>> TAG <: TAGGED T and T <: TAGGED T.
>>
>> There are no other subtype relations involving TAGGED T.
>>
>> Values of a tagged type can be copied and bound-to.
>>
>> The static rules for ISTYPE, NARROW, TYPECASE, and assignability of a
>> type to a type are liberalized to allow their application to a tagged
>> type, and to allow the type to be checked/converted-to to be any
>> subtype of the tagged type.  Already existing rules would require a
>> runtime check that the value is a member of the stated type.
>>
>> No other operations are possible on a tagged type.  A tagged type is
>> not a reference type and does not have a typecode.
>>
>> Example:
>>
>> INTERFACE ADT;
>> TYPE Op <: REFANY;
>> TYPE T = TAGGED Op;
>> PROCEDURE P (x:T);
>> ...
>> END ADT
>>
>> MODULE ADT;
>> TYPE R = RECORD ... END;
>> REVEAL Op = BRANDED REF R;
>>
>> PROCEDURE P (x:T)
>>   BEGIN
>>     TYPECASE x
>>     OF TAG (I)
>>     => (* Use integer value I ... *)
>>     | Op (RR)
>>     => (* Use reference value RR ... *)
>>     END (* IF *);
>>   END P;
>> ...
>> END ADT;
>>
>> Note: TAG is like CARDINAL in that it is "just like" a subrange of
>> INTEGER, but not equal to that subrange.  This enabled pickles to
>> change the size of values of type TAG and of tagged types, as it now
>> does with CARDINAL.
>>
>> Variation:
>>
>> The 4 runtime-checking operations can also be applied to any ordinal
>> type and can check that an ordinal or tagged type has a value
>> belonging to any subrange of the base type (or maybe just of the type
>> being narrowed).  This is pure fluff (well, maybe it is syntactic
>> sugar), but adds some consistency and has a real use:
>>
>> How many times have I coded the following?
>>
>> IF FIRST(SomeOrdinalType <= x AND x <= LAST(SomeOrdinalType) THEN ...
>>
>> or worse,
>>
>> IF FIRST(SomeOrdinalType <= F(x) AND F(x) <= LAST(SomeOrdinalType)  
>> THEN ...
>>
>> which I then feel morally obligated to recode using a local  
>> variable or
>> WITH-temp to eliminate the duplicated calls F(x).
>>
>> It could then become:
>>
>> IF ISTYPE (F(x), SomeOrdinalType) THEN ...
>>
>>
>>
>>
>>
>>