[M3devel] small objects
Tony Hosking
hosking at cs.purdue.edu
Thu Apr 9 03:57:35 CEST 2009
Thanks Mika, nicely summarised. I'll just add that from the language
design perspective we have made no change. The only thing we are
doing is introducing a run-time *optimisation* that allows things that
look like:
T = BRANDED "SmallInt.T" REF [FIRST(INTEGER) DIV 2.. LAST(INTEGER) DIV
2]
to be *represented* by the run-time system as a tagged reference by
packing the value into the non-tag bits of the reference.
The main thing to protect against is that someone bypasses the opaque
definition of this type as <: REFANY and actually allocates a heap
value of the above type. The only way to do that is to use
RTAllocator.NewTraced(tc) with TYPECODE(T). The run-time system can
explicitly check for that in RTAllocator and return a properly tagged
reference.
On 9 Apr 2009, at 10:23, Mika Nystrom wrote:
> Hmm, if you're going to do something along these lines, isn't the
> logical
> thing to do to introduce a new root of the (traced) type system, that
> is
>
> REFANY <: REFORINT
> SMALLINT = [MinSmallInteger .. MaxSmallInteger]
> SMALLINT <: REFORINT
>
> a REFORINT can be a REFANY or a small integer. It stands to reason
> that
> it is a supertype of REFANY as well as of SMALLINT, because isn't
> that exactly what we're looking for?
>
> The problem with this proposal (as well as both of yours) is that
> you get to go through and inspect all existing code that takes
> REFANY and wonder whether maybe it should take REFORINT instead....
>
> The proposal that Tony and I seem to have agreed on, on the other
> hand, allows the implementation to hide small integers completely
> transparently in the already existing REFANY, at the cost of an LSB
> check on a number of operations, such as ISTYPE, TYPECASE, etc.,
> to any REFANY.
>
> I maintain the following:
>
> I. Since hiding integers in REFANY is intended to be entirely
> transparent, you could have it as a compile or even run-time
> option. Yes, ok this is ugly.
>
> II.Assume we go with REFORINT, or OPAQUE, or TAGGED. What existing
> uses of REFANY, in any code of importance, is going to stay
> REFANY, and what uses will we want to convert to the new types?
>
> I would suggest considering the following argument:
>
> A. The new types are strictly more useful than REFANY; they can
> represent more values because they can represent all of REFANY
> plus other values (after all that was the point all along).
>
> B. Existing uses of REFANY are either
> 1. uses of REFANY because the code is intended to be very generic.
> Programmers will want to migrate such code to the new type,
> so that they can process the new type as well (since the code
> is
> generic it should work for either).
> 2. uses of REFANY because the programmer was too lazy to
> instantiate
> a generic for his specific type and decided to cast the values
> to REFANY (e.g., to insert in a RefList) and back. Such
> programmers
> deserve what they get.
>
> I conclude that code that continues to want to use REFANY rather
> than the new type is code that is not performance sensitive. Hence,
> I conclude that using REFANY directly to represent small integers
> is as reasonable as introducing a new type, and does it with less
> work,
> both in updating the language definition and in updating all the
> libraries in libm3 and elsewhere.
>
> What current uses of REFANY do not fit the cases in II.B.? Some
> trickery to simulate multiple inheritance? Even in those you should
> have created a common supertype, I think.
>
> Mika
>
> "Rodney M. Bates" writes:
>> Here are two integrated proposals for language changes to allow small
>> objects, while preserving the principles of the language. Neither of
>> them changes the semantics of any existing type or burdens any
>> existing code with extra runtime computation.
>>
>> The unsafe proposal:
>>
>> This is a much simpler proposal. It could also be used in more
>> general ways than just using a tag bit to distinguish a true pointer
>> from an in-word value. This is at the cost of requiring unsafe
>> coding
>> techniques in a module implementing the abstraction, which also
>> inevitably means implementation-dependent too. Using it for small
>> objects as discussed still requires the garbage collector not to be
>> confused by trying to trace a reference with a misaligned value.
>>
>> There is a new builtin type OPAQUE. The only things you can do with
>> it in safe code are copy values around (assigment, pass by VALUE,
>> RETURN, etc.) and make reference bindings to it (pass by reference,
>> WITH bindings). Not that this ia a _lot_ more opaque than opaque
>> types.
>>
>> OPAQUE is known to be the same size as Word.T. In unsafe code, you
>> can use this fact to LOOPHOLE it to anything you want and bit-twiddle
>> to your heart's content from there. So an unsafe module could
>> implement procedures that take OPAQUE parameters.
>>
>> Variation:
>>
>> Allow an opaque subtype of OPAQUE, which can be revealed to be any
>> type with statically the same size. This would make it possible to
>> prevent client code from mixing up two different unrelated
>> abstractions that both happened to use OPAQUE. In fact, without
>> this,
>> the proposal really fails to preserve the existing principles of
>> safe/unsafe code. This would require that the revelation be BRANDED,
>> which would either mean restricting the revelation to a reference
>> type
>> (which could still be LOOPHOLEd to anything else) or allowing other
>> types to be BRANDED.
>>
>> Example:
>>
>> INTERFACE ADT;
>> TYPE T <: OPAQUE;
>> PROCEDURE P (x:T);
>> ...
>> END ADT
>>
>> UNSAFE MODULE ADT;
>> TYPE R = RECORD ... END;
>> REVEAL T = BRANDED REF R;
>>
>> PROCEDURE P (x:T)
>> VAR I: INTEGER;
>> BEGIN
>> IF LOOPHOLE(x,INTEGER) MOD 2 # 0
>> THEN
>> I := Word.Shift(LOOPHOLE(x,INTEGER),-1);
>> (* Use integer value I ... *)
>> ELSE
>> (* Use reference value x, according to its declared type.
>> Prior to the check above, we couldn't afford to risk
>> doing anything with it. ... *)
>> END (* IF *);
>> END P;
>> ...
>> END ADT;
>>
>> The safe proposal:
>>
>> This is more complicated and less general, but allows small object
>> abstractions to be done in an entirely safe and implementation-
>> independent way. It abstracts away all the bit-level representation
>> questions.
>>
>> There is a new type TAG, which is a subrange of INTEGER, with
>> implementation-defined bounds (which would then be accessible by
>> FIRST/LAST).
>>
>> There is a new type constructor TAGGED. If T is any traced reference
>> type or any untraced object type, TAGGED T defines a type, called a
>> _tagged_ type, whose value set is the union of the value sets of TAG
>> and T.
>>
>> TAG <: TAGGED T and T <: TAGGED T.
>>
>> There are no other subtype relations involving TAGGED T.
>>
>> Values of a tagged type can be copied and bound-to.
>>
>> The static rules for ISTYPE, NARROW, TYPECASE, and assignability of a
>> type to a type are liberalized to allow their application to a tagged
>> type, and to allow the type to be checked/converted-to to be any
>> subtype of the tagged type. Already existing rules would require a
>> runtime check that the value is a member of the stated type.
>>
>> No other operations are possible on a tagged type. A tagged type is
>> not a reference type and does not have a typecode.
>>
>> Example:
>>
>> INTERFACE ADT;
>> TYPE Op <: REFANY;
>> TYPE T = TAGGED Op;
>> PROCEDURE P (x:T);
>> ...
>> END ADT
>>
>> MODULE ADT;
>> TYPE R = RECORD ... END;
>> REVEAL Op = BRANDED REF R;
>>
>> PROCEDURE P (x:T)
>> BEGIN
>> TYPECASE x
>> OF TAG (I)
>> => (* Use integer value I ... *)
>> | Op (RR)
>> => (* Use reference value RR ... *)
>> END (* IF *);
>> END P;
>> ...
>> END ADT;
>>
>> Note: TAG is like CARDINAL in that it is "just like" a subrange of
>> INTEGER, but not equal to that subrange. This enabled pickles to
>> change the size of values of type TAG and of tagged types, as it now
>> does with CARDINAL.
>>
>> Variation:
>>
>> The 4 runtime-checking operations can also be applied to any ordinal
>> type and can check that an ordinal or tagged type has a value
>> belonging to any subrange of the base type (or maybe just of the type
>> being narrowed). This is pure fluff (well, maybe it is syntactic
>> sugar), but adds some consistency and has a real use:
>>
>> How many times have I coded the following?
>>
>> IF FIRST(SomeOrdinalType <= x AND x <= LAST(SomeOrdinalType) THEN ...
>>
>> or worse,
>>
>> IF FIRST(SomeOrdinalType <= F(x) AND F(x) <= LAST(SomeOrdinalType)
>> THEN ...
>>
>> which I then feel morally obligated to recode using a local
>> variable or
>> WITH-temp to eliminate the duplicated calls F(x).
>>
>> It could then become:
>>
>> IF ISTYPE (F(x), SomeOrdinalType) THEN ...
>>
>>
>>
>>
>>
>>
More information about the M3devel
mailing list