[M3devel] small objects

Wed Apr 8 22:08:09 CEST 2009

Here are two integrated proposals for language changes to allow small
objects, while preserving the principles of the language.  Neither of
them changes the semantics of any existing type or burdens any
existing code with extra runtime computation.

The unsafe proposal:

This is a much simpler proposal.  It could also be used in more
general ways than just using a tag bit to distinguish a true pointer
from an in-word value.  This is at the cost of requiring unsafe coding
techniques in a module implementing the abstraction, which also
inevitably means implementation-dependent too.  Using it for small
objects as discussed still requires the garbage collector not to be
confused by trying to trace a reference with a misaligned value.

There is a new builtin type OPAQUE.  The only things you can do with
it in safe code are copy values around (assigment, pass by VALUE,
RETURN, etc.) and make reference bindings to it (pass by reference,
WITH bindings).  Not that this ia a _lot_ more opaque than opaque
types.

OPAQUE is known to be the same size as Word.T.  In unsafe code, you
can use this fact to LOOPHOLE it to anything you want and bit-twiddle
to your heart's content from there.  So an unsafe module could
implement procedures that take OPAQUE parameters.

Variation:  

Allow an opaque subtype of OPAQUE, which can be revealed to be any
type with statically the same size.  This would make it possible to
prevent client code from mixing up two different unrelated
abstractions that both happened to use OPAQUE.  In fact, without this,
the proposal really fails to preserve the existing principles of
safe/unsafe code.  This would require that the revelation be BRANDED,
which would either mean restricting the revelation to a reference type
(which could still be LOOPHOLEd to anything else) or allowing other
types to be BRANDED.

Example:

INTERFACE ADT;
  TYPE T <: OPAQUE;
  PROCEDURE P (x:T);
  ...
END ADT

UNSAFE MODULE ADT;
  TYPE R = RECORD ... END;
  REVEAL T = BRANDED REF R;

  PROCEDURE P (x:T)
    VAR I: INTEGER;
    BEGIN
      IF LOOPHOLE(x,INTEGER) MOD 2 # 0
      THEN
        I := Word.Shift(LOOPHOLE(x,INTEGER),-1);
        (* Use integer value I ... *)
      ELSE
        (* Use reference value x, according to its declared type.
           Prior to the check above, we couldn't afford to risk
           doing anything with it. ... *)
      END (* IF *);
    END P;
  ...
END ADT;  

The safe proposal:

This is more complicated and less general, but allows small object
abstractions to be done in an entirely safe and implementation-
independent way.  It abstracts away all the bit-level representation
questions.

There is a new type TAG, which is a subrange of INTEGER, with
implementation-defined bounds (which would then be accessible by
FIRST/LAST).

There is a new type constructor TAGGED.  If T is any traced reference
type or any untraced object type, TAGGED T defines a type, called a
_tagged_ type, whose value set is the union of the value sets of TAG
and T.

TAG <: TAGGED T and T <: TAGGED T.

There are no other subtype relations involving TAGGED T.

Values of a tagged type can be copied and bound-to.

The static rules for ISTYPE, NARROW, TYPECASE, and assignability of a
type to a type are liberalized to allow their application to a tagged
type, and to allow the type to be checked/converted-to to be any
subtype of the tagged type.  Already existing rules would require a
runtime check that the value is a member of the stated type.

No other operations are possible on a tagged type.  A tagged type is
not a reference type and does not have a typecode.

Example:

INTERFACE ADT;
  TYPE Op <: REFANY;
  TYPE T = TAGGED Op;
  PROCEDURE P (x:T);
  ...
END ADT

MODULE ADT;
  TYPE R = RECORD ... END;
  REVEAL Op = BRANDED REF R;

  PROCEDURE P (x:T)
    BEGIN
      TYPECASE x
      OF TAG (I)
      => (* Use integer value I ... *)
      | Op (RR)
      => (* Use reference value RR ... *)
      END (* IF *);
    END P;
  ...
END ADT;

Note: TAG is like CARDINAL in that it is "just like" a subrange of
INTEGER, but not equal to that subrange.  This enabled pickles to
change the size of values of type TAG and of tagged types, as it now
does with CARDINAL.

Variation:

The 4 runtime-checking operations can also be applied to any ordinal
type and can check that an ordinal or tagged type has a value
belonging to any subrange of the base type (or maybe just of the type
being narrowed).  This is pure fluff (well, maybe it is syntactic
sugar), but adds some consistency and has a real use:

How many times have I coded the following?

IF FIRST(SomeOrdinalType <= x AND x <= LAST(SomeOrdinalType) THEN ...

or worse,

IF FIRST(SomeOrdinalType <= F(x) AND F(x) <= LAST(SomeOrdinalType) THEN ...

which I then feel morally obligated to recode using a local variable or
WITH-temp to eliminate the duplicated calls F(x).

It could then become:

IF ISTYPE (F(x), SomeOrdinalType) THEN ...