[M3devel] small objects

Tony Hosking hosking at cs.purdue.edu
Thu Apr 9 05:01:13 CEST 2009


Sounds like you want something like:

TAGGED REFANY FOR T

where T must be a type that fits into BITS(REFANY)-1?

Branding could be used to prevent mixing otherwise structurally  
equivalent TAGGED REFANY.

TAGGED BRANDED REFANY FOR T

Where this breaks down is what are the subtyping rules, since I assume  
you'd like to store these in a REFANY and dynamically test for the  
appropriate tagged type:

TAGGED REFANY FOR T <: REFANY

But then how do we distinguish all the different TAGGED REFANY from  
each other at run-time?

On 9 Apr 2009, at 12:13, Rodney M. Bates wrote:

> Mika Nystrom wrote:
>> Hmm, ok, there's one big difference between what you're saying and
>> what Tony and I have been talking about.  (I think your understanding
>> sounds pretty correct.)
>>
>> You want to do objects other than small integers.  Like what?  And  
>> why?
>> I was thinking the trick would apply only to one specific type of  
>> integer.
>>
>
> Yes, I want a language mechanism that can be used by various
> modules to implement various abstract data types, each of which
> can perform the sometimes dramatic space optimization of putting
> values that are common and will fit directly in the word.
> One module I spoke of implements general sets of integers with
> dynamically variable and sometimes large range.   This differs from
> the builtin SET OF..,  which must have a static (and probably  
> relatively
> small) base subrange.  Thus the general, heap-allocated values contain
> open arrays of Word.T,  treated as sets, or if you prefer, packed  
> arrays
> of booleans, although I manipulate them with bit-twiddling operations
> from Word.
> There is another, static-sized, heap-allocated object in front of  
> each array,
> containing biases on what bits correspond to what integers in the
> abstract set, etc.   It all works fine now, but the usage pattern of  
> some
> clients has a high percentage very small sets that would fit in a  
> word,
> and there would be an 11-to-1 space savings if that could be done.
> BTW, there are also two different kinds of heap objects, one that
> represents a range set by just its bounds.  So I am TYPECASEing
> these already.  It would be very convenient if I could just add  
> another
> alternative to the TYPECASE for in-word values.
>
> In another case, I need truly dynamically variable sized arrays of
> integers in [0..15], and the great majority are 7 elements or less,
> which would fit directly in the word, but I still the need full  
> generality
> to be available, so it's open arrays all the time, with three  
> additional
> words each.
>
> If you can pack a union of a pointer and an integer into a word and
> can separate them with runtime checks, then you can use the
> separated integer any way you want, with bit twiddling, type  
> conversions,
> LOOPHOLE, or whatever.   That is what I am trying to get, not just
> Smalltalk-like integers.
> Note that Smalltalk has zero static typing, so only one internal
> representation must do for the union of all possible values in
> the language.  In Modula-3, it would be very inconsistent with
> the language's philosophy to be this restricted.
>> Hmm, so your idea is to statically determine what type the references
>> can have if they are non-references.  So you are thinking to put  
>> various
>> kinds of subranges into the "TAGGED" types.  But you have to be  
>> able to
>> determine, statically, which subrange it is... am I understanding  
>> this
>> correctly?
>>
>
> From the language, all I want is to be able to dynamically determine
> whether it is a true pointer to a heap object or a value stored
> directly in the word, while preserving the safety principles and
> the semantics of everything already there.   So I want some new
> types, different from any existing types, that statically are known
> to hold this kind of valueset union and can be converted/assigned
> to a variable of existing type that is statically known to be either a
> pointer or an integer (but not both), with a suitable runtime check.
> It is also necessary to have a way to do this without risking a  
> runtime
> error, if your code doesn't know yet which kind of value it has.
> Various ADT modules can take it from there.
>>    Mika
>>
>> "Rodney M. Bates" writes:
>>
>>> Tony Hosking wrote:
>>>
>>>> On 8 Apr 2009, at 11:49, Rodney M. Bates wrote:
>>>>
>>>>
>>>>> Mika Nystrom wrote:
>>>>>
>>>>>> Hendrik, I think Tony's and my arguments that you can't break any
>>>>>> existing code by allowing the squirreling away of integers into
>>>>>> REFANYs are pretty solid.  Pre-existing code simply can't do  
>>>>>> anything
>>>>>> useful with unrevealed REFANYs.
>>>>>>
>>>>> This is only true of _unrevealed  opaque subtypes_ of REFANY,
>>>>> not of REFANY itself.   There is lots of existing code that uses  
>>>>> REFANY,
>>>>> and there, ISTYPE, NARROW, TYPECASE, and assigment can be and
>>>>> regularly are used on it.   It is essential not to alter the  
>>>>> semantics there.
>>>>>
>>>> Pre-existing code won't be able to do anything useful with tagged  
>>>> REFANYs:
>>>>
>>>> Suppose we have
>>>>
>>>> VAR r: REFANY = SmallInteger.FromInt(0);
>>>>
>>>> then
>>>>
>>>> ISTYPE(r, REFANY) => TRUE
>>>> ISTYPE(r, T) => FALSE for any T # REFANY
>>>>
>>>> Similarly, for TYPECASE, r will only trigger the REFANY branch.
>>>>
>>>> NARROW(r, REFANY) => r
>>>> NARROW(r, T) => run-time error for any T #REFANY
>>>>
>>>> VAR x: REFANY  => assignment succeeds
>>>> VAR x: T := r  => run-time error for any T # REFANY (because of  
>>>> implicit NARROW)
>>>>
>>> I think I am getting a bit lost in all the proposals, variations,  
>>> counterproposals, etc., but
>>>
>> >from this argument I am inferring that your plan is that only  
>> variables
>>> declared REFANY
>>> and not any proper subtype of REFANY can ever have a value with a  
>>> tag bit set?   Then
>>> the 4 narrowing operations, when and only when applied to an  
>>> expression of static
>>> type REFANY, would change to make a runtime check for a tag bit  
>>> and fail if it's set?
>>> It would take this to prevent a tagged value from getting into a  
>>> variable declared a
>>> proper subtype of REFANY, which can be dereferenced.
>>>
>>> This would preclude making your abstract data type an opaque  
>>> subtype of REFANY,
>>> and would mean all supposedly unrelated ADT modules that used the  
>>> tag technique
>>> could be broken by client code that mixed up the REFANY values of  
>>> one of them with
>>> those of another.  I consider this a definite breach of Modula-3's  
>>> otherwise bulletproof
>>> type safety.
>>>> It is impossible to dereference an expression statically typed as  
>>>> REFANY, so there is no need for a "tagged" check on dereference.
>>>> Because a tagged REFANY cannot be assigned to anything other than  
>>>> something typed REFANY, it can never propagate to a place where  
>>>> it can be dereferenced.
>>>>
>>>>
>>>>> Aside from actual semantic changes, I agree with Tony that we  
>>>>> should
>>>>> not burden any existing type with additional runtime work.  Even  
>>>>> though
>>>>> I expect small objects to support big performance gains in certain
>>>>> important cases, non-tagged references will still greatly  
>>>>> predominate
>>>>> in most code.   Create new type(s) for tagged references.
>>>>>
>>>> I'm not sure that we are seeing any semantic changes at all.  And  
>>>> with Mika's definition of SmallInteger.T as a "boxed" INTEGER  
>>>> object (actually it would be a subrange for values that fit into  
>>>> BITSIZE(Word.T)-1 bits), it is essentially transparent.  It just  
>>>> happens to be a run-time optimization that unboxes the INTEGER  
>>>> value.
>>>>
>>>>
>>>> I think I can implement the compiler and run-time support for  
>>>> this very quickly.
>>>>
>>>>
>>>>
>>
>>




More information about the M3devel mailing list