[M3devel] small objects

Jay jay.krell at cornell.edu
Thu Apr 9 05:59:02 CEST 2009


Um, what do folks think of, like:

 

struct

{
    void* Type;

    union

    {
       size_t Integer;

       void* Pointer;

     } Value;

} Variant;

 

?

You know -- something that is two pointers, one pointer for the type, one for the value or integer?

void* Type would actually be a pointer to something actually defined and elaborate.

 

Obviously this is twice as large, not as small as it could be, but much more general and portable. No need to determine how many of bits can be the tag.

 

And hope/assume for perf that such a small struct is passed in registers.

On x86/NT 4 and 8 byte structs I think are.

 

Type could also be an integer, index into some table, with some predefined values.

#define TYPE_INTEGER 0

#define TYPE_FLOAT 1

#define TYPE_DOUBLE 2

#define TYPE_ADDRESS 3

 

more generally the union would have a float, and maybe a double on 64bit platforms.

 

OR, on 64bit platforms, you probably can, with some porting work, dedicate a whole 8 bits or so to a type index, and still the thing in one "word". How many bits of address space does any 64bit platform these days or forseeable future actually implement?

 

But 32 bits doesn't seem big enough to afford that, and still this is a portability problem -- anything less than a full pointer.

 

 - Jay

 
> From: hosking at cs.purdue.edu
> To: rodney.m.bates at cox.net
> Date: Thu, 9 Apr 2009 13:01:13 +1000
> CC: m3devel at elegosoft.com
> Subject: Re: [M3devel] small objects
> 
> Sounds like you want something like:
> 
> TAGGED REFANY FOR T
> 
> where T must be a type that fits into BITS(REFANY)-1?
> 
> Branding could be used to prevent mixing otherwise structurally 
> equivalent TAGGED REFANY.
> 
> TAGGED BRANDED REFANY FOR T
> 
> Where this breaks down is what are the subtyping rules, since I assume 
> you'd like to store these in a REFANY and dynamically test for the 
> appropriate tagged type:
> 
> TAGGED REFANY FOR T <: REFANY
> 
> But then how do we distinguish all the different TAGGED REFANY from 
> each other at run-time?
> 
> On 9 Apr 2009, at 12:13, Rodney M. Bates wrote:
> 
> > Mika Nystrom wrote:
> >> Hmm, ok, there's one big difference between what you're saying and
> >> what Tony and I have been talking about. (I think your understanding
> >> sounds pretty correct.)
> >>
> >> You want to do objects other than small integers. Like what? And 
> >> why?
> >> I was thinking the trick would apply only to one specific type of 
> >> integer.
> >>
> >
> > Yes, I want a language mechanism that can be used by various
> > modules to implement various abstract data types, each of which
> > can perform the sometimes dramatic space optimization of putting
> > values that are common and will fit directly in the word.
> > One module I spoke of implements general sets of integers with
> > dynamically variable and sometimes large range. This differs from
> > the builtin SET OF.., which must have a static (and probably 
> > relatively
> > small) base subrange. Thus the general, heap-allocated values contain
> > open arrays of Word.T, treated as sets, or if you prefer, packed 
> > arrays
> > of booleans, although I manipulate them with bit-twiddling operations
> > from Word.
> > There is another, static-sized, heap-allocated object in front of 
> > each array,
> > containing biases on what bits correspond to what integers in the
> > abstract set, etc. It all works fine now, but the usage pattern of 
> > some
> > clients has a high percentage very small sets that would fit in a 
> > word,
> > and there would be an 11-to-1 space savings if that could be done.
> > BTW, there are also two different kinds of heap objects, one that
> > represents a range set by just its bounds. So I am TYPECASEing
> > these already. It would be very convenient if I could just add 
> > another
> > alternative to the TYPECASE for in-word values.
> >
> > In another case, I need truly dynamically variable sized arrays of
> > integers in [0..15], and the great majority are 7 elements or less,
> > which would fit directly in the word, but I still the need full 
> > generality
> > to be available, so it's open arrays all the time, with three 
> > additional
> > words each.
> >
> > If you can pack a union of a pointer and an integer into a word and
> > can separate them with runtime checks, then you can use the
> > separated integer any way you want, with bit twiddling, type 
> > conversions,
> > LOOPHOLE, or whatever. That is what I am trying to get, not just
> > Smalltalk-like integers.
> > Note that Smalltalk has zero static typing, so only one internal
> > representation must do for the union of all possible values in
> > the language. In Modula-3, it would be very inconsistent with
> > the language's philosophy to be this restricted.
> >> Hmm, so your idea is to statically determine what type the references
> >> can have if they are non-references. So you are thinking to put 
> >> various
> >> kinds of subranges into the "TAGGED" types. But you have to be 
> >> able to
> >> determine, statically, which subrange it is... am I understanding 
> >> this
> >> correctly?
> >>
> >
> > From the language, all I want is to be able to dynamically determine
> > whether it is a true pointer to a heap object or a value stored
> > directly in the word, while preserving the safety principles and
> > the semantics of everything already there. So I want some new
> > types, different from any existing types, that statically are known
> > to hold this kind of valueset union and can be converted/assigned
> > to a variable of existing type that is statically known to be either a
> > pointer or an integer (but not both), with a suitable runtime check.
> > It is also necessary to have a way to do this without risking a 
> > runtime
> > error, if your code doesn't know yet which kind of value it has.
> > Various ADT modules can take it from there.
> >> Mika
> >>
> >> "Rodney M. Bates" writes:
> >>
> >>> Tony Hosking wrote:
> >>>
> >>>> On 8 Apr 2009, at 11:49, Rodney M. Bates wrote:
> >>>>
> >>>>
> >>>>> Mika Nystrom wrote:
> >>>>>
> >>>>>> Hendrik, I think Tony's and my arguments that you can't break any
> >>>>>> existing code by allowing the squirreling away of integers into
> >>>>>> REFANYs are pretty solid. Pre-existing code simply can't do 
> >>>>>> anything
> >>>>>> useful with unrevealed REFANYs.
> >>>>>>
> >>>>> This is only true of _unrevealed opaque subtypes_ of REFANY,
> >>>>> not of REFANY itself. There is lots of existing code that uses 
> >>>>> REFANY,
> >>>>> and there, ISTYPE, NARROW, TYPECASE, and assigment can be and
> >>>>> regularly are used on it. It is essential not to alter the 
> >>>>> semantics there.
> >>>>>
> >>>> Pre-existing code won't be able to do anything useful with tagged 
> >>>> REFANYs:
> >>>>
> >>>> Suppose we have
> >>>>
> >>>> VAR r: REFANY = SmallInteger.FromInt(0);
> >>>>
> >>>> then
> >>>>
> >>>> ISTYPE(r, REFANY) => TRUE
> >>>> ISTYPE(r, T) => FALSE for any T # REFANY
> >>>>
> >>>> Similarly, for TYPECASE, r will only trigger the REFANY branch.
> >>>>
> >>>> NARROW(r, REFANY) => r
> >>>> NARROW(r, T) => run-time error for any T #REFANY
> >>>>
> >>>> VAR x: REFANY => assignment succeeds
> >>>> VAR x: T := r => run-time error for any T # REFANY (because of 
> >>>> implicit NARROW)
> >>>>
> >>> I think I am getting a bit lost in all the proposals, variations, 
> >>> counterproposals, etc., but
> >>>
> >> >from this argument I am inferring that your plan is that only 
> >> variables
> >>> declared REFANY
> >>> and not any proper subtype of REFANY can ever have a value with a 
> >>> tag bit set? Then
> >>> the 4 narrowing operations, when and only when applied to an 
> >>> expression of static
> >>> type REFANY, would change to make a runtime check for a tag bit 
> >>> and fail if it's set?
> >>> It would take this to prevent a tagged value from getting into a 
> >>> variable declared a
> >>> proper subtype of REFANY, which can be dereferenced.
> >>>
> >>> This would preclude making your abstract data type an opaque 
> >>> subtype of REFANY,
> >>> and would mean all supposedly unrelated ADT modules that used the 
> >>> tag technique
> >>> could be broken by client code that mixed up the REFANY values of 
> >>> one of them with
> >>> those of another. I consider this a definite breach of Modula-3's 
> >>> otherwise bulletproof
> >>> type safety.
> >>>> It is impossible to dereference an expression statically typed as 
> >>>> REFANY, so there is no need for a "tagged" check on dereference.
> >>>> Because a tagged REFANY cannot be assigned to anything other than 
> >>>> something typed REFANY, it can never propagate to a place where 
> >>>> it can be dereferenced.
> >>>>
> >>>>
> >>>>> Aside from actual semantic changes, I agree with Tony that we 
> >>>>> should
> >>>>> not burden any existing type with additional runtime work. Even 
> >>>>> though
> >>>>> I expect small objects to support big performance gains in certain
> >>>>> important cases, non-tagged references will still greatly 
> >>>>> predominate
> >>>>> in most code. Create new type(s) for tagged references.
> >>>>>
> >>>> I'm not sure that we are seeing any semantic changes at all. And 
> >>>> with Mika's definition of SmallInteger.T as a "boxed" INTEGER 
> >>>> object (actually it would be a subrange for values that fit into 
> >>>> BITSIZE(Word.T)-1 bits), it is essentially transparent. It just 
> >>>> happens to be a run-time optimization that unboxes the INTEGER 
> >>>> value.
> >>>>
> >>>>
> >>>> I think I can implement the compiler and run-time support for 
> >>>> this very quickly.
> >>>>
> >>>>
> >>>>
> >>
> >>
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://m3lists.elegosoft.com/pipermail/m3devel/attachments/20090409/4dceea84/attachment-0002.html>


More information about the M3devel mailing list