[M3devel] small objects

Mika Nystrom mika at async.caltech.edu
Mon Mar 30 05:33:11 CEST 2009


Ah you are thinking of doing it properly.

I was thinking.. "hack".  Make TYPECASE, TYPECODE, ISTYPE, and NARROW
abort if a REFANY holds a TaggedThing.T

Make TaggedThing.IsT test if it is a TaggedThing.T

TaggedThing.ToInt TaggedThing.T -> [-Min .. Max]

TaggedThing.FromInt [-Min .. Max] -> TaggedThing.T

REVEAL only TaggedThing.T <: REFANY

I think that accomplishes everything.  No?  In a non-UNSAFE
environment, you can't dereference it or (successfully) cast it to
anything unless you have a revelation of more than REFANY, anyhow.

If I understand your idea correctly, I think it really only involves
adding an ISTYPE(x,TAGGED) case.  There's no TYPECODE and there might
be some extra complication in adding it to TYPECASE since it's not
a REF...

Since you might as well handle ISTYPE(x,TAGGED) with TaggedThing.IsT,
do you really need anything more than I propose?  (I.e., no changes
to the language, just a new library interface.)

Actually maybe I do mean for it to be = REFANY, not <:.  Then you
can't NARROW it to anything other than REFANY anyhow.

ISTYPE needn't actually abort, it could just return FALSE for
all cases except REFANY.

Leaves only Pickles, ... or am I missing something?

     Mika

Tony Hosking writes:
>Yes, I've just been thinking about this and trying to come up with a  
>proposal, somewhat along the lines you describe.  My real concern is  
>that you don't want any of the reference operations (including down- 
>casts) to apply to the tagged value type.  But you do want the  
>collector to be aware of it.  We'd need a new builtin type like TAGGED  
>or something and operations for extracting the values.
>
>A slightly wacky alternative might allow NULL to have values other  
>than NIL.  Doing this would require smartening up the compiler and run- 
>time to allow for non-NIL NULL values.  This would let a REFANY hold  
>any number of different NULL values.  Unfortunately, the  
>representation of NIL as 0 and other NULL values using a tag bit  
>doesn't make for an easy catch-all test for NULL.
>
>On 30 Mar 2009, at 12:48, Mika Nystrom wrote:
>
>> Hmmmmm... in my UNSAFE INTERFACE I declare my special type as
>> REFANY (see the URL I sent a few messages ago).
>>
>> Since you can't deref REFANY, I'm thinking that... it's mainly a
>> problem with TYPECASE, TYPECODE, and ISTYPE...?  (In safe code,
>> that is.)
>>
>> Also the GC would have to know that refs that are # 0 (mod 4) aren't
>> real references and not try to follow those.  As long as they are
>> on the stack or in registers there's not much you can do...
>>
>> I don't think this is all that difficult.  My example code has a very
>> simple API of the kind you describe.  See SmallInteger.[im]3...
>>
>>     Mika
>>
>> Tony Hosking writes:
>>> Sorry, yes, I am not awake yet this morning.  Need more coffeee. Of
>>> course this occurs even for all untagged values.
>>>
>>> The main problem is that it would be dangerous generally to allow
>>> reference fields to contain tagged values, since then even safe code
>>> could try to dereference what would amount to actually being a tagged
>>> value non-reference.  What we really need is a new type "tagged
>>> reference" distinct from normal references with associated API to
>>> extract the reference/value it holds.  The compiler would need to
>>> generate heap maps that include these for processing by the  
>>> collector,
>>> just as it does for ordinary references.
>>>
>>> On 30 Mar 2009, at 10:49, Mika Nystrom wrote:
>>>
>>>> Tony,
>>>>
>>>> Doesn't this already happen with INTEGER, REAL, LONGREAL, etc.,
>>>> objects?
>>>>
>>>>  Mika
>>>>
>>>> Tony Hosking writes:
>>>>> If we could accurately type values in the stack/registers at run  
>>>>> time
>>>>> then this would not be a problem.  Unfortunately, the compiler does
>>>>> not do this, so it is possible for a derived pointer (reference +
>>>>> offset) to be formed in stack/registers that the garbage collector
>>>>> won't be able to distinguish between one of your tagged values and
>>>>> some derived pointer into the middle of an object.  If we could
>>>>> assume
>>>>> that the heap never allocates from some known set of addresses then
>>>>> we
>>>>> could safely distinguish the tagged values.
>>>>>
>>>>> On 30 Mar 2009, at 06:10, hendrik at topoi.pooq.com wrote:
>>>>>
>>>>>> There are many times I want to express data which could be
>>>>>> efficiently
>>>>>> coded as the disjoing union of (small) integer and pointer to
>>>>>> object.
>>>>>> The pointer-to-object is used in the case where tho objects are  
>>>>>> big;
>>>>>> the (small) integer for the more common case where the objects are
>>>>>> small.
>>>>>>
>>>>>> High-level languages seem to pe quite paranoid about admitting  
>>>>>> thise
>>>>>> kind of data into the fold, except maybe for Lisp systems, which
>>>>>> have
>>>>>> been doing this from time immemorial.  (I believe CAML does this,
>>>>>> too).
>>>>>> These languages use it internally, and manage to (mostly) hide it
>>>>>> from
>>>>>> the user.
>>>>>>
>>>>>> The X toolkit uses this trick too -- there's a constant somewhere,
>>>>>> and
>>>>>> if an integer is less than this constant, it's passed to an X
>>>>>> toolkit
>>>>>> function as an integer; otherwise by reference.  The idea there is
>>>>>> that
>>>>>> there's a range of addresses of storage that can never be used as
>>>>>> parameters for the X toolkit functions (presumably because of
>>>>>> hardware
>>>>>> or OS limitations), and that the bit patterns that are unavailable
>>>>>> for
>>>>>> addresses can be used as small integers.
>>>>>>
>>>>>> Now the semantics of such a union, efficiently coded, are quite
>>>>>> clear.
>>>>>> There's a range of numbers that can be packed unamiguously into
>>>>>> pointers, and if your integer can be so packed, you do it;
>>>>>> otherwise you use a reference to sime kind of INTEGER object
>>>>>> elsewhere.  There are operations for packing integers and object
>>>>>> pointers into such words, and others for unpacking them (complete
>>>>>> with
>>>>>> type-test).  The actual physical representation can be machine- or
>>>>>> implemetation dependent -- you could do a bit of shifting and pack
>>>>>> integers into words with the low bit set (if pointers to objects  
>>>>>> are
>>>>>> usually aligned in some way, the integers will stand out as being
>>>>>> unalinged)  Or you could use an uppoer bound on "small" integers,
>>>>>> as C
>>>>>> does.  And on a machine where such packing is impossible (for
>>>>>> whatever
>>>>>> reason) you could simply set the upper bound of (the absolute alue
>>>>>> of) such packable integers to be zero, so there wouldn't be any.
>>>>>>
>>>>>> Is there any way such a thing can be done in Modula 3?  Remember  
>>>>>> --
>>>>>> I do
>>>>>> want the garbage collector to be aware of such conventions and do
>>>>>> proper
>>>>>> tracing on the pointers?
>>>>>>
>>>>>> (I suspect the answer is "no".  But would be a pity.)
>>>>>>
>>>>>> -- hendrik
>>>
>



More information about the M3devel mailing list