[M3devel] small objects

Tony Hosking hosking at cs.purdue.edu
Mon Mar 30 04:33:37 CEST 2009


Yes, I've just been thinking about this and trying to come up with a  
proposal, somewhat along the lines you describe.  My real concern is  
that you don't want any of the reference operations (including down- 
casts) to apply to the tagged value type.  But you do want the  
collector to be aware of it.  We'd need a new builtin type like TAGGED  
or something and operations for extracting the values.

A slightly wacky alternative might allow NULL to have values other  
than NIL.  Doing this would require smartening up the compiler and run- 
time to allow for non-NIL NULL values.  This would let a REFANY hold  
any number of different NULL values.  Unfortunately, the  
representation of NIL as 0 and other NULL values using a tag bit  
doesn't make for an easy catch-all test for NULL.

On 30 Mar 2009, at 12:48, Mika Nystrom wrote:

> Hmmmmm... in my UNSAFE INTERFACE I declare my special type as
> REFANY (see the URL I sent a few messages ago).
>
> Since you can't deref REFANY, I'm thinking that... it's mainly a
> problem with TYPECASE, TYPECODE, and ISTYPE...?  (In safe code,
> that is.)
>
> Also the GC would have to know that refs that are # 0 (mod 4) aren't
> real references and not try to follow those.  As long as they are
> on the stack or in registers there's not much you can do...
>
> I don't think this is all that difficult.  My example code has a very
> simple API of the kind you describe.  See SmallInteger.[im]3...
>
>     Mika
>
> Tony Hosking writes:
>> Sorry, yes, I am not awake yet this morning.  Need more coffeee. Of
>> course this occurs even for all untagged values.
>>
>> The main problem is that it would be dangerous generally to allow
>> reference fields to contain tagged values, since then even safe code
>> could try to dereference what would amount to actually being a tagged
>> value non-reference.  What we really need is a new type "tagged
>> reference" distinct from normal references with associated API to
>> extract the reference/value it holds.  The compiler would need to
>> generate heap maps that include these for processing by the  
>> collector,
>> just as it does for ordinary references.
>>
>> On 30 Mar 2009, at 10:49, Mika Nystrom wrote:
>>
>>> Tony,
>>>
>>> Doesn't this already happen with INTEGER, REAL, LONGREAL, etc.,
>>> objects?
>>>
>>>  Mika
>>>
>>> Tony Hosking writes:
>>>> If we could accurately type values in the stack/registers at run  
>>>> time
>>>> then this would not be a problem.  Unfortunately, the compiler does
>>>> not do this, so it is possible for a derived pointer (reference +
>>>> offset) to be formed in stack/registers that the garbage collector
>>>> won't be able to distinguish between one of your tagged values and
>>>> some derived pointer into the middle of an object.  If we could
>>>> assume
>>>> that the heap never allocates from some known set of addresses then
>>>> we
>>>> could safely distinguish the tagged values.
>>>>
>>>> On 30 Mar 2009, at 06:10, hendrik at topoi.pooq.com wrote:
>>>>
>>>>> There are many times I want to express data which could be
>>>>> efficiently
>>>>> coded as the disjoing union of (small) integer and pointer to
>>>>> object.
>>>>> The pointer-to-object is used in the case where tho objects are  
>>>>> big;
>>>>> the (small) integer for the more common case where the objects are
>>>>> small.
>>>>>
>>>>> High-level languages seem to pe quite paranoid about admitting  
>>>>> thise
>>>>> kind of data into the fold, except maybe for Lisp systems, which
>>>>> have
>>>>> been doing this from time immemorial.  (I believe CAML does this,
>>>>> too).
>>>>> These languages use it internally, and manage to (mostly) hide it
>>>>> from
>>>>> the user.
>>>>>
>>>>> The X toolkit uses this trick too -- there's a constant somewhere,
>>>>> and
>>>>> if an integer is less than this constant, it's passed to an X
>>>>> toolkit
>>>>> function as an integer; otherwise by reference.  The idea there is
>>>>> that
>>>>> there's a range of addresses of storage that can never be used as
>>>>> parameters for the X toolkit functions (presumably because of
>>>>> hardware
>>>>> or OS limitations), and that the bit patterns that are unavailable
>>>>> for
>>>>> addresses can be used as small integers.
>>>>>
>>>>> Now the semantics of such a union, efficiently coded, are quite
>>>>> clear.
>>>>> There's a range of numbers that can be packed unamiguously into
>>>>> pointers, and if your integer can be so packed, you do it;
>>>>> otherwise you use a reference to sime kind of INTEGER object
>>>>> elsewhere.  There are operations for packing integers and object
>>>>> pointers into such words, and others for unpacking them (complete
>>>>> with
>>>>> type-test).  The actual physical representation can be machine- or
>>>>> implemetation dependent -- you could do a bit of shifting and pack
>>>>> integers into words with the low bit set (if pointers to objects  
>>>>> are
>>>>> usually aligned in some way, the integers will stand out as being
>>>>> unalinged)  Or you could use an uppoer bound on "small" integers,
>>>>> as C
>>>>> does.  And on a machine where such packing is impossible (for
>>>>> whatever
>>>>> reason) you could simply set the upper bound of (the absolute alue
>>>>> of) such packable integers to be zero, so there wouldn't be any.
>>>>>
>>>>> Is there any way such a thing can be done in Modula 3?  Remember  
>>>>> --
>>>>> I do
>>>>> want the garbage collector to be aware of such conventions and do
>>>>> proper
>>>>> tracing on the pointers?
>>>>>
>>>>> (I suspect the answer is "no".  But would be a pity.)
>>>>>
>>>>> -- hendrik
>>





More information about the M3devel mailing list