[M3devel] small objects
Tony Hosking
hosking at cs.purdue.edu
Mon Mar 30 04:33:37 CEST 2009
Yes, I've just been thinking about this and trying to come up with a
proposal, somewhat along the lines you describe. My real concern is
that you don't want any of the reference operations (including down-
casts) to apply to the tagged value type. But you do want the
collector to be aware of it. We'd need a new builtin type like TAGGED
or something and operations for extracting the values.
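
To make the shape of such an API concrete, here is a minimal sketch in C rather than Modula-3, since it only illustrates the word-level representation. Everything in it is an assumption for illustration: the names tagged_t, tagged_from_small, tagged_to_small, tagged_from_ref and tagged_to_ref are hypothetical, and it uses the "low bit set means integer" encoding from hendrik's message rather than any representation the compiler or collector actually supports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tagged word: low bit 1 = small integer, low bit 0 = reference. */
typedef uintptr_t tagged_t;

/* One bit is spent on the tag, so the payload is one bit narrower than a word. */
#define SMALL_MAX (INTPTR_MAX / 2)
#define SMALL_MIN (-SMALL_MAX - 1)

static inline bool tagged_is_small(tagged_t t) {
    return (t & 1u) != 0;
}

/* Pack an integer: shift left by one and set the tag bit. */
static inline tagged_t tagged_from_small(intptr_t v) {
    assert(v >= SMALL_MIN && v <= SMALL_MAX);
    return ((uintptr_t)v << 1) | 1u;
}

/* Unpack an integer. Assumes the usual two's-complement conversion
   from uintptr_t to intptr_t; (s - 1) is even, so the division is exact. */
static inline intptr_t tagged_to_small(tagged_t t) {
    assert(tagged_is_small(t));
    intptr_t s = (intptr_t)t;
    return (s - 1) / 2;
}

/* Pack a reference: only valid if the pointer is at least 2-byte aligned. */
static inline tagged_t tagged_from_ref(void *p) {
    assert(((uintptr_t)p & 1u) == 0);
    return (uintptr_t)p;
}

/* Unpack a reference; the explicit check stands in for the type test. */
static inline void *tagged_to_ref(tagged_t t) {
    assert(!tagged_is_small(t));
    return (void *)t;
}
```

The point of keeping tagged_t distinct from an ordinary pointer type is that the only way to get at the contents is through the extraction operations, each of which checks the tag first.
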
A slightly wacky alternative might allow NULL to have values other
than NIL. Doing this would require smartening up the compiler and run-
time to allow for non-NIL NULL values. This would let a REFANY hold
any number of different NULL values. Unfortunately, the
representation of NIL as 0 and other NULL values using a tag bit
doesn't make for an easy catch-all test for NULL.
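
A rough sketch of why the test is awkward, again in C and purely illustrative: the names NIL_WORD, TAG_BIT and is_null_like are hypothetical, and the low-bit tag is an assumption. With NIL as 0 and the other NULL values marked by a tag bit, the catch-all test needs two separate checks instead of a single comparison or mask.

```c
#include <stdbool.h>
#include <stdint.h>

#define NIL_WORD ((uintptr_t)0) /* NIL keeps its usual all-zero representation */
#define TAG_BIT  ((uintptr_t)1) /* assumed: low bit marks a non-NIL NULL value */

/* Catch-all test for "this word is some NULL value": it is either NIL
   itself or it carries the tag bit. Two checks, not one. */
static inline bool is_null_like(uintptr_t w) {
    return w == NIL_WORD || (w & TAG_BIT) != 0;
}

/* Only words that pass this test may be treated as real references. */
static inline bool is_real_ref(uintptr_t w) {
    return !is_null_like(w);
}
```

Every place that currently compares a reference against NIL would have to use the wider test instead.
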
On 30 Mar 2009, at 12:48, Mika Nystrom wrote:
> Hmmmmm... in my UNSAFE INTERFACE I declare my special type as
> REFANY (see the URL I sent a few messages ago).
>
> Since you can't deref REFANY, I'm thinking that... it's mainly a
> problem with TYPECASE, TYPECODE, and ISTYPE...? (In safe code,
> that is.)
>
> Also the GC would have to know that refs that are # 0 (mod 4) aren't
> real references and not try to follow those. As long as they are
> on the stack or in registers there's not much you can do...
>
> I don't think this is all that difficult. My example code has a very
> simple API of the kind you describe. See SmallInteger.[im]3...
>
> Mika
>
> Tony Hosking writes:
>> Sorry, yes, I am not awake yet this morning. Need more coffee. Of
>> course this occurs even for all untagged values.
>>
>> The main problem is that it would be dangerous generally to allow
>> reference fields to contain tagged values, since then even safe code
>> could try to dereference what is actually a tagged non-reference
>> value. What we really need is a new type "tagged reference",
>> distinct from normal references, with an associated API to extract
>> the reference/value it holds. The compiler would need to generate
>> heap maps that include these for processing by the collector, just
>> as it does for ordinary references.
>>
>> On 30 Mar 2009, at 10:49, Mika Nystrom wrote:
>>
>>> Tony,
>>>
>>> Doesn't this already happen with INTEGER, REAL, LONGREAL, etc.,
>>> objects?
>>>
>>> Mika
>>>
>>> Tony Hosking writes:
>>>> If we could accurately type values in the stack/registers at run
>>>> time then this would not be a problem. Unfortunately, the compiler
>>>> does not do this, so a derived pointer (reference + offset) can be
>>>> formed in the stack/registers, and the garbage collector won't be
>>>> able to distinguish between one of your tagged values and such a
>>>> derived pointer into the middle of an object. If we could assume
>>>> that the heap never allocates from some known set of addresses
>>>> then we could safely distinguish the tagged values.
>>>>
>>>> On 30 Mar 2009, at 06:10, hendrik at topoi.pooq.com wrote:
>>>>
>>>>> There are many times I want to express data which could be
>>>>> efficiently coded as the disjoint union of (small) integer and
>>>>> pointer to object. The pointer-to-object is used in the case
>>>>> where the objects are big; the (small) integer for the more
>>>>> common case where the objects are small.
>>>>>
>>>>> High-level languages seem to be quite paranoid about admitting
>>>>> this kind of data into the fold, except maybe for Lisp systems,
>>>>> which have been doing this from time immemorial. (I believe CAML
>>>>> does this, too). These languages use it internally, and manage to
>>>>> (mostly) hide it from the user.
>>>>>
>>>>> The X toolkit uses this trick too -- there's a constant somewhere,
>>>>> and if an integer is less than this constant, it's passed to an X
>>>>> toolkit function as an integer; otherwise by reference. The idea
>>>>> there is that there's a range of addresses of storage that can
>>>>> never be used as parameters for the X toolkit functions
>>>>> (presumably because of hardware or OS limitations), and that the
>>>>> bit patterns that are unavailable for addresses can be used as
>>>>> small integers.
>>>>>
>>>>> Now the semantics of such a union, efficiently coded, are quite
>>>>> clear. There's a range of numbers that can be packed unambiguously
>>>>> into pointers, and if your integer can be so packed, you do it;
>>>>> otherwise you use a reference to some kind of INTEGER object
>>>>> elsewhere. There are operations for packing integers and object
>>>>> pointers into such words, and others for unpacking them (complete
>>>>> with type-test). The actual physical representation can be
>>>>> machine- or implementation-dependent -- you could do a bit of
>>>>> shifting and pack integers into words with the low bit set (if
>>>>> pointers to objects are usually aligned in some way, the integers
>>>>> will stand out as being unaligned). Or you could use an upper
>>>>> bound on "small" integers, as C does. And on a machine where such
>>>>> packing is impossible (for whatever reason) you could simply set
>>>>> the upper bound of (the absolute value of) such packable integers
>>>>> to be zero, so there wouldn't be any.
>>>>>
>>>>> Is there any way such a thing can be done in Modula-3? Remember
>>>>> -- I do want the garbage collector to be aware of such conventions
>>>>> and do proper tracing on the pointers.
>>>>>
>>>>> (I suspect the answer is "no". But that would be a pity.)
>>>>>
>>>>> -- hendrik
>>