[M3devel] small objects

Tony Hosking hosking at cs.purdue.edu
Tue Apr 7 08:27:41 CEST 2009


No, you are right on the money.  I was just trying to avoid attempts  
to dereference tagged integers with a run-time check like a nil-check.

I am starting to come around to your library idea.  As you say,  
TaggedInteger.T would give a static error on dereference, so no need  
for a run-time check.  Similarly, because one cannot NARROW  
(implicitly or explicitly) a TaggedInteger.T it is impossible to  
assign to anything other than a REFANY which avoids polluting other  
reference fields and so eliminates my initial concern.   Hmmm....  ;-)

On 7 Apr 2009, at 15:28, Mika Nystrom wrote:

>
> Ok, I see.  You've done this before!  Then I think you are qualified
> to say that it's doable and straightforward.  It certainly seems all
> right from what you say.  It sounds like it's implementation-wise not
> very different from the library idea I was proposing, but much more
> elegant.
>
> But but... this NULL business.  You don't think that we who want to
> use the TAGGED INTEGER deserve to pay an extra instruction for the
> convenience?  Is there actually an instruction checking for NULL?
> You don't just let the system segfault the process?  Is this because
> you might have an indexed pointer that might be persuaded to point  
> into
> mapped memory even if 0 is an illegal address?  (Could you insert the
> NULL check only for pointers to types that are larger than a certain
> size, say one page?)
>
> It seems a little inconvenient to me not to have a pointer that's
> NIL.  What you suggest doesn't really work super-elegantly...
>
> I'm thinking: in my Scheme interpreter, I represent Scheme types as
> Modula-3 REFs (following Norvig's JScheme almost exactly).  The
> empty list is just Modula-3's NIL.
>
> But what you're saying is that some integer should be selected as
> "special" to denote the empty list, the non-object, or whatever you
> want to call it.  What if I want to represent that integer?
>
> In any case, the all-zeros bit pattern would not be a legal TAGGED
> type, would it, under your proposal?
>
> Or am I completely missing something here?
>
>    Mika
>
> Tony Hosking writes:
>> On 7 Apr 2009, at 10:42, Mika Nystrom wrote:
>>
>>> You mean this proposal (you had two, I think)?
>>
>> Yes, this is the proposal I converged on.  It really is only minimal
>> change to the type system: introduction of a third hierarchy of
>> TAGGED reference types.  I did something similar for orthogonal
>> persistence a few years ago where it was useful to have a TRANSIENT
>> reference hierarchy for things that should not persist.  [The
>> semantics was that every run of a program would initialize TRANSIENT
>> references to NIL rather than have them have the persistent value.
>> This was slightly messier in that I also permitted TRANSIENT REFANY  
>> <:
>> REFANY, which was a little odd.]
>>
>>> --
>>>
>>> NULL <: REF T <: REFANY
>>> NULL <: UNTRACED REF T <: ADDRESS
>>> TAGGED INTEGER <: TAGGED REF T <: TAGGED REFANY
>>>
>>> Note that NULL is not a subtype of TAGGED REF T or TAGGED REFANY.
>>>
>>> ROOT <: REFANY
>>> TAGGED ROOT <: TAGGED REFANY
>>> NULL <: T OBJECT END <: T where T <: REFANY or T <: ADDRESS
>>> TAGGED INTEGER <: T OBJECT ... END <: T where T <: TAGGED REFANY
>>>
>>> Note that NULL is not a subtype of T OBJECT ... END where T <:  
>>> TAGGED
>>> REFANY.
>>>
>>> ---
>>>
>>> I like it fine, I think... I'm just worried, generally, about
>>> changing the type system.  Hmmmm... you mean that the TAGGED
>>> things would just be a separate hierarchy?
>>
>> The change is relatively minor in that it doesn't touch the original
>> REFANY hierarchy.
>>
>>> So you'd just add TAGGED in front of the REFs for this new  
>>> hierarchy.
>>> But why no NULL?
>>
>> No NULL because then we need a test for both NIL and values of type
>> TAGGED INTEGER on every dereference.  You can easily declare:
>>
>> TYPE Null = TAGGED INTEGER;
>> CONST Nil = VAL(0, TAGGED INTEGER);
>>
>> The point is that TAGGED INTEGER now lets you have a range of non-
>> pointer reference values, whereas NIL is the singleton non-reference
>> value in type NULL.
>>
>>> And you're comfortable with doing the conversions automatically?
>>
>> No, I said that the conversions needed to be explicit
>>
>>>
>>> So
>>>
>>> tx : TAGGED INTEGER := 5;
>>
>> tx: TAGGED INTEGER := VAL(5,TAGGED INTEGER);
>>
>>> x  : INTEGER := tx;
>>
>> x: INTEGER := ORD(tx);
>>
>>> would be OK?
>>>
>>> In that case it's just the m3tk that needs to be modified, and
>>> that's just busy work, I think.  If it works it's beautiful...
>>
>> As I said, I am almost done with the compiler changes, and only need
>> to push it into the run-time.  Things like m3tk and pickles will need
>> to be smartened up as necessary.
>>
>>>
>>>
>>>    Mika
>>>
>>>
>>>
>>> Tony Hosking writes:
>>>> I really don't like your proposal for the reasons you mention.  It
>>>> makes regular REF more expensive than it currently is.  What is it
>>>> about my relatively minor changes to the type system that you  
>>>> object
>>>> to?  I've just about finished changing the compiler front-end (in
>>>> relatively minor ways) to accomodate the proposal I made yesterday.
>>>> And it has the advantage of separating REF from TAGGED REF so we  
>>>> keep
>>>> the standard REF clean.
>>>>
>>>> On 7 Apr 2009, at 00:50, Mika Nystrom wrote:
>>>>
>>>>> Hi Rodney,
>>>>>
>>>>> I would like this capability too, but I am a bit wary of Tony's
>>>>> idea of changing the Modula-3 language---even in a "minor" way.
>>>>> I've been working for the last week or so on an application using
>>>>> the Modula-3 Toolkit, and I must say I have realized that the
>>>> Modula-3 type system has a lot more subtleties than I thought.  I
>>>>> would not want to make any real changes to it.  There's a paper
>>>>> called "The Modula-3 Type System" by Cardelli, Donahue, Kalsow,  
>>>>> and
>>>>> Nelson that is worth studying in detail before making any changes
>>>>> whatsoever, I think.  Also remember that changes to the type  
>>>>> system
>>>>> will affect m3tk and anything that depends on m3tk (which includes
>>>>> two or three stub generators in the main tree and who knows how
>>>>> many dependencies outside of it).
>>>>>
>>>>> I'm still not sure why we can't take the approach of Word.T .
>>>>> Make a RefanyOrInt.T that can safely be:
>>>>>
>>>>> 1. Tested whether it is a small integer or a REFANY
>>>>>
>>>>> 2. If a REFANY, be dereferenced as normal, *or* via
>>>>> RefanyOrInt.ToRefany
>>>>>
>>>>> 3. If a small integer, can be extracted via RefanyOrInt.ToInt
>>>>>
>>>>> 4. Created either as a small integer or a REFANY
>>>>>
>>>>> Any other operations on it (including ISTYPE and TYPECASE, at  
>>>>> least
>>>>> when the object is a small integer) would result in a checked
>>>> runtime
>>>>> error.
>>>>>
>>>>> Note that with the declaration "RefanyOrInt.T = REFANY", the  
>>>>> current
>>>>> compiler will as far as I know not accept any operations on
>>>>> RefanyOrInt.T outside of ISTYPE, TYPECASE, and NARROW (explicit or
>>>>> implicit).
>>>>>
>>>>> I wouldn't be surprised if most of what I'm proposing already  
>>>>> works
>>>>> (i.e., crashes with a checked runtime error as it should) with the
>>>>> current runtime.  Anything that slips through would need to be  
>>>>> fixed
>>>>> up with a one-bit check of the LSB of the REFANY for the  
>>>>> operations
>>>>> mentioned above.  Unfortunately this has to be done for operations
>>>>> on
>>>>> every REFANY, not just the new type.
>>>>>
>>>>> I think that Modula-3 programmers are already aware that using
>>>>> REFANYs involves forcing the runtime to do extra type-checking  
>>>>> work,
>>>>> so they already know not to use them if they are looking for  
>>>>> maximum
>>>>> performance, so I don't think that burdening operations on REFANY
>>>>> with an extra one-bit check is too onerous.
>>>>>
>>>>> An advantage of my proposal is that the amount of code in the new
>>>>> proposed library is truly diminutive.  In fact, I think I posted
>>>>> pretty much that code to the list a few weeks ago.
>>>>>
>>>>> (If you missed it, it's
>>>>>
>>>>> http://www.async.caltech.edu/~mika/interp/
>>>>>
>>>>> )
>>>>>
>>>>>   Mika
>>>>>
>>>>>
>>>>> "Rodney M. Bates" writes:
>>>>>> I spent quite a bit of time playing with a more general system
>>>>>> where
>>>>>> there is a set of "tagged" types, with (an implementation-defined
>>>>>> subrange of) INTEGER and a reference type both being a subtype  
>>>>>> of a
>>>>>> tagged type.  It might have been tolerable, if it weren't for the
>>>>>> interaction with object subtypes and opaque types, which quickly
>>>>>> gets deep into a deep linguistic tar pit.
>>>>>>
>>>>>> Tony's approach is much simpler and would serve what I see as the
>>>>>> need.  It does sacrifice any static checking of what reference  
>>>>>> type
>>>>>> is being tagged, and will also require an extra runtime ISTYPE/
>>>>>> TYPECASE.
>>>>>>
>>>>>> Would INTEGER and REFANY be assignable to TAGGED, or would there
>>>>>> also need to be an explicit conversion in that direction too, say
>>>>>> VAL(x, TAGGED)?  I think I favor the explicit conversion here.   
>>>>>> It
>>>>>> is a bit inconsistent with the usual Modula-3 assignability
>>>>>> philosophy,
>>>>>> but not requiring the explicit conversion already gets your toes
>>>>>> into tar.
>>>>>>
>>>>>> We would have to have something more like ISTYPE, though, which
>>>>>> will
>>>>>> tell which type the value belongs to without risking a runtime
>>>>>> error,
>>>>>> which VAL would do.
>>>>>>
>>>>>> An intermediate approach might be to have a set of tagged types
>>>>>> constructed by, say, TAGGED T, where I is a reference type, but
>>>>>> with no subtype relations on them at all, thus still requiring
>>>>>> the explicit conversions and checks.
>>>>>>
>>>>>> I do want to say, I _really_ want this capability somehow.  I  
>>>>>> have
>>>>>> an ADT module whose internal data structure, for full generality,
>>>>>> needs to have two heap objects (the second because it has  
>>>>>> nonstatic
>>>>>> size) and 11 total words counting the orginal pointer, in the
>>>>>> case of
>>>>>> values that would fit directly  into the "pointer" word.   
>>>>>> Moreover,
>>>>>> these small cases are in the great majority.
>>>>>>
>>>>>> Besides the 11-to-one increase in actual occupied space, there
>>>>>> is extra time for allocation, periodic tracing, and collection,
>>>>>> space
>>>>>> loss due to heap fragmentation, and time/space tradeoff loss  
>>>>>> due to
>>>>>> reduced locality of reference.  So sometimes, it would be a big
>>>>>> performance gain if we could do it.
>>>>>
>>>>>>
>>>>>> Tony Hosking wrote:
>>>>>> It is a much more pervasive hack than I would be comfortable with
>>>>>>> since it touches on the compiler (for all the type operations)  
>>>>>>> as
>>>>>>> well
>>>>>>> as the run-time system in ways that are pretty gross.  I would
>>>>>>> much
>>>>>>> rather have a new TAGGED builtin.  ISTYPE would not work on it
>>>>>>> since
>>>>>>> that only works on references.  The only thing you need is a way
>>>>>>> to
>>>>>>> extract the value -- we could overload VAL(x, T) where x can  
>>>>>>> be a
>>>>>>> TAGGED and T can be REFANY or INTEGER.
>>>>>>>
>>>>>>>




More information about the M3devel mailing list