[M3devel] small objects

Mika Nystrom mika at async.caltech.edu
Tue Apr 7 07:28:32 CEST 2009


Ok, I see.  You've done this before!  Then I think you are qualified
to say that it's doable and straightforward.  It certainly seems all
right from what you say.  It sounds like it's implementation-wise not
very different from the library idea I was proposing, but much more
elegant.

But but... this NULL business.  You don't think that we who want to
use the TAGGED INTEGER deserve to pay an extra instruction for the
convenience?  Is there actually an instruction checking for NULL?
You don't just let the system segfault the process?  Is this because
you might have an indexed pointer that might be persuaded to point into
mapped memory even if 0 is an illegal address?  (Could you insert the
NULL check only for pointers to types that are larger than a certain
size, say one page?)

It seems a little inconvenient to me not to have a pointer that's
NIL.  What you suggest doesn't really work super-elegantly...

I'm thinking: in my Scheme interpreter, I represent Scheme types as
Modula-3 REFs (following Norvig's JScheme almost exactly).  The 
empty list is just Modula-3's NIL.

But what you're saying is that some integer should be selected as
"special" to denote the empty list, the non-object, or whatever you
want to call it.  What if I want to represent that integer?

In any case, the all-zeros bit pattern would not be a legal TAGGED
type, would it, under your proposal?  

Or am I completely missing something here?

    Mika

Tony Hosking writes:
>On 7 Apr 2009, at 10:42, Mika Nystrom wrote:
>
>> You mean this proposal (you had two, I think)?
>
>Yes, this is the proposal I converged on.  It really is only minimal  
>change to the type system: introduction of a third hierarchy of   
>TAGGED reference types.  I did something similar for orthogonal  
>persistence a few years ago where it was useful to have a TRANSIENT  
>reference hierarchy for things that should not persist.  [The  
>semantics was that every run of a program would initialize TRANSIENT  
>references to NIL rather than have them have the persistent value.   
>This was slightly messier in that I also permitted TRANSIENT REFANY <:  
>REFANY, which was a little odd.]
>
>> --
>>
>> NULL <: REF T <: REFANY
>> NULL <: UNTRACED REF T <: ADDRESS
>> TAGGED INTEGER <: TAGGED REF T <: TAGGED REFANY
>>
>> Note that NULL is not a subtype of TAGGED REF T or TAGGED REFANY.
>>
>> ROOT <: REFANY
>> TAGGED ROOT <: TAGGED REFANY
>> NULL <: T OBJECT END <: T where T <: REFANY or T <: ADDRESS
>> TAGGED INTEGER <: T OBJECT ... END <: T where T <: TAGGED REFANY
>>
>> Note that NULL is not a subtype of T OBJECT ... END where T <: TAGGED
>> REFANY.
>>
>> ---
>>
>> I like it fine, I think... I'm just worried, generally, about
>> changing the type system.  Hmmmm... you mean that the TAGGED
>> things would just be a separate hierarchy?
>
>The change is relatively minor in that it doesn't touch the original  
>REFANY hierarchy.
>
>> So you'd just add TAGGED in front of the REFs for this new hierarchy.
>> But why no NULL?
>
>No NULL because then we need a test for both NIL and values of type  
>TAGGED INTEGER on every dereference.  You can easily declare:
>
>TYPE Null = TAGGED INTEGER;
>CONST Nil = VAL(0, TAGGED INTEGER);
>
>The point is that TAGGED INTEGER now lets you have a range of non- 
>pointer reference values, whereas NIL is the singleton non-reference  
>value in type NULL.
>
>> And you're comfortable with doing the conversions automatically?
>
>No, I said that the conversions needed to be explicit
>
>>
>> So
>>
>>  tx : TAGGED INTEGER := 5;
>
>tx: TAGGED INTEGER := VAL(5,TAGGED INTEGER);
>
>>  x  : INTEGER := tx;
>
>x: INTEGER := ORD(tx);
>
>> would be OK?
>>
>> In that case it's just the m3tk that needs to be modified, and
>> that's just busy work, I think.  If it works it's beautiful...
>
>As I said, I am almost done with the compiler changes, and only need  
>to push it into the run-time.  Things like m3tk and pickles will need  
>to be smartened up as necessary.
>
>>
>>
>>     Mika
>>
>>
>>
>> Tony Hosking writes:
>>> I really don't like your proposal for the reasons you mention.  It
>>> makes regular REF more expensive than it currently is.  What is it
>>> about my relatively minor changes to the type system that you object
>>> to?  I've just about finished changing the compiler front-end (in
>>> relatively minor ways) to accomodate the proposal I made yesterday.
>>> And it has the advantage of separating REF from TAGGED REF so we keep
>>> the standard REF clean.
>>>
>>> On 7 Apr 2009, at 00:50, Mika Nystrom wrote:
>>>
>>>> Hi Rodney,
>>>>
>>>> I would like this capability too, but I am a bit wary of Tony's
>>>> idea of changing the Modula-3 language---even in a "minor" way.
>>>> I've been working for the last week or so on an application using
>>>> the Modula-3 Toolkit, and I must say I have realized that the
>>> Modula-3 type system has a lot more subtleties than I thought.  I
>>>> would not want to make any real changes to it.  There's a paper
>>>> called "The Modula-3 Type System" by Cardelli, Donahue, Kalsow, and
>>>> Nelson that is worth studying in detail before making any changes
>>>> whatsoever, I think.  Also remember that changes to the type system
>>>> will affect m3tk and anything that depends on m3tk (which includes
>>>> two or three stub generators in the main tree and who knows how
>>>> many dependencies outside of it).
>>>>
>>>> I'm still not sure why we can't take the approach of Word.T .
>>>> Make a RefanyOrInt.T that can safely be:
>>>>
>>>> 1. Tested whether it is a small integer or a REFANY
>>>>
>>>> 2. If a REFANY, be dereferenced as normal, *or* via
>>>>  RefanyOrInt.ToRefany
>>>>
>>>> 3. If a small integer, can be extracted via RefanyOrInt.ToInt
>>>>
>>>> 4. Created either as a small integer or a REFANY
>>>>
>>>> Any other operations on it (including ISTYPE and TYPECASE, at least
>>>> when the object is a small integer) would result in a checked  
>>> runtime
>>>> error.
>>>>
>>>> Note that with the declaration "RefanyOrInt.T = REFANY", the current
>>>> compiler will as far as I know not accept any operations on
>>>> RefanyOrInt.T outside of ISTYPE, TYPECASE, and NARROW (explicit or
>>>> implicit).
>>>>
>>>> I wouldn't be surprised if most of what I'm proposing already works
>>>> (i.e., crashes with a checked runtime error as it should) with the
>>>> current runtime.  Anything that slips through would need to be fixed
>>>> up with a one-bit check of the LSB of the REFANY for the operations
>>>> mentioned above.  Unfortunately this has to be done for operations  
>>>> on
>>>> every REFANY, not just the new type.
>>>>
>>>> I think that Modula-3 programmers are already aware that using
>>>> REFANYs involves forcing the runtime to do extra type-checking work,
>>>> so they already know not to use them if they are looking for maximum
>>>> performance, so I don't think that burdening operations on REFANY
>>>> with an extra one-bit check is too onerous.
>>>>
>>>> An advantage of my proposal is that the amount of code in the new
>>>> proposed library is truly diminutive.  In fact, I think I posted
>>>> pretty much that code to the list a few weeks ago.
>>>>
>>>> (If you missed it, it's
>>>>
>>>> http://www.async.caltech.edu/~mika/interp/
>>>>
>>>> )
>>>>
>>>>    Mika
>>>>
>>>>
>>>> "Rodney M. Bates" writes:
>>>>> I spent quite a bit of time playing with a more general system  
>>>>> where
>>>>> there is a set of "tagged" types, with (an implementation-defined
>>>>> subrange of) INTEGER and a reference type both being a subtype of a
>>>>> tagged type.  It might have been tolerable, if it weren't for the
>>>>> interaction with object subtypes and opaque types, which quickly
>>>>> gets deep into a deep linguistic tar pit.
>>>>>
>>>>> Tony's approach is much simpler and would serve what I see as the
>>>>> need.  It does sacrifice any static checking of what reference type
>>>>> is being tagged, and will also require an extra runtime ISTYPE/
>>>>> TYPECASE.
>>>>>
>>>>> Would INTEGER and REFANY be assignable to TAGGED, or would there
>>>>> also need to be an explicit conversion in that direction too, say
>>>>> VAL(x, TAGGED)?  I think I favor the explicit conversion here.  It
>>>>> is a bit inconsistent with the usual Modula-3 assignability
>>>>> philosophy,
>>>>> but not requiring the explicit conversion already gets your toes
>>>>> into tar.
>>>>>
>>>>> We would have to have something more like ISTYPE, though, which  
>>>>> will
>>>>> tell which type the value belongs to without risking a runtime  
>>>>> error,
>>>>> which VAL would do.
>>>>>
>>>>> An intermediate approach might be to have a set of tagged types
>>>>> constructed by, say, TAGGED T, where I is a reference type, but
>>>>> with no subtype relations on them at all, thus still requiring
>>>>> the explicit conversions and checks.
>>>>>
>>>>> I do want to say, I _really_ want this capability somehow.  I have
>>>>> an ADT module whose internal data structure, for full generality,
>>>>> needs to have two heap objects (the second because it has nonstatic
>>>>> size) and 11 total words counting the orginal pointer, in the  
>>>>> case of
>>>>> values that would fit directly  into the "pointer" word.  Moreover,
>>>>> these small cases are in the great majority.
>>>>>
>>>>> Besides the 11-to-one increase in actual occupied space, there
>>>>> is extra time for allocation, periodic tracing, and collection,  
>>>>> space
>>>>> loss due to heap fragmentation, and time/space tradeoff loss due to
>>>>> reduced locality of reference.  So sometimes, it would be a big
>>>>> performance gain if we could do it.
>>>>
>>>>>
>>>>> Tony Hosking wrote:
>>>>> It is a much more pervasive hack than I would be comfortable with
>>>>>> since it touches on the compiler (for all the type operations) as
>>>>>> well
>>>>>> as the run-time system in ways that are pretty gross.  I would  
>>>>>> much
>>>>>> rather have a new TAGGED builtin.  ISTYPE would not work on it  
>>>>>> since
>>>>>> that only works on references.  The only thing you need is a way  
>>>>>> to
>>>>>> extract the value -- we could overload VAL(x, T) where x can be a
>>>>>> TAGGED and T can be REFANY or INTEGER.
>>>>>>
>>>>>>



More information about the M3devel mailing list