[M3devel] Using REFANYs to store integers?

Mika Nystrom mika at async.caltech.edu
Wed Dec 12 10:47:52 CET 2007


Tony Hosking writes:
>I didn't mean to disparage your suggestion -- just pointing out some  
>of the pitfalls.  Certainly, the current collectors will have  
>problems if you store tagged values in REFANY fields, since they
>expect to find aligned references there.  I can see a couple of
>options that might work, some of which might use the WeakRef  
>interface to control reachability.  I imagine we could come up with  
>some sort of API that would support this.  It might even be an  
>interesting result...
>

Oh I know that.  It's clear that even if one does use the hack I
have described, it would be criminal to release code that couldn't
survive without it.  As you point out there are many conceivable
M3 implementations that might not support such a nasty hack.

I think your idea of an API could be an interesting exercise---you
could "squirrel away" data in REFANYs that don't actually refer to
anything but are themselves values (ANYs?).  Of course there's not
much point (is there?) unless the squirreling and un-squirreling
is more efficient than an extra memory allocation and dereference.
I take it that the API could also tell the hacker when the hack is
unavailable, for any of the reasons you have mentioned...

     Mika 
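
For readers following the thread, here is a minimal sketch of the kind
of API discussed above, under stated assumptions.  It is written in C
rather than Modula-3 so that the bit manipulation stands on its own,
with no collector involved; `any_ref` stands in for REFANY and
`small_integer` for the boxed SmallInteger fallback.  Every name
(`squirrel_put`, `squirrel_get`, `squirrel_available`, and so on) is
hypothetical, and nothing here is an existing M3 or CM3 interface.  A
tag of 1 in the two low bits marks an immediate integer; values that do
not fit, or all values when tagging is switched off, fall back to a
heap box.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical switch: set to 0 to model a runtime whose collector
   cannot tolerate tagged words in reference fields. */
#ifndef SQUIRREL_TAGGING_AVAILABLE
#define SQUIRREL_TAGGING_AVAILABLE 1
#endif

typedef void *any_ref;                         /* stand-in for REFANY */
typedef struct { intptr_t v; } small_integer;  /* boxed fallback */

/* Integers that still fit after giving up two bits to the tag. */
#define SQUIRREL_MAX (INTPTR_MAX >> 2)
#define SQUIRREL_MIN (-SQUIRREL_MAX - 1)

/* Tells the caller whether the hack is in use at all. */
bool squirrel_available(void) { return SQUIRREL_TAGGING_AVAILABLE != 0; }

/* Non-zero low bits mean "not an aligned heap reference". */
bool squirrel_is_immediate(any_ref r) { return ((uintptr_t)r & 3u) != 0; }

/* Store an integer: tagged in the word itself when possible, boxed
   otherwise.  Returns NULL only if the fallback allocation fails. */
any_ref squirrel_put(intptr_t i) {
    if (squirrel_available() && i >= SQUIRREL_MIN && i <= SQUIRREL_MAX) {
        return (any_ref)(((uintptr_t)i << 2) | 1u);
    }
    small_integer *box = malloc(sizeof *box);
    if (box == NULL) return NULL;
    box->v = i;
    return box;
}

/* Recover the integer from either representation. */
intptr_t squirrel_get(any_ref r) {
    if (squirrel_is_immediate(r)) {
        assert(((uintptr_t)r & 3u) == 1u);    /* only tag 1 is defined here */
        /* Assumes the usual two's-complement integer/pointer conversion. */
        intptr_t w = (intptr_t)(uintptr_t)r;  /* w == 4*i + 1 */
        return (w - 1) / 4;                   /* exact, so sign is preserved */
    }
    assert(r != NULL);
    return ((const small_integer *)r)->v;
}

/* C has no collector, so boxes are freed by hand in this model. */
void squirrel_release(any_ref r) {
    if (!squirrel_is_immediate(r)) free(r);
}
```

Compiling with SQUIRREL_TAGGING_AVAILABLE set to 0 models an
implementation that rejects the hack: callers keep working, only more
slowly, which is the property released code needs.  Whether the tagged
path actually beats an allocation plus a dereference would have to be
measured.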

>On Dec 10, 2007, at 6:33 PM, Mika Nystrom wrote:
>
>>
>> Well, come to think of it... I'm not sure I can even do my own GC.
>> The idea is that the interpreter I am working on would have more
>> or less unrestricted access to the M3 environment, anyhow, and that
>> many of the objects on its stacks will be plain old M3 objects,
>> which will be handled as opaque handles by the interpreter.  In
>> other words, I'd have to have and worry about two garbage collectors
>> (M3 + interpreter) if I follow your suggestion.  I was really hoping
>> to avoid that.  I was also hoping to use your excellent garbage
>> collector rather than having to worry about that too, in addition
>> to all the other issues that come up in coding bytecode compilers
>> and interpreters, along with all the other runtime support, etc.
>>
>> Since I clearly have to support versions of M3 that don't appreciate
>> the LSB hack I am just going to have to be able to represent small
>> integers as something along the lines of
>>
>>    TYPE SmallInteger = OBJECT v : INTEGER END
>>
>> anyhow.  This is not a problem, it's just going to be slow, since
>> I anticipate that most of the work done by the interpreter will be
>> manipulating small integers (array indices, etc.)  Tony, don't
>> worry, I'm not going to give you (much of) a hard time if you come
>> up with a beautiful garbage collector that's incompatible with my
>> "nasty hack"!
>>
>> In any case, I am just thinking aloud (awrite?) and you gentlemen
>> have fully satisfied my curiosity with your answers to my questions.
>> Thanks!
>>
>>      Mika
>>
>> Tony Hosking writes:
>>> The LSB trick is something that one could teach the current
>>> implementation to handle, but I note that there is *nothing* in the
>>> language definition that would mean this hack could ever be
>>> portable.  Moreover, there is reason to expect that a particular GC
>>> implementation might use this same trick internally, in which case
>>> you have a conflict between your hack and that implementation.
>>> Basically, you should not assume *any* particular representation for
>>> REFANY in Modula-3.  For example, an implementation would be free to
>>> use a level of indirection (ie, "handles" or an object table) to
>>> implement M3 REFANY.  In sum, I think what you propose is a nasty
>>> hack.  Better to implement your own GC anyway, since it is likely to
>>> be much better at exploiting the semantics of your interpreter/language
>>> for best behavior.
>>>
>>> On Dec 10, 2007, at 3:21 PM, Mika Nystrom wrote:
>>>
>>>> Hi Rodney,
>>>>
>>>> You're reading my mind.  What I am thinking of doing is that I want
>>>> to optimize the handling of small integers (only), exactly the way
>>>> it's done in Smalltalk, that is, immutable, single-instance objects.
>>>> And only the interpreter will ever dereference the objects, so it
>>>> would check the LSBs---if they are set to zero, it will assume that
>>>> it's a heap object, if anything else, it's a special immutable value
>>>> of some sort.
>>>>
>>>> I'm well aware it's a very dirty trick to attempt!  But I was
>>>> thinking that by doing this sort of thing I could avoid having to
>>>> code my own garbage collector.  I'm not surprised that the current
>>>> garbage collector has issues with it but I also cannot think of a
>>>> reason why one couldn't make it ignore unaligned "pointers" unless
>>>> the trick is already used for some purpose I am not aware of.
>>>>
>>>>     Mika
>>>>
>>>> "Rodney M. Bates" writes:
>>>>> I presume "pointers" that are not 0 MOD 4 "refer to" some kinds
>>>>> of immutable, single-instance objects and don't actually point
>>>>> to heap objects in need of collection, as in Smalltalk?  I would
>>>>> think it would be a simple modification of the existing M3 GC to
>>>>> just make it ignore words containing misaligned pointers.
>>>>>
>>>>> Will only interpreted bytecode ever dereference these pointers?
>>>>> Or does M3 compiler-generated code also dereference them?  Then
>>>>> the compiler would also have to generate checks and do whatever
>>>>> with misaligned pointers.
>>>>>
>>>>> Mika Nystrom wrote:
>>>>>> Hello Modula-3ers...
>>>>>>
>>>>>> I have a question about garbage collecting, pointers, and dirty
>>>>>> tricks that I'm curious if anyone (Tony??) can answer.
>>>>>>
>>>>>> Here it is.  I am considering writing a bytecode interpreter that
>>>>>> is to run mixed into a Modula-3 environment, that is, it will be
>>>>>> able to deal with bytecodes as well as Modula-3 objects.  There is
>>>>>> no problem with that of course, as bytecodes are "code" and Modula-3
>>>>>> objects are "objects" and can live on separate stacks.  However,
>>>>>> as an optimization (not an uncommon one in these types of systems),
>>>>>> I'd like a space and time efficient representation for "small
>>>>>> integers".  A tried-and-true method (it goes back at least to the
>>>>>> Smalltalk-80 runtime) is to realize that pointers (in my case
>>>>>> REFANYs) always point to word-aligned addresses.  We can then use
>>>>>> integers that are congruent to 1, 2, and 3 (mod 4) to represent
>>>>>> other types of data.
>>>>>>
>>>>>> What will happen if I LOOPHOLE such integers back and forth to
>>>>>> REFANY?  Will the garbage collector just ignore them?  I wrote a
>>>>>> test program that does it and it doesn't crash... except when you
>>>>>> hit ctrl-C, it often dies with an assertion failure in
>>>>>> RTCollector.m3.  (This both with my ancient PM3 on FreeBSD and a
>>>>>> relatively recent update of CM3 on PPC_DARWIN.)  If the garbage
>>>>>> collector somehow disapproves of these integers, is there any
>>>>>> conceivable thing that would be broken by making necessary adjustments
>>>>>> to the garbage collector such that it would just ignore them?
>>>>>> Or is there a better way of solving my problem?
>>>>>>
>>>>>>       Best regards,
>>>>>>         Mika
>>>>>>
>>>>>
>>>>> -- 
>>>>> -------------------------------------------------------------
>>>>> Rodney M. Bates, retired assistant professor
>>>>> Dept. of Computer Science, Wichita State University
>>>>> Wichita, KS 67260-0083
>>>>> 316-978-3922
>>>>> rodney.bates at wichita.edu


