[M3devel] codegen error (from Mika, new test p250)

Tue Jan 11 21:21:23 CET 2011

Sigh.

OK, for now I will fix the compiler to track the widths of values on the stack.
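
As a rough illustration of what "tracking the widths of values on the stack" means, here is a minimal sketch of a toy stack-machine code generator. It is not the cm3 back end and not the actual fix: the class `CodeGen`, its methods, and the instruction names are all hypothetical, and only the two width names Int32 and Int64 come from this thread.

```python
# Hypothetical sketch: none of these names come from the cm3 compiler.
INT32, INT64 = "Int32", "Int64"


class CodeGen:
    def __init__(self):
        self.widths = []  # compile-time model of the run-time operand stack
        self.code = []    # instructions emitted so far

    def push(self, name, width):
        """Emit a load and record the width of the value it pushes."""
        self.code.append(f"load.{width} {name}")
        self.widths.append(width)

    def widen(self):
        """Emit an explicit Int32 -> Int64 conversion of the top of stack."""
        top = self.widths.pop()
        if top != INT32:
            raise TypeError(f"cannot widen {top}")
        self.code.append("widen.Int32.Int64")
        self.widths.append(INT64)

    def add(self):
        """Emit an add, refusing operands whose recorded widths differ."""
        right = self.widths.pop()
        left = self.widths.pop()
        if left != right:
            raise TypeError(f"width mismatch: {left} + {right}")
        self.code.append(f"add.{left}")
        self.widths.append(left)


gen = CodeGen()
gen.push("a", INT64)
gen.push("b", INT32)
gen.widen()  # without this call, add() would raise TypeError
gen.add()
```

The point of the sketch is only that the generator knows, at every step, whether the top of stack is 32 or 64 bits wide, so a mixed-width operation is caught at compile time rather than producing bad code.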

On Jan 11, 2011, at 12:53 PM, Rodney M. Bates wrote:

> 
> 
> Tony Hosking wrote:
>> I know what the problem is.  The fix is not particularly pretty, and will entail tracking the stack types for integers (Int32 or Int64) throughout code generation.
>> This all leads me to wonder why we don't simply back LONGINT out of the language.
>> [I had mentioned my increasing unease with LONGINT in a prior e-mail a long time ago.]
>> We can replace LONGINT with Longint and Longword:
>> Longint.T = Longword.T = BITS 64 FOR ARRAY [0..1] OF [16_00000000..16_FFFFFFFF]
> 
> Hmm, this is tricky.  I think the BITS 64 FOR is not what we would want.  First, it has
> no effect at all except when a Longint.T is a field of a record or object or an element
> of an array.  Second, even in those cases, it would force the compiler _not_ to put any
> alignment padding ahead of the Longint.T field.  The compiler could only choose between
> letting it be misaligned and generating code that would work on it that way, or, more
> likely, refusing to compile it.  It does not force 64-bit alignment.  This is all by
> existing rules of the language.
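
The "no alignment padding ahead of the field" point can be seen by analogy in Python's `struct` module, which offers both a natively aligned layout and a padding-free one. This is only an analogy, assuming that a packed layout behaves like a BITS n FOR field; it says nothing about what any Modula-3 compiler actually does.

```python
import struct

# "b" is a 1-byte field; "q" is an 8-byte integer field that follows it.
natural = struct.calcsize("@bq")  # native alignment: padding inserted before q
packed = struct.calcsize("=bq")   # no alignment: q starts at byte offset 1

padding = natural - packed        # bytes of padding the aligned layout added
```

In the packed layout the 8-byte field sits at offset 1, so it is misaligned, which is the situation described above.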
> 
> Another thought would be:
> 
>   ARRAY [0..1] OF BITS 32 FOR [16_00000000..16_FFFFFFFF]
> 
> This would lead to alignment within the array being as wanted, on both 32- and 64-bit
> machines.  But as for the alignment of the entire array,  it would not force anything.
> The alignment of an array type is naturally the alignment of its element type, but a
> Modula-3 BITS type has no alignment restriction at all, otherwise it could not be used
> as intended to allow programmer-controlled memory layout.
> 
> There is another problem here.  It stems from the fact that
> 
> <flames>
> So-called "little endian" is an inconsistent system.  It's only partly little-endian.
> To be consistently little-endian, it would have to read/write i/o streams into/from
> decreasing memory addresses and fetch instruction streams from decreasing addresses.
> It would then naturally result in the successively declared fields of records and
> elements of arrays (of increasing subscripts) being stored in decreasing addresses.
> </flames>
> 
> So as it is, for either of these array types, we have:
> 
> MSB                                LSB
> 0    1    2    3    4    5    6    7      <- big endian byte numbers in memory
> 7    6    5    4    3    2    1    0      <- hypothetical true little-endian
> 3    2    1    0    7    6    5    4      <- actual "little-endian"
> 
> Actual little-endian numbers right-to-left only within each 32-bit piece, but
> left-to-right for the elements of the array.  If this were a single scalar,
> instead of an array, the actual little-endian byte numbering would be the
> same as the middle line above.
> 
> This means the array type, on a little-endian machine, could not be passed to a
> normally-represented scalar formal parameter in any language.  We could have a
> convention that the 32-bit words in the array were least significant in element
> zero, but that would just move the problem over to the big-endian machines.
> We could just require explicit conversion functions to be coded, but what
> type would they convert to?
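
The byte-numbering table and the element-order convention above can be checked with a short sketch. It uses Python's `struct` module with explicit byte orders, so it does not depend on the host machine; the sample value and variable names are chosen only for illustration.

```python
import struct

value = 0x0011223344556677            # MSB is 0x00, LSB is 0x77
hi, lo = value >> 32, value & 0xFFFFFFFF
msb_first = value.to_bytes(8, "big")  # bytes in order of significance

# Little-endian machine: a 64-bit scalar versus a two-element 32-bit array.
le_scalar = struct.pack("<Q", value)
le_array_hi_first = struct.pack("<II", hi, lo)  # element 0 = most significant
le_array_lo_first = struct.pack("<II", lo, hi)  # element 0 = least significant

# Memory address of each byte, listed from MSB to LSB as in the table.
scalar_row = [le_scalar.index(b) for b in msb_first]
array_row = [le_array_hi_first.index(b) for b in msb_first]

# Big-endian machine: the same two array conventions.
be_scalar = struct.pack(">Q", value)
be_array_hi_first = struct.pack(">II", hi, lo)
be_array_lo_first = struct.pack(">II", lo, hi)
```

Here `scalar_row` is 7 6 5 4 3 2 1 0 and `array_row` is 3 2 1 0 7 6 5 4, matching the last two rows of the table. Putting the least significant word in element zero makes the array match the scalar on the little-endian side, but then it no longer matches on the big-endian side.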
> 
> Note that we have to keep the semantics of BITS n FOR and [lb .. ub] consistent
> with the existing language, because these type constructors will be used for other
> purposes than just constructing Longint.T and Longword.T and surely are in lots of
> preexisting code.
> 
> I don't see any way to both preserve language semantics and construct a longer
> integer type with decent properties using only preexisting types.  I think we
> would have to say something like:
> 
> "The types Longint.T and Longword.T are _just like_ <some array type>"
> (but not equal to <some array type>, so they could have unique rules).
> 
> This would parallel the existing definition of CARDINAL as
> 
> "just like [0 .. LAST(INTEGER)]"
> (but it's nevertheless a distinct type, so pickles can do size adjustments
> on it, the way they do with INTEGER.)
> 
> But once we resort to that and defining operators on it, I doubt it could be
> any cleaner or simpler than LONGINT.  And I doubt it would simplify the subject
> compilation problem either.
> 
> 
>> and define signed operations in Longint and unsigned operations in Longword.
>> These can be implemented efficiently as wrappers to appropriate C routines operating on "long long" or inlined if performance is a particular concern.  We can provide conversion routines to/from INTEGER as needed.
>> Other than handling 64-bit file offsets, etc., does anyone really make use of LONGINT that argues convincingly for it to be retained?
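
To make the proposed split between signed operations in Longint and unsigned operations in Longword concrete, here is a minimal sketch of how the same 64-bit pattern behaves under each. The function names are hypothetical, and wrapping on overflow is an assumption made for the sketch; the thread does not say what overflow behaviour the proposed interfaces would have.

```python
MASK = (1 << 64) - 1  # all 64 bits set


def to_signed(bits):
    """Reinterpret a 64-bit pattern as a two's-complement signed value."""
    bits &= MASK
    return bits - (1 << 64) if bits >> 63 else bits


def longint_add(a, b):
    """Signed 64-bit add; wraps on overflow (an assumption of this sketch)."""
    return to_signed(a + b)


def longword_add(a, b):
    """Unsigned 64-bit add, modulo 2**64."""
    return (a + b) & MASK


def longint_lt(a, b):
    """Signed comparison of two 64-bit patterns."""
    return to_signed(a) < to_signed(b)


def longword_lt(a, b):
    """Unsigned comparison of two 64-bit patterns."""
    return (a & MASK) < (b & MASK)
```

The pattern with all bits set is -1 under the signed comparison and the largest value under the unsigned one, which is why the two interfaces need separate comparison and arithmetic routines even if they share a representation.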