[M3devel] codegen error (from Mika, new test p250)
Tony Hosking
hosking at cs.purdue.edu
Tue Jan 11 21:21:23 CET 2011
Sigh.
OK, for now I will fix the compiler to track the widths of values on the stack.
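
For readers unfamiliar with the idea, here is a minimal sketch in C of what "tracking the widths of values on the stack" could look like. It is not the actual compiler code. The names WidthStack, ws_push, ws_pop and ws_binary_op are hypothetical, as is the rule that a binary operator requires both operands to have the same width.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical width tags for integer values on the evaluation stack. */
typedef enum { WIDTH_INT32, WIDTH_INT64 } Width;

#define MAX_DEPTH 64

/* A shadow stack that records only the width of each pushed value. */
typedef struct {
    Width  slots[MAX_DEPTH];
    size_t depth;
} WidthStack;

static void ws_init(WidthStack *s) { s->depth = 0; }

static void ws_push(WidthStack *s, Width w) {
    assert(s->depth < MAX_DEPTH);
    s->slots[s->depth++] = w;
}

static Width ws_pop(WidthStack *s) {
    assert(s->depth > 0);
    return s->slots[--s->depth];
}

/* Model a binary arithmetic operator.  If the two topmost widths agree,
   replace them with one result of that width and return 1.  Otherwise
   leave the stack untouched and return 0 so the caller can report a
   width mismatch (assumed policy, for illustration only). */
static int ws_binary_op(WidthStack *s) {
    if (s->depth < 2) return 0;
    Width rhs = s->slots[s->depth - 1];
    Width lhs = s->slots[s->depth - 2];
    if (lhs != rhs) return 0;
    s->depth -= 1;  /* the lhs slot now stands for the result */
    return 1;
}
```

Pushing two WIDTH_INT64 entries and calling ws_binary_op leaves one WIDTH_INT64 entry on the stack. Pushing a WIDTH_INT32 and then a WIDTH_INT64 makes it return 0, which is the kind of mismatch the code generator would have to catch.
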
On Jan 11, 2011, at 12:53 PM, Rodney M. Bates wrote:
>
>
> Tony Hosking wrote:
>> I know what the problem is. The fix is not particularly pretty, and will entail tracking the stack types for integers (Int32 or Int64) throughout code generation.
>> This all leads me to wonder why we don't simply back LONGINT out of the language.
>> [I had mentioned my increasing unease with LONGINT in a prior e-mail a long time ago.]
>> We can replace LONGINT with Longint and Longword:
>> Longint.T = Longword.T = BITS 64 FOR ARRAY [0..1] OF [16_00000000..16_FFFFFFFF]
>
> Hmm, this is tricky. I think the BITS 64 FOR is not what we would want. First, it has
> no effect at all except when a Longint.T is a field of a record or object or an element
> of an array. Second, even in those cases, it would force the compiler _not_ to put any
> alignment padding ahead of the Longint.T field. The compiler could only choose between
> letting it be misaligned and generating code that would work on it that way, or, more
> likely, refusing to compile it. It does not force 64-bit alignment. This is all by
> existing rules of the language.
>
> Another thought would be:
>
> ARRAY [0..1] OF BITS 32 FOR [16_00000000..16_FFFFFFFF]
>
> This would lead to alignment within the array being as wanted, on both 32- and 64-bit
> machines. But as for the alignment of the entire array, it would not force anything.
> The alignment of an array type is naturally the alignment of its element type, but a
> Modula-3 BITS type has no alignment restriction at all, otherwise it could not be used
> as intended to allow programmer-controlled memory layout.
>
> There is another problem here. It stems from the fact that
>
> <flames>
> So-called "little endian" is an inconsistent system. It's only partly little-endian.
> To be consistently little-endian, it would have to read/write i/o streams into/from
> decreasing memory addresses and fetch instruction streams from decreasing addresses.
> It would then naturally result in the successively declared fields of records and
> elements of arrays (of increasing subscripts) being stored in decreasing addresses.
> </flames>
>
> So as it is, for either of these array types, we have:
>
>   MSB         LSB
>   0 1 2 3 4 5 6 7   <- big endian byte numbers in memory
>   7 6 5 4 3 2 1 0   <- hypothetical true little-endian
>   3 2 1 0 7 6 5 4   <- actual "little-endian"
>
> Actual little-endian numbers right-to-left only within each 32-bit piece, but
> left-to-right for the elements of the array. If this were a single scalar,
> instead of an array, the actual little-endian byte numbering would be the
> same as the middle line above.
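
The byte numbering above can be checked with a small C sketch. It is an illustration only, not anything from the Modula-3 compiler or runtime, and the function names are made up for this note. It assumes element 0 of the two-word array is meant to be the most significant half, as in the diagram.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void) {
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Reinterpret the bytes of a two-word array as one 64-bit scalar,
   which is what passing the array to a scalar formal would amount to. */
static uint64_t reinterpret_pair(const uint32_t words[2]) {
    uint64_t scalar;
    memcpy(&scalar, words, sizeof scalar);
    return scalar;
}

/* Portable combination: element 0 is the most significant word. */
static uint64_t combine_pair(const uint32_t words[2]) {
    return ((uint64_t)words[0] << 32) | words[1];
}
```

With words = {0x01020304, 0x05060708}, combine_pair yields 0x0102030405060708 on any host. reinterpret_pair yields the same value on a big-endian host but 0x0506070801020304 on a little-endian one, because the two halves are swapped relative to a true 64-bit scalar.
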
>
> This means the array type, on a little-endian machine, could not be passed to a
> normally-represented scalar formal parameter in any language. We could have a
> convention that the 32-bit words in the array were least significant in element
> zero, but that would just move the problem over to the big-endian machines.
> We could just require explicit conversion functions to be coded, but what
> type would they convert to?
>
> Note that we have to keep the semantics of BITS n FOR and [lb .. ub] consistent
> with the existing language, because these type constructors will be used for other
> purposes than just constructing Longint.T and Longword.T and surely are in lots of
> preexisting code.
>
> I don't see any way to both preserve language semantics and construct a longer
> integer type with decent properties using only preexisting types. I think we
> would have to say something like:
>
> "The types Longint.T and Longword.T are _just like_ <some array type>"
> (but not equal to <some array type>, so they could have unique rules).
>
> This would parallel the existing definition of CARDINAL as
>
> "just like [0 .. LAST(INTEGER)]"
> (but it's nevertheless a distinct type, so pickles can do size adjustments
> on it, the way they do with INTEGER.)
>
> But once we resort to that and defining operators on it, I doubt it could be
> any cleaner or simpler than LONGINT. And I doubt it would simplify the subject
> compilation problem either.
>
>
>> and define signed operations in Longint and unsigned operations in Longword.
>> These can be implemented efficiently as wrappers to appropriate C routines operating on "long long" or inlined if performance is a particular concern. We can provide conversion routines to/from INTEGER as needed.
>> Other than handling 64-bit file offsets, etc., does anyone really make use of LONGINT that argues convincingly for it to be retained?
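
As a rough sketch of the wrapper idea, signed and unsigned operations over the same 64-bit representation might look like this in C. The function names are hypothetical and are not an existing runtime interface. The signed division floors its result, as Modula-3 DIV does, whereas C's `/` truncates toward zero. Callers are assumed to avoid a zero divisor and the INT64_MIN DIV -1 overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Signed ("Longint") comparison. */
static int longint_lt(int64_t a, int64_t b) { return a < b; }

/* Unsigned ("Longword") comparison on the same bit patterns. */
static int longword_lt(int64_t a, int64_t b) {
    return (uint64_t)a < (uint64_t)b;
}

/* Signed division rounding toward negative infinity (Modula-3 DIV).
   Precondition: b != 0 and not (a == INT64_MIN && b == -1). */
static int64_t longint_div(int64_t a, int64_t b) {
    int64_t q = a / b;                       /* C truncates toward zero */
    if ((a % b != 0) && ((a < 0) != (b < 0)))
        q -= 1;                              /* adjust to the floor */
    return q;
}

/* Unsigned division on the same bit patterns.  Precondition: b != 0.
   Converting a result above INT64_MAX back to int64_t is
   implementation-defined before C23; two's complement is assumed. */
static int64_t longword_divide(int64_t a, int64_t b) {
    return (int64_t)((uint64_t)a / (uint64_t)b);
}

/* Widening conversion from a 32-bit INTEGER (assumed 32-bit target). */
static int64_t longint_from_integer(int32_t x) { return (int64_t)x; }
```

For example, longint_div(-7, 2) gives -4 rather than C's -3, and longint_lt(-1, 1) is true while longword_lt(-1, 1) is false, because -1 is the largest unsigned 64-bit value.
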