[M3devel] Bitfields and endianness
Dragiša Durić
dragisha at m3w.org
Sun Sep 2 21:52:54 CEST 2012
Of course it is. That is why we have this pragma/language change thread.
Rodney's proposal is to clean up the language definition and enforce a few things in the language, to make them logical and right.
--
Divided by a common language
Dragiša Durić
dragisha at m3w.org
On Sep 2, 2012, at 9:32 PM, Jay K wrote:
> I'm struck by:
>
> "The values allowed for n are implementation-dependent. An illegal
> value for n is a static error. The legality of a packed type can
> depend on its context; for example, an implementation could prohibit
> packed integers from spanning word boundaries."
>
> Stuff is left up to the implementation, not the language definition.
>
>
> - Jay
>
>
> > From: dragisha at m3w.org
> > Date: Sun, 2 Sep 2012 16:45:54 +0200
> > To: rodney_bates at lcwb.coop
> > CC: m3devel at elegosoft.com
> > Subject: Re: [M3devel] Bitfields and endianness
> >
> > Rodney put it as clearly as possible. We already have packing and alignment issues dictated by the language definition. What we need is to make sense of packing, fix endianness, and pack left to right. Period.
> >
> >
> > On Sep 2, 2012, at 4:41 PM, Rodney M. Bates wrote:
> >
> > > Pardon me for showing my frustration, but I think it is about
> > > time to consider what the *language* says about bit fields.
> > >
> > > The language says:
> > > -------------------------------------------------------------------------
> > > A declaration of a packed type has the form:
> > >
> > > TYPE T = BITS n FOR Base
> > >
> > > where Base is a type and n is an integer-valued constant
> > > expression. The values of type T are the same as the values of type
> > > Base, but variables of type T that occur in records, objects, or
> > > arrays will occupy exactly n bits and be packed adjacent to the
> > > preceding field or element. For example, a variable of type
> > >
> > > ARRAY [0..255] OF BITS 1 FOR BOOLEAN
> > >
> > > is an array of 256 booleans, each of which occupies one bit of storage.
> > >
> > > The values allowed for n are implementation-dependent. An illegal
> > > value for n is a static error. The legality of a packed type can
> > > depend on its context; for example, an implementation could prohibit
> > > packed integers from spanning word boundaries.
> > > -------------------------------------------------------------------------
> > >
> > > First off, the last paragraph clearly says that a compiler cannot just
> > > silently violate the layout rules given above. If it places
> > > restrictions, it has to refuse with an error message.
> > >
> > > Everyone is aware of "will occupy exactly n bits", but I have lost
> > > count of the number of times I see posts that imply the writer has
> > > missed "packed adjacent to the preceding field or element". This
> > > means there can be no padding added by the compiler, neither for
> > > alignment nor any other reason. With this rule, size, alignment, and
> > > padding can be completely controlled by the programmer.
> > >
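To make the "packed adjacent" rule concrete, here is a minimal Python sketch. It is not compiler code; the function name `layout` and the `word_bits` parameter are made up for illustration. It assumes nothing about bit order: it only computes each field's bit offset as the running sum of the declared widths, and when a (hypothetical) implementation restriction on spanning word boundaries applies, it refuses with an error instead of padding.

```python
def layout(widths, word_bits=None):
    """Return ([(offset, width), ...], total_bits) for adjacently packed fields.

    widths    -- the n of each BITS n FOR ... field, in declaration order
    word_bits -- hypothetical restriction: if given, a field that would span
                 a word boundary is rejected rather than moved or padded
    """
    fields = []
    offset = 0
    for width in widths:
        if width <= 0:
            raise ValueError("field width must be positive")
        if word_bits is not None:
            first_word = offset // word_bits
            last_word = (offset + width - 1) // word_bits
            if first_word != last_word:
                raise ValueError(
                    f"field at bit {offset} of width {width} "
                    "spans a word boundary"
                )
        # No padding: the field starts exactly where the previous one ended.
        fields.append((offset, width))
        offset += width
    return fields, offset
```

For the example from the language definition, `layout([1] * 256)` gives a total of 256 bits, that is 32 bytes, with no padding anywhere. With `word_bits=32`, `layout([24, 16], word_bits=32)` raises an error, which stands in for the static error the definition requires.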
> > > Note that there are no other rules about record/object layout, so a
> > > compiler is free to reorder fields if none has a packed type. This is
> > > not actually happening in our compiler. If there is a mix, a group
> > > consisting of one non-packed field and all its immediately following
> > > packed fields would have to be kept together, but different groups
> > > could be reordered. So if you only want to avoid extra padding to
> > > save space or something, mixed packed/nonpacked fields might be
> > > useful, but for full layout control to match some external software or
> > > standard, you really would want to make them all packed.
> > >
> > > That leaves endianness, which the language says nothing about, that I
> > > can find. Apparently, the compiler(s) lay out packed fields in the
> > > endianness of the target machine.
> > >
> > > Which raises a big pet peeve of mine. Big-endian is fine, but
> > > so-called little-endian is an inconsistent system. It numbers bits
> > > and bytes right-to-left only within a field. Between fields (and
> > > array elements), it is still left-to-right. Ditto for input and
> > > output, which is always left-to-right by bytes, regardless of the size
> > > of contained fields, which i/o software and hardware would have no way
> > > of knowing about anyway. Ditto for instruction stream readout. (Ever
> > > try to figure out how to write a consistent memory dump for a
> > > little-endian machine? Mercifully, we don't much use them anymore,
> > > but there was a time.)
> > >
> > > The compiler lays out in increasing bit numbers, which get reduced
> > > later (in little-endian) to bytes via right-to-left ordering of bits
> > > within bytes and also multiple byte-fragments of a single field.
> > >
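A small Python sketch of the two conventions described above may help. The language definition does not fix either one, so the rules here are assumptions: "big-endian" puts the first field in the most significant bits of the first byte, and "little-endian" puts the first field in the least significant bits of the first byte. The function name `pack_fields` and the sample values are invented for the example.

```python
def pack_fields(fields, big_endian):
    """Pack (width, value) pairs adjacently and return bytes in memory order."""
    total_bits = sum(width for width, _ in fields)
    if total_bits % 8 != 0:
        raise ValueError("total width must be a whole number of bytes")
    acc = 0
    offset = 0  # increasing bit number of the next field
    for width, value in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("value does not fit in its field")
        if big_endian:
            # Bit numbers run left to right: earlier fields sit higher.
            shift = total_bits - offset - width
        else:
            # Bit numbers run right to left within the whole group.
            shift = offset
        acc |= value << shift
        offset += width
    return acc.to_bytes(total_bits // 8, "big" if big_endian else "little")


fields = [(4, 0xA), (12, 0xBCD)]  # a 4-bit field followed by a 12-bit field
print(pack_fields(fields, big_endian=True).hex(" "))   # ab cd
print(pack_fields(fields, big_endian=False).hex(" "))  # da bc
```

In the big-endian dump the two fields read straight across as `a` then `bcd`. In the little-endian dump the 12-bit field shows up as `d` in the first byte and `bc` in the second, which is the right-to-left-within-a-field, left-to-right-between-bytes mix complained about above.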
> > > The result is that you cannot, in general, use one set of endian rules
> > > to duplicate the way things would be done in the other. Dragiša's
> > > original example shows this clearly. The standard he is trying to
> > > match lays things out in big-endian. In a little-endian
> > > reinterpretation of this layout, some fields have fragments that are
> > > discontiguous, as well as out of sequence.
> > >
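The fragment problem can be checked mechanically. The sketch below is an invented example, not the record from the original post, which is not reproduced in this message. It assumes big-endian numbering counts bit 0 as the most significant bit of byte 0, and little-endian numbering counts bit 0 as the least significant bit of byte 0. Given a field in big-endian terms, the hypothetical helper `le_fragments` reports which little-endian bit ranges the same storage bits occupy.

```python
def le_fragments(be_offset, width):
    """Map a field given in big-endian bit numbering to little-endian bit
    numbers, returned as sorted (start, length) runs of contiguous bits."""
    le_bits = sorted(
        # Same byte, but the position inside the byte is mirrored.
        (b // 8) * 8 + (7 - b % 8)
        for b in range(be_offset, be_offset + width)
    )
    runs = []
    for bit in le_bits:
        if runs and runs[-1][0] + runs[-1][1] == bit:
            # Extends the current run by one bit.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((bit, 1))
    return runs


print(le_fragments(4, 12))  # [(0, 4), (8, 8)]
print(le_fragments(8, 8))   # [(8, 8)]
```

A 12-bit field at big-endian offset 4 comes out as two little-endian pieces, bits 0..3 and bits 8..15, with a hole at bits 4..7. The piece at bits 0..3 holds the field's four most significant bits, so the pieces are also in the wrong order for a little-endian field. Only byte-aligned, byte-sized fields such as the second call survive as a single run.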
> > > So to use a little-endian version of Modula-3's packing rules (as the
> > > compiler is doing, since it is compiling for a little-endian target),
> > > he would have to do his own bit-twiddling of the fragments to get a
> > > field in or out of the record. Which pretty well defeats the purpose
> > > of having a packed record layout. It would be more or less as easy,
> > > and probably a lot clearer, to just treat it as ARRAY OF Word.T or such
> > > and bit-twiddle on that.
> > >
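For comparison, the "just bit-twiddle on raw storage" alternative looks roughly like this. It is a Python stand-in for shifting and masking on an ARRAY OF Word.T, not Modula-3 code, and the helper names `get_be_field` and `set_be_field` are hypothetical. The assumption is a big-endian external layout, with bit offset 0 meaning the most significant bit of the first byte.

```python
def get_be_field(buf, bit_offset, width):
    """Read a field from a buffer laid out in big-endian bit order."""
    whole = int.from_bytes(buf, "big")
    shift = len(buf) * 8 - bit_offset - width
    if bit_offset < 0 or width <= 0 or shift < 0:
        raise ValueError("field lies outside the buffer")
    return (whole >> shift) & ((1 << width) - 1)


def set_be_field(buf, bit_offset, width, value):
    """Write a field into a bytearray laid out in big-endian bit order."""
    shift = len(buf) * 8 - bit_offset - width
    if bit_offset < 0 or width <= 0 or shift < 0:
        raise ValueError("field lies outside the buffer")
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in its field")
    mask = ((1 << width) - 1) << shift
    whole = int.from_bytes(buf, "big")
    whole = (whole & ~mask) | (value << shift)  # clear the field, then store
    buf[:] = whole.to_bytes(len(buf), "big")


buf = bytearray(b"\xab\xcd")
print(hex(get_be_field(buf, 4, 12)))  # 0xbcd
set_be_field(buf, 0, 4, 0x5)
print(buf.hex(" "))                   # 5b cd
```

Because the accessors fix the byte order explicitly, the result does not depend on the endianness of the machine running them, which is the property the packed record cannot offer today.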
> > > I think the clear conclusion is that the language's system is
> > > incomplete, and to fix it, we need a way to specify the endianness
> > > used to lay out a record/object with packed fields (and arrays too)
> > > independent of that of the target machine. Whether that is a pragma
> > > or in the core of the language is a secondary question, although I
> > > prefer a true language syntax, just because pragmas, in theory, are
> > > not supposed to change the behavioral semantics of a program.
> > >
> > > We also need to specify in the language what the actual rules are for
> > > little-endian, where it is far from obvious, due to its
> > > endian-confusedness.
> > >
> > >
> >