[M3devel] packing problem… how exactly does modula-3 pack data into records?? … "solved"

Jay K jay.krell at cornell.edu
Thu Jan 26 06:46:08 CET 2012

Let's say we have something defined by hardware, I'm making this up,
and we have a C header already for it:

  typedef struct x86_pte { /* fictional page table entry */
    unsigned valid : 1;
    unsigned execute : 1;
    unsigned writable : 1;
    unsigned physical_page : 29;
  } x86_pte;
  and I want to interface with this hardware from Modula-3.  

  What do I do?  

  Find the documentation, let's say it goes like:
    x86_pte_valid      0x80000000
    x86_pte_executable 0x40000000
    x86_pte_writable   0x20000000

  Translate that into Modula-3.
  Do I write:  

  x86_pte = BITS 32 FOR RECORD
    valid: BITS 1 FOR BOOLEAN;
    executable: BITS 1 FOR BOOLEAN;
    writable: BITS 1 FOR BOOLEAN;
    physical_page: BITS 29 FOR [0..16_1FFFFFFF];
  END;

or do I write:
  x86_pte = BITS 32 FOR RECORD
    physical_page: BITS 29 FOR [0..16_1FFFFFFF];
    writable: BITS 1 FOR BOOLEAN;
    executable: BITS 1 FOR BOOLEAN;
    valid: BITS 1 FOR BOOLEAN;
  END;

or do I write something else (maybe physical_page isn't unpacked properly),

or do I write something unambiguous but possibly slow:

  x86_pte_opaque = RECORD
    (* opaque[0] holds the most significant byte *)
    opaque: BITS 32 FOR ARRAY [0..3] OF BITS 8 FOR [0..255];
  END;

  PROCEDURE Valid(pte: x86_pte_opaque): BOOLEAN =
    BEGIN
      RETURN Word.And(pte.opaque[0], 16_80) # 0;
    END Valid;

  PROCEDURE Writable(pte: x86_pte_opaque): BOOLEAN =
    BEGIN
      RETURN Word.And(pte.opaque[0], 16_20) # 0;
    END Writable;

  PROCEDURE Executable(pte: x86_pte_opaque): BOOLEAN =
    BEGIN
      RETURN Word.And(pte.opaque[0], 16_40) # 0;
    END Executable;

  PROCEDURE PhysicalPage(pte: x86_pte_opaque): CARDINAL =
    BEGIN
      RETURN Word.Or(Word.Or(Word.Or(
               Word.LeftShift(Word.And(pte.opaque[0], 16_1F), 24),
               Word.LeftShift(pte.opaque[1], 16)),
               Word.LeftShift(pte.opaque[2], 8)),
               pte.opaque[3]); (* approx *)
    END PhysicalPage;

Or maybe:

  Valid, Executable, Writable: BOOLEAN;

filled in by the above functions and then accessed directly?

  But I also want to write some sort of simulator -- i.e. I'm not necessarily running on a little endian system.  

I believe the original authors desired a "straight" translation from C headers -- the first Modula-3 version above.
This allows for easier transliteration of /usr/include into m3core/unix.

  However it does not allow for the "simulator scenario" -- where the host's C compiler is
  different than the one that interprets the C struct I showed. 
  Nor does it allow well for what you want to do -- portability across machines.  
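
To make the "which compiler laid this out" point concrete, here is a minimal sketch that asks the host compiler where it put the valid bit. The names pte_bits and valid_bit_image are made up for illustration, and it assumes unsigned is 32 bits so the struct occupies exactly four bytes. The result is whatever the host compiler and byte order produce, which is the problem for a simulator.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same fictional entry as above, declared with bitfields. */
struct pte_bits {
    unsigned valid : 1;
    unsigned execute : 1;
    unsigned writable : 1;
    unsigned physical_page : 29;
};

/* Hypothetical helper: returns the 32-bit image, read as a host-order
   integer, of an entry whose only set field is `valid`. */
static uint32_t valid_bit_image(void)
{
    struct pte_bits pte;
    uint32_t image = 0;

    /* Assumption: unsigned is 32 bits, so the struct is at least 4 bytes. */
    assert(sizeof pte >= sizeof image);
    memset(&pte, 0, sizeof pte);
    pte.valid = 1;
    memcpy(&image, &pte, sizeof image);
    return image;
}
```

Compilers that allocate bitfields from the least significant bit would typically give 0x00000001 here, and ones that allocate from the most significant bit 0x80000000; only the latter matches the documented x86_pte_valid mask.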

C programmers face the same dilemma:
either know that their compiler is sane and predictable and use the nice/convenient bitfields,
or take no dependency on the unspecified bitfield behavior and unpack bits/bytes manually.
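
A rough sketch of the manual route in C, using the mask values from the made-up documentation above. The function names, the X86_PTE_PAGE_MASK constant, and the assumption that the entry is stored least significant byte first are all mine, not anything from a real header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mask values from the fictional documentation. */
#define X86_PTE_VALID       0x80000000u
#define X86_PTE_EXECUTABLE  0x40000000u
#define X86_PTE_WRITABLE    0x20000000u
#define X86_PTE_PAGE_MASK   0x1FFFFFFFu /* assumed: the remaining 29 bits */

/* Assumption: the entry sits in memory least significant byte first.
   Building the value from bytes gives the same result on any host. */
static uint32_t pte_load(const unsigned char bytes[4])
{
    assert(bytes != NULL);
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

static int pte_valid(uint32_t pte)      { return (pte & X86_PTE_VALID) != 0; }
static int pte_executable(uint32_t pte) { return (pte & X86_PTE_EXECUTABLE) != 0; }
static int pte_writable(uint32_t pte)   { return (pte & X86_PTE_WRITABLE) != 0; }

static uint32_t pte_physical_page(uint32_t pte)
{
    return pte & X86_PTE_PAGE_MASK;
}
```

For example, the bytes 34 12 00 A0 load as 0xA0001234: valid and writable set, executable clear, physical page 0x1234.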

Where C programmers depend on bitfields, and where Modula-3 transliterates C headers,
it is profitable for Modula-3 to try to emulate the platform's C compiler's behavior.
I have gone and removed m3core's dependency on this.

Again, I believe that was the original authors' intent.

Clearly there are pluses and minuses to the various approaches.

If you look at the GNU binutils, I think they do stuff like:

#include <assert.h>
#include <stddef.h>

typedef struct packed_foo {
  unsigned char foo[2];
  unsigned char bar[4];
  unsigned char abc[8];
} packed_foo;

typedef struct unpacked_foo {
  unsigned short foo;
  unsigned long bar;
  unsigned long long abc;
} unpacked_foo;

void unpack_little_endian(unsigned char* packed, size_t packed_size, void* unpacked, size_t unpacked_size)
{
  unsigned long long value = 0;
  size_t i;
  assert(unpacked_size >= packed_size);
  /* little endian: the last byte is the most significant, so start there */
  for (i = packed_size; i-- > 0; )
    value = (value << 8) | packed[i];
  if (unpacked_size == 1)
    *(unsigned char*)unpacked = (unsigned char)value;
  else if (unpacked_size == sizeof(unsigned short))
    *(unsigned short*)unpacked = (unsigned short)value;
  else if (unpacked_size == sizeof(unsigned long))
    *(unsigned long*)unpacked = (unsigned long)value;
  else if (unpacked_size == sizeof(unsigned long long))
    *(unsigned long long*)unpacked = (unsigned long long)value;
}

void unpack_foo(packed_foo* packed, unpacked_foo* unpacked)
{
  unpack_little_endian(packed->foo, sizeof(packed->foo), &unpacked->foo, sizeof(unpacked->foo));
  unpack_little_endian(packed->bar, sizeof(packed->bar), &unpacked->bar, sizeof(unpacked->bar));
  unpack_little_endian(packed->abc, sizeof(packed->abc), &unpacked->abc, sizeof(unpacked->abc));
}

This has the cost/tedium of copying, but it is very portable, while still letting the
compiler do a little of the layout for you.

If you are defining your own protocol, I think the opaque/unpacking strategy is safest,
albeit slowest and most tedious.
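
For your own protocol you need the writing side too. A small sketch, with hypothetical helper names (pack_le, unpack_le, round_trip_le) and a least-significant-byte-first wire order picked arbitrarily; any fixed order works as long as both ends agree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low `size` bytes of `value`, least significant byte first. */
static void pack_le(unsigned char *packed, size_t size, uint64_t value)
{
    size_t i;
    assert(size <= sizeof value);
    for (i = 0; i < size; i++) {
        packed[i] = (unsigned char)(value & 0xFFu);
        value >>= 8;
    }
}

/* Rebuild a value from `size` bytes stored least significant byte first. */
static uint64_t unpack_le(const unsigned char *packed, size_t size)
{
    uint64_t value = 0;
    size_t i;
    assert(size <= sizeof value);
    for (i = size; i-- > 0; )
        value = (value << 8) | packed[i];
    return value;
}

/* Pack into a scratch buffer and unpack again; handy for self-checks. */
static uint64_t round_trip_le(uint64_t value, size_t size)
{
    unsigned char buffer[sizeof(uint64_t)] = { 0 };
    pack_le(buffer, size, value);
    return unpack_le(buffer, size);
}
```

Neither function looks at the host's byte order or struct layout, so the same bytes come out on any machine. Packing a value into fewer bytes than it needs silently drops the high bytes, so a real protocol would want a range check first.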

You could also be wasteful and not use bitfields in your protocol.

 - Jay

From: dragisha at m3w.org
Date: Fri, 20 Jan 2012 22:59:34 +0100
To: jay.krell at cornell.edu
CC: m3devel at elegosoft.com
Subject: Re: [M3devel] packing problem… how exactly does modula-3 pack data into records?? … "solved"

Ok, it is your standpoint. I don't share it. 
Right now my problem is solved. If I encounter its variation again, I'll fix it for myself.
Excuse "C does it too" does not ring too Modulee for me :).
On Jan 20, 2012, at 7:11 PM, Jay wrote:

As in C, if you need portability and predictability, just don't use bitfields. Use integer types of size 8, 16, 32, or possibly 64 bits, and do the appropriate shifting and masking, possibly endian-dependent.

Actually, to avoid endian and packing/alignment concerns, use only 8 bit integers and optionally pack/unpack into more convenient types.

- Jay (phone)