[M3devel] m3 backends and builtin types?

Rodney M. Bates rodney_bates at lcwb.coop
Thu Aug 16 20:16:35 CEST 2012



On 08/15/2012 09:07 PM, Jay K wrote:
>   > There is a bootstrap issue here isn’t there?
>
> I am not sure. I don't understand the system enough. :(
>
>
> I suspect, but am not certain, the ultimate source of this information is/can-be m3front/builtinTypes.
> The compiler does for example construct the type BOOLEAN out of lower level pieces.
> Can it not then proceed to compute its typeid, and inform the backend about it?
>
>
> Other examples appear to include... wait, where did I get this list?
> At some point, I was verifying that gcc/gcc/m3cg/parse.c was "informed" about every typeid that it later received a use of.
> Like, declare_array/declare_open_array/declare_enum/declare_typename should all precede declare_local, declare_param, declare_procedure (the return type).
> I built up the following list:

This is from memory, is incomplete, and may not be 100% reliable, but
is hopefully useful.

There are two different hash codes on type structures used in (at
least) the compiler, runtime, and pickles.  The "uid" is 32-bit, and
the "signature" is 64-bit.  I think the signature may apply only
to reference types.  It is what is used in pickles.  One of them takes
into account some properties the other does not.  I am not entirely
sure which of the following information applies to which hash code(s).

At one time, I think I had found a main program that computed one or
the other of these, or both.  They are not arbitrary, they are somehow
derived from the actual type structures.  (What are the type
structures for primitive builtin types?)  For builtin types, they have
apparently been hand copied to several places.  (compiler, RTS,
pickles, m3gdb, ...?)  They also get computed for programmer-defined
types too, but obviously you won't find these hard-coded.

However, some of the values for builtin types have suffered some
alterations.  Here is an excerpt from m3gdb, m3-lang.c, with a
possibly relevant comment that I once inferred somehow:

   if (m3uid_to_num (uid, &num_uid)) {
     if      (num_uid == 0x195c2a74) { return builtin_type_m3_integer; }
     else if (num_uid == 0x05562176) { return builtin_type_m3_longint; }
     else if (num_uid == 0x50f86574) { return builtin_type_m3_text; }
     else if (num_uid == 0x97e237e2) { return builtin_type_m3_cardinal; }
     else if (num_uid == 0x1e59237d) { return builtin_type_m3_boolean; }
     else if (num_uid == 0x08402063) { return builtin_type_m3_address; }
     else if (num_uid == 0x9d8fb489) { return builtin_type_m3_root; }
     else if (num_uid == 0xa973d3a6) { return builtin_type_m3_transient_root; }
     else if (num_uid == 0x56e16863) { return builtin_type_m3_char; }
     /* For widechar, the num_uid was once 0xb0830411.  Presumably, this is an
        outdated leftover from a transitional implementation of widechar, at
        Critical Mass, before the final one came from them. */
     else if (num_uid == 0x88f439fc) { return builtin_type_m3_widechar; }
     else if (num_uid == 0x48e16572) { return builtin_type_m3_real; }
     else if (num_uid == 0x94fe32f6) { return builtin_type_m3_longreal; }
     else if (num_uid == 0x9ee024e3) { return builtin_type_m3_extended; }
     else if (num_uid == 0x48ec756e) { return builtin_type_m3_null; }
     else if (num_uid == 0x1c1c45e6) { return builtin_type_m3_refany; }
     else if (num_uid == 0x51e4b739) { return builtin_type_m3_transient_refany; }
     else if (num_uid == 0x898ea789) { return builtin_type_m3_untraced_root; }
     else if (num_uid == 0x00000000) { return builtin_type_m3_void; }
   }
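The same mapping can also be written as a table, which makes the
hand-copied values easier to compare against the other places they
live.  This is only a sketch, not the actual m3gdb code: the uid values
are the ones in the excerpt above, but the struct, the function names,
and the name strings (standing in for the builtin_type_m3_* objects)
are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row per builtin type.  The uid values are copied from the m3gdb
   excerpt; the name strings are hypothetical labels standing in for
   the builtin_type_m3_* objects. */
struct builtin_uid {
    uint32_t    uid;
    const char *name;
};

static const struct builtin_uid builtin_uids[] = {
    { 0x195c2a74u, "integer" },
    { 0x05562176u, "longint" },
    { 0x50f86574u, "text" },
    { 0x97e237e2u, "cardinal" },
    { 0x1e59237du, "boolean" },
    { 0x08402063u, "address" },
    { 0x9d8fb489u, "root" },
    { 0xa973d3a6u, "transient_root" },
    { 0x56e16863u, "char" },
    { 0x88f439fcu, "widechar" },
    { 0x48e16572u, "real" },
    { 0x94fe32f6u, "longreal" },
    { 0x9ee024e3u, "extended" },
    { 0x48ec756eu, "null" },
    { 0x1c1c45e6u, "refany" },
    { 0x51e4b739u, "transient_refany" },
    { 0x898ea789u, "untraced_root" },
    { 0x00000000u, "void" },
};

#define N_BUILTIN_UIDS (sizeof builtin_uids / sizeof builtin_uids[0])

/* Return the builtin's label, or NULL when the uid is not in the table. */
static const char *lookup_builtin_uid(uint32_t uid)
{
    for (size_t i = 0; i < N_BUILTIN_UIDS; i++) {
        if (builtin_uids[i].uid == uid)
            return builtin_uids[i].name;
    }
    return NULL;
}

/* Hand-copied tables drift; fail loudly if a uid appears twice. */
static void check_no_duplicate_uids(void)
{
    for (size_t i = 0; i < N_BUILTIN_UIDS; i++)
        for (size_t j = i + 1; j < N_BUILTIN_UIDS; j++)
            assert(builtin_uids[i].uid != builtin_uids[j].uid);
}
```

With this table, the old widechar value 0xb0830411 mentioned in the
comment is simply not found, the same as in the if-chain.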

Also, there is some endian-related confusion, at least for signatures.
Sometimes they are stored in an array of bytes, whose order may be
always the same, regardless of host or target.  However, some were
apparently hand-reordered between pm3 and cm3, and the reorderings are
neither strictly endianness nor consistent for different types.  I
seem to recall something like swapping 32-bit segments, but not bytes
within them, or the other way around, or something, for some but not
all types.  I had to deal with both versions of the orderings in
pickles and maybe in m3gdb.  Here is an excerpt from Pickle2.m3:

(* Adaptation to some builtin fingerprints that have different byte order in
    Pm3 and Cm3.
*)

TYPE FPA =  ARRAY [0..7] OF BITS 8 FOR [0..255];
      (* Give a short name to anonymous type of Fingerprint.T.byte. *)

CONST NULL_uid = 16_48ec756e;
CONST pm3_NULL_Fp = FPA {16_24,16_80,16_00,16_00,16_6c,16_6c,16_75,16_6e};
CONST cm3_NULL_Fp = FPA {16_6e,16_75,16_6c,16_6c,16_00,16_00,16_80,16_24};

CONST ROOT_uid = 16_9d8fb489;
CONST pm3_ROOT_Fp = FPA {16_f8,16_09,16_19,16_c8,16_65,16_86,16_ad,16_41};
CONST cm3_ROOT_Fp = FPA {16_41,16_ad,16_86,16_65,16_c8,16_19,16_09,16_f8};

CONST UNTRACED_ROOT_uid = 16_898ea789;
CONST pm3_UNTRACED_ROOT_Fp = FPA {16_f8,16_09,16_19,16_c8,16_71,16_87,16_be,16_41};
CONST cm3_UNTRACED_ROOT_Fp = FPA {16_41,16_be,16_87,16_71,16_c8,16_19,16_09,16_f8};

(* Can the following two occur in a pickle?  Maybe if somebody registered a
    special for them? *)
CONST ADDRESS_uid = 16_08402063;
CONST pm3_ADDRESS_Fp = FPA {16_91,16_21,16_8a,16_62,16_f2,16_01,16_ca,16_6a};
CONST cm3_ADDRESS_Fp = FPA {16_f2,16_01,16_ca,16_6a,16_91,16_21,16_8a,16_62};

CONST REFANY_uid = 16_1c1c45e6;
CONST pm3_REFANY_Fp = FPA {16_65,16_72,16_24,16_80,16_79,16_6e,16_61,16_66};
CONST cm3_REFANY_Fp = FPA {16_66,16_61,16_6e,16_79,16_80,16_24,16_72,16_65};
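To pin down what the reordering actually is for each of these, one can
compare the two byte arrays mechanically.  The following is a sketch
only: the byte values are the constants from the Pickle2.m3 excerpt
above, while the enum, struct, and function names are invented for
illustration.  For these five constants it reports a full 8-byte
reversal for NULL, ROOT, UNTRACED_ROOT, and REFANY, and a swap of the
two 32-bit halves (bytes within each half unchanged) for ADDRESS.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum fp_reorder {
    FP_SAME,            /* identical byte order */
    FP_BYTES_REVERSED,  /* all 8 bytes reversed */
    FP_HALVES_SWAPPED,  /* 32-bit halves swapped, bytes within kept */
    FP_OTHER            /* none of the above */
};

/* Classify how an 8-byte cm3 fingerprint relates to its pm3 form. */
static enum fp_reorder classify_reorder(const uint8_t pm3[8],
                                        const uint8_t cm3[8])
{
    uint8_t reversed[8], swapped[8];

    assert(pm3 != NULL && cm3 != NULL);
    for (int i = 0; i < 8; i++) {
        reversed[i] = pm3[7 - i];
        swapped[i]  = pm3[(i + 4) % 8];
    }
    if (memcmp(pm3, cm3, 8) == 0)      return FP_SAME;
    if (memcmp(reversed, cm3, 8) == 0) return FP_BYTES_REVERSED;
    if (memcmp(swapped, cm3, 8) == 0)  return FP_HALVES_SWAPPED;
    return FP_OTHER;
}

/* Fingerprint pairs copied from the Pickle2.m3 excerpt. */
struct fp_pair {
    const char *name;
    uint8_t     pm3[8];
    uint8_t     cm3[8];
};

static const struct fp_pair fp_pairs[] = {
    { "NULL",
      {0x24,0x80,0x00,0x00,0x6c,0x6c,0x75,0x6e},
      {0x6e,0x75,0x6c,0x6c,0x00,0x00,0x80,0x24} },
    { "ROOT",
      {0xf8,0x09,0x19,0xc8,0x65,0x86,0xad,0x41},
      {0x41,0xad,0x86,0x65,0xc8,0x19,0x09,0xf8} },
    { "UNTRACED_ROOT",
      {0xf8,0x09,0x19,0xc8,0x71,0x87,0xbe,0x41},
      {0x41,0xbe,0x87,0x71,0xc8,0x19,0x09,0xf8} },
    { "ADDRESS",
      {0x91,0x21,0x8a,0x62,0xf2,0x01,0xca,0x6a},
      {0xf2,0x01,0xca,0x6a,0x91,0x21,0x8a,0x62} },
    { "REFANY",
      {0x65,0x72,0x24,0x80,0x79,0x6e,0x61,0x66},
      {0x66,0x61,0x6e,0x79,0x80,0x24,0x72,0x65} },
};
```

Calling classify_reorder on each entry of fp_pairs is enough to see
which of the two reorderings a given builtin got.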

Hope this is a step forward in the way of helpful information.


>
>
> UID_INTEGER = 16_195C2A74; (* INTEGER *)
> UID_LONGINT = 16_05562176; (* LONGINT *)
> UID_WORD = 16_97E237E2; (* CARDINAL *)
> UID_LONGWORD = 16_9CED36E7; (* LONGCARD *)
> UID_REEL = 16_48E16572; (* REAL *)
> UID_LREEL = 16_94FE32F6; (* LONGREAL *)
> UID_XREEL = 16_9EE024E3; (* EXTENDED *)
> UID_BOOLEAN = 16_1E59237D; (* BOOLEAN [0..1] *)
> UID_CHAR = 16_56E16863; (* CHAR [0..255] *)
> UID_WIDECHAR = 16_88F439FC;
> UID_MUTEX = 16_1541F475; (* MUTEX *)
> UID_TEXT = 16_50F86574; (* TEXT *)
> UID_UNTRACED_ROOT = 16_898EA789; (* UNTRACED ROOT *)
> UID_ROOT = 16_9D8FB489; (* ROOT *)
> UID_REFANY = 16_1C1C45E6; (* REFANY *)
> UID_ADDR = 16_08402063; (* ADDRESS *)
> UID_RANGE_0_31 = 16_2DA6581D; (* [0..31] *)
> UID_RANGE_0_63 = 16_2FA3581D; (* [0..63] *)
> UID_PROC1 = 16_9C9DE465; (* PROCEDURE (x, y: INTEGER): INTEGER *)
> UID_PROC2 = 16_20AD399F; (* PROCEDURE (x, y: INTEGER): BOOLEAN *)
> UID_PROC3 = 16_3CE4D13B; (* PROCEDURE (x: INTEGER): INTEGER *)
> UID_PROC4 = 16_FA03E372; (* PROCEDURE (x, n: INTEGER): INTEGER *)
> UID_PROC5 = 16_509E4C68; (* PROCEDURE (x: INTEGER;  n: [0..31]): INTEGER *)
> UID_PROC6 = 16_DC1B3625; (* PROCEDURE (x: INTEGER;  n: [0..63]): INTEGER *)
> UID_PROC7 = 16_EE17DF2C; (* PROCEDURE (x: INTEGER;  i, n: CARDINAL): INTEGER *)
> UID_PROC8 = 16_B740EFD0; (* PROCEDURE (x, y: INTEGER;  i, n: CARDINAL): INTEGER *)
> UID_NULL = 16_48EC756E; (* NULL *)
>
>
> Surely these aren't all fundamental??
>
>
> Eventually I decided that declare_array/typename/open_array/enum don't come in a nice order anyway, and I changed parse.c to loop until it could "resolve" all typeids -- loop such that declare_<type> precedes declare_local/param/field.
>
>
> That still bugs me, but I haven't verified just what the front end is doing, and if there aren't ultimately possibly circularities that make a sort impossible anyway...consider:
>
>
> TYPE Record1 = RECORD r2: REF Record2 END;
> TYPE Record2 = RECORD r1: REF Record1 END;
>
> If there is only declare_record followed by declare_field, it will "never work" the way I wanted -- you'll always have to see a ref to a type before seeing the type defined. Ok. So I have to deal with an arbitrary order.. I guess I'll go and write the required loop -- and hold everything in memory... at least for a later pass at this stuff. The initial version doesn't need great type info, just as the existing backends don't need it...eh..I guess I can do without fixing the original issue first therefore..
>
>
>   - Jay
>
>
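The "hold everything in memory" approach in the quoted text can be
sketched as follows.  This is not what parse.c does; it only assumes
that all declarations for a unit can be buffered before any use is
checked, so that arrival order (and cycles like Record1/Record2) stop
mattering.  The struct, the function names, MAX_REFS, and any uid
values passed in are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REFS 4   /* arbitrary limit for this sketch */

/* A buffered declaration: its own typeid plus the typeids it uses. */
struct type_decl {
    uint32_t uid;
    uint32_t refs[MAX_REFS];
    size_t   nrefs;
};

/* Is uid declared anywhere in the buffered set, in any position? */
static int is_declared(const struct type_decl *decls, size_t n,
                       uint32_t uid)
{
    for (size_t i = 0; i < n; i++) {
        if (decls[i].uid == uid)
            return 1;
    }
    return 0;
}

/* Count uses of typeids that no buffered declaration defines.
   Because the whole set is searched, a use may precede its
   declaration, and mutually recursive types resolve cleanly. */
static size_t count_unresolved(const struct type_decl *decls, size_t n)
{
    size_t missing = 0;

    for (size_t i = 0; i < n; i++) {
        assert(decls[i].nrefs <= MAX_REFS);
        for (size_t j = 0; j < decls[i].nrefs; j++) {
            if (!is_declared(decls, n, decls[i].refs[j]))
                missing++;
        }
    }
    return missing;
}
```

A nonzero result would point at a typeid the backend was never told
about, which is the situation described for the builtin types.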

 --
> From: hosking at cs.purdue.edu
> Date: Wed, 15 Aug 2012 19:58:23 -0400
> To: jay.krell at cornell.edu
> CC: m3devel at elegosoft.com
> Subject: Re: [M3devel] m3 backends and builtin types?
>
> There is a bootstrap issue here isn’t there?  You need the compiler to inform the runtime of information that the compiler needs...
>
> On Aug 15, 2012, at 6:11 PM, Jay K <jay.krell at cornell.edu> wrote:
>
>     Something seems off to me in the current implementation.
>     Like, I don't think the backends are ever informed of various "builtin" types, such as integer, word, char, widechar, boolean, mutex.
>     I hardcoded knowledge of them in parse.c and M3C.m3.
>     That seems wrong.
>
>
>     Either that, or they are used before they are defined -- which might not be avoidable in general, but could easily be avoided for most types.
>
>
>     Shouldn't m3front inform the backend via m3cg of these types?
>     Is it doable using the existing interfaces?
>
>
>     More so, RTBuiltin.mx ought not exist, right?
>     Whatever data it contains should be built up like any other type data?
>     Part of the same problem?
>
>
>       - Jay
>
>
>
>
> Antony Hosking | Associate Professor | Computer Science | Purdue University
> 305 N. University Street | West Lafayette | IN 47907 | USA
> Mobile +1 765 427 5484
>
>
>
>
>



