[M3devel] M3CG
Antony Hosking
hosking at cs.purdue.edu
Thu Sep 6 19:26:24 CEST 2012
Yes, they are exactly 32 bits in size, to guarantee that all targets can use them. Various tools expect this, notably m3gdb and the run-time system. Tools may manipulate them as INTEGER, but really they should be thought of as Word.T (which is INTEGER on all targets anyway). The subrange just captures the range of values they can take on when interpreted as a two's-complement signed 32-bit integer. They will likely always be manipulable as INTEGER. That will not change.
So, if you want to store them in records, you can simply store them as INTEGER. If you want to avoid the range check, you can declare the field to be the same subrange.
If anything changes in their representation, it won’t stop you from manipulating them as INTEGER. If the subrange changes, which it won’t, then you’ll at least get a range check failure at run time or a compiler warning.
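
For readers more at home in C, the advice above can be sketched roughly as follows. This is only an analogy, not code from the compiler: `type_uid_t`, `holder_t`, `holder_set`, `uid_in_range` and `uid_checked` are hypothetical names, `long long` stands in for INTEGER, and the `assert` plays the part of the range check that Modula-3 would apply when a wide value is assigned back to the subrange.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for M3CG.TypeUID: a value in [-2^31 .. 2^31 - 1]. */
typedef int32_t type_uid_t;

/* A holder record that keeps the UID in a wider integer field,
   much as a Modula-3 record might keep it in an INTEGER field. */
typedef struct {
    const char *text;
    long long   uid;   /* wide enough for every type_uid_t value */
} holder_t;

/* Widening never loses information, so storing needs no check. */
static void holder_set(holder_t *h, const char *text, type_uid_t uid)
{
    h->text = text;
    h->uid  = uid;
}

/* True if the wide value lies inside the 32-bit subrange. */
static bool uid_in_range(long long wide)
{
    return wide >= INT32_MIN && wide <= INT32_MAX;
}

/* Narrowing is where a range check belongs; the assert here stands in
   for the run-time range check failure mentioned above. */
static type_uid_t uid_checked(long long wide)
{
    assert(uid_in_range(wide));
    return (type_uid_t)wide;
}
```

The only point of the sketch is that the check sits at the assignment boundary, so a holder that merely stores and forwards the value is unaffected by how wide its own field is.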
So. Use INTEGER.
Live long and prosper.
On Sep 6, 2012, at 4:13 AM, Jay K <jay.krell at cornell.edu> wrote:
> I'm really confused by the approach and mentality here.
> I believe I understand that this is not C or C++ and that there
> are things to be gained from that.
>
>
>
> First, let me restate the "less important" questions.
> The questions I care less about.
>
>
>
> What is the interface or abstract contractual properties of TypeUID?
> Is it exactly 32 bits in size?
> I doubt it, but it could be.
> At least 32 bits in size?
> Well, that is kind of unavoidable, assuming a binary computer,
> a computer that actually uses bits.
>
>
>
> No more than 32 bits in size?
>
>
>
> It can store the subrange -16_7FFFFFFF - 1 .. 16_7FFFFFFF?
> That seems to me to be "part of" the interface, and possibly
> the entire interface. Assuming a computing device that uses "bits",
> that does imply at least 32 bits.
> It actually conceivably does not rule out requiring 33 bits,
> as signed-magnitude requires 33 bits to represent -16_7FFFFFFF - 1.
> I'm willing to believe we are not portable beyond two's complement systems.
>
>
>
> And where are these properties assumed and violating them would
> cause problems?
>
>
> parse.c says:
> /*------------------------------------------------------------- type uids ---*/
> /* Modula-3 type uids are unsigned 32-bit values. They are passed as signed
> decimal integers in the intermediate code, but converted to 6-byte, base 62
> strings of characters from here to the debugger. To avoid surprises downstream,
> these generated strings are legal C identifiers. */
>
> #define UID_SIZE 6
> #define NO_UID (0xFFFFFFFFUL)
>
>
> (Notice this does NOT agree with TypeUID -- it says they are unsigned.
> But the subrange includes negative numbers. I should check this and
> the history...)
>
>
> I'm hoping we delete parse.c within a year, but that's another matter.
> The C-generating backend is coming along nicely.
>
>
> So there is a dependency that they be representable within 6 base-62
> digits. Which I believe actually allows for a larger range, at least
> slightly.
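
The capacity claim in the paragraph above can be checked with a small C sketch: 62^6 = 56,800,235,584, which is roughly 13 times 2^32 = 4,294,967,296. The digit alphabet and the names `uid_digits`, `uid_capacity` and `uid_to_base62` are assumptions made for illustration; the thread does not show which ordering parse.c actually uses.

```c
#include <assert.h>
#include <stdint.h>

#define UID_SIZE 6   /* as in the quoted parse.c excerpt */

/* Assumed digit alphabet (letters first, then decimal digits).
   The real ordering in parse.c is not shown in this thread. */
static const char uid_digits[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

/* 62^UID_SIZE: the number of distinct UID_SIZE-character strings. */
static uint64_t uid_capacity(void)
{
    uint64_t n = 1;
    for (int i = 0; i < UID_SIZE; i++)
        n *= 62;
    return n;
}

/* Write x as UID_SIZE base-62 digits, most significant first,
   followed by a terminating NUL. buf must hold UID_SIZE + 1 chars. */
static void uid_to_base62(uint32_t x, char buf[UID_SIZE + 1])
{
    for (int i = UID_SIZE - 1; i >= 0; i--) {
        buf[i] = uid_digits[x % 62];
        x /= 62;
    }
    assert(x == 0);   /* always holds, because 2^32 < 62^6 */
    buf[UID_SIZE] = '\0';
}
```

With this assumed alphabet the leading character of any 32-bit value falls in 'A'..'E', so the result always starts with a letter and is a legal C identifier.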
>
>
> Ok, as I said, this isn't critical.
> I'm willing to believe it must fit in 32 bits.
> I'm skeptical that it must not include extra padding bits within a struct (which
> is what the historical definition is).
>
>
> Now, let's move on to my very very simple desire
> to store these things in records, and pass them on to
> other M3CG.T implementations.
>
>
> Hypothetical dumb code, that should be simple and work.
> Yes, I know this is dumb, but it demonstrates the point.
>
>
> TYPE T1 = RECORD text: TEXT; type: TypeUID END;
>
>
> VAR cg1, cg2: M3CG.T;
>
>
> PROCEDURE declare_whatever(text: TEXT; type: TypeUID) =
>   VAR t1: T1;
>   BEGIN
>     t1.text := text;
>     t1.type := type;
>     cg1.declare_whatever(t1.text, t1.type);
>     cg2.declare_whatever(t1.text, t1.type);
>   END declare_whatever;
>
>
> It should be that easy, right?
> Surely this is reasonable?
>
>
>
> TypeUID could be INTEGER, a subrange, an enum, REAL, BOOLEAN
> REFANY, UNTRACED REF of anything, it could be a RECORD
> with arbitrary fields. This would work.
> It doesn't care about packing or padding.
> It doesn't assume TypeUID is a subrange and declare the similar:
>
>
> TYPE T1 = RECORD text: TEXT; type: [FIRST(TypeUID) .. LAST(TypeUID)] END;
>
>
> If TypeUID changes to one of several other kinds of things,
> it continues to work and be correct.
>
>
> Granted, if TypeUID became some sort of "OBJECT", maybe with
> some sort of "copy" or "clone" METHOD, then maybe this becomes wrong.
> (This is an area where C++ is very interesting -- user defined types
> can behave like values and be copied around correctly w/o regard
> to the concrete implementation/representation; it is up to the type
> implementer to make it work if he deems it. But I digress.)
>
>
> Now I see a few unavoidable options.
>
>
> 1) TypeUID can be any size.
>
> TypeUID = [-16_7FFFFFFF - 1 .. 16_7FFFFFFF];
>
>
> 2) TypeUID must fit in 32 bits, but padding can be inserted
> before/after it:
>
>
> TypeUID = [-16_7FFFFFFF - 1 .. 16_7FFFFFFF];
> <* ASSERT BITSIZE(TypeUID) >= 32 *>
>
>
> That is essentially what I did, but without the assertion.
> Cstdint.int32_t is just a lazy way of stating the same thing.
> Granted, with an apparent interest in the size.
> But, again, that might be an over specification.
>
>
> 3) We are super afraid of making any changes and TypeUID
> really cannot change at all, but it is reasonable
> to allow new code like I showed.
>
>
> UnpackedTypeUID = [-16_7FFFFFFF - 1 .. 16_7FFFFFFF]; (* for new code to use *)
> TypeUID = BITS 32 FOR UnpackedTypeUID; (* extreme compatibility *)
>
>
> I really don't think "Int32" is the way.
> It doesn't belong here. Public in M3CG/Target.
> It is not what I should store in a record to hold a TypeUID.
>
>
> Again, imagine TypeUID is opaque/abstract to me.
> I just need to copy it around and pass it on to other M3CG implementations.
> It could change to REAL for all I care.
> I don't want to be referencing some very concrete/transparent "Int32",
> when I'm not doing anything that depends on that implementation detail.
>
>
> I even question somewhat the subrange.
>
>
> In the future 32 bit hosts and targets will disappear.
> We will have only 64bit and 128bit.
>
>
> What should TypeUID be in that world?
>
>
> Is the point really that it be an "INTEGER" that "fits"
> on any host or target?
>
>
>
> Now, I grant... I might need to convert a TypeUID to a string.
> Here we are faced with a recurring dilemma.
> Either TypeUID is a thick/complete abstraction, and we provide
> common operations:
> INTERFACE TypeUID;
>
>
> (* declared here for performance; actually abstract;
> * use only with functions in this interface *)
> TYPE T = [-16_7FFFFFFF - 1 .. 16_7FFFFFFF];
>
> PROCEDURE Compare(a,b:T): [-1..1]; (* like for qsort *)
> PROCEDURE Equal(a,b:T): BOOLEAN;
> PROCEDURE ToText(a:T): TEXT;
> PROCEDURE Copy(from:T; VAR to:T);
> PROCEDURE Init(VAR a:T);
> PROCEDURE New():T;
>
> END TypeUID.
>
> MODULE TypeUID;
>
> IMPORT Fmt;
>
> (* for now, just an integer *)
>
> PROCEDURE Compare(a,b:T): [-1..1] = (* like for qsort *)
>   BEGIN
>     (* DO NOT USE SUBTRACTION HERE. *)
>     IF a < b THEN RETURN -1;
>     ELSIF a > b THEN RETURN 1;
>     ELSE RETURN 0;
>     END;
>   END Compare;
>
> PROCEDURE Equal(a,b:T): BOOLEAN =
>   BEGIN
>     RETURN a = b;
>   END Equal;
>
> PROCEDURE ToText(a:T): TEXT =
>   BEGIN
>     RETURN Fmt.Int(a);
>   END ToText;
>
> PROCEDURE Copy(from:T; VAR to:T) =
>   BEGIN
>     to := from;
>   END Copy;
>
> PROCEDURE Init(VAR a:T) =
>   BEGIN
>     a := 0;
>   END Init;
>
> PROCEDURE New():T =
>   BEGIN
>     RETURN 0;
>   END New;
>
> BEGIN
> END TypeUID.
>
>
> OR TypeUID is very transparent, everyone knows it is an integer
> and everyone just does integer "operations" on it.
> And it probably can never change.
> But hopefully they never get added.
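
The "DO NOT USE SUBTRACTION" comment in the quoted Compare procedure can be illustrated with a short C sketch. The names `uid_compare`, `uid_compare_wrapping` and `uid_compare_demo` are hypothetical. The point is that `a - b` can fall outside the 32-bit range (for example `INT32_MIN - 1`), so its sign no longer reflects the ordering; comparing directly avoids the problem.

```c
#include <assert.h>
#include <stdint.h>

/* Three-way compare in the style qsort expects: returns -1, 0, or 1.
   Uses only comparisons, so it cannot overflow. */
static int uid_compare(int32_t a, int32_t b)
{
    return (a > b) - (a < b);
}

/* What a subtraction-based compare yields once the difference wraps
   to 32 bits. The subtraction is done in unsigned arithmetic to avoid
   undefined behaviour; converting an out-of-range result back to
   int32_t is implementation-defined. Shown only as a counterexample. */
static int32_t uid_compare_wrapping(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a - (uint32_t)b);
}

/* Small self-check of the claim above. */
static void uid_compare_demo(void)
{
    /* INT32_MIN is less than 1 ... */
    assert(uid_compare(INT32_MIN, 1) == -1);
    /* ... but the wrapped difference is positive, i.e. the wrong sign. */
    assert(uid_compare_wrapping(INT32_MIN, 1) > 0);
}
```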
>
>
> Note of course that the abstract variation isn't very good
> at protecting the abstraction.
>
>
> It is probably desirable to be able to say:
>
> INTERFACE
>
> TYPE T; (* fully opaque *)
>
> MODULE
>
> REVEAL T = INTEGER;
>
>
> I understand this is hard to do efficiently.
> You either need a really good "cross-module compiler",
> or you need to heap allocate them all.
> A simple compiler that produces efficient code motivates
> revealing the type in public.
> That is unfortunate.
> For example, adding TypeUIDs is probably nonsense
> and worth prohibiting. But by publicly revealing
> it is an INTEGER (essentially), extra unintended operations
> are allowed on it.
>
>
> (C++ solves... since you can have a class that is small,
> and the clients know the size, and allocate room for it,
> yet all the data members could be private. The language
> is complicated, but the goal of allowing powerful efficient user
> defined types is a good one, and achieved very well)
>
>
>
> - Jay
>
>
>
>
>
> Subject: Re: [M3devel] M3CG
> From: hosking at cs.purdue.edu
> Date: Wed, 5 Sep 2012 22:07:24 -0400
> CC: m3devel at elegosoft.com
> To: jay.krell at cornell.edu
>
> On Sep 5, 2012, at 10:06 PM, Antony Hosking <hosking at cs.purdue.edu> wrote:
>
> Jay, I just checked in the following to M3CG.i3:
>
> TYPE
> Int32 = [-16_7fffffff-1 .. 16_7fffffff];
> TypeUID = BITS 32 FOR Int32;
>
> Feel free to use M3CG.Int32.
> At least this way, if anything changes TypeUID it will be clear that someone might be relying on TypeUID.
>
> I meant Int32.
>
>
>
> Antony Hosking | Associate Professor | Computer Science | Purdue University
> 305 N. University Street | West Lafayette | IN 47907 | USA
> Mobile +1 765 427 5484
>
>
>
>
>
> On Sep 5, 2012, at 8:56 PM, Jay <jay.krell at cornell.edu> wrote:
>
> And if TypeUID changes?
>
> I want to store a TypeUID. I want to treat it opaquely/abstractly. It can change & I'd still store it correctly & efficiently. I should not go looking at it & cloning it. That seems like sound engineering. ?
>
> - Jay (briefly/pocket-sized-computer-aka-phone)
>
> On Sep 5, 2012, at 1:41 PM, Antony Hosking <hosking at cs.purdue.edu> wrote:
>
> Jay, why don't you just use a local subrange for your fields in the records and then simply assign to/from TypeUID values?
>
> RECORD typeid: [-16_7fffffff-1 .. 16_7fffffff] END;
>
> You won't pay any penalty for range checks since the subranges are exactly the same.
>
> On Sep 5, 2012, at 2:34 PM, Jay K <jay.krell at cornell.edu> wrote:
>
> I should really insert padding manually?
> I don't know if I like that or not.
>
>
> My code is/was something like:
>
> TYPE T1 = RECORD
> text: TEXT;
> typeid: TypeUID;
> END;
>
>
> and that errors on some platforms.
>
>
> I'm not using "Multipass" right now, but it does still hold possible value.
> We can remove multipass from m3makefile for now if that helps.
>
>
> Given a similar problem in C, I would do something like:
>
> /* TypeUID must fit in 32 bits, for some reason. */
> C_ASSERT(sizeof(TypeUID) <= sizeof(UINT32));
> Where C_ASSERT is in windows.h and looks like:
> /* compile time assert */
> #define C_ASSERT(expression) typedef char __cassert__[(expression) ? 1 : -1];
> or somesuch. The error message isn't great when it fails.
>
> People also name this "static_assert". It is popular.
>
> It doesn't prohibit it from having padding around it, but it does assert it fits in 32 bits.
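
A self-contained sketch of the two compile-time assertion styles mentioned above, assuming a C11 compiler. The `TypeUID` typedef here is a hypothetical C stand-in, and `STATIC_ASSERT_OLD` is an illustrative macro in the spirit of the quoted `C_ASSERT`, not the definition from windows.h.

```c
#include <assert.h>   /* C11: provides the static_assert macro */
#include <limits.h>
#include <stdint.h>

/* Hypothetical C stand-in for M3CG.TypeUID. */
typedef int32_t TypeUID;

/* Pre-C11 trick: the array size is -1 when the expression is false,
   which is a compile error. Each use needs its own typedef name. */
#define STATIC_ASSERT_OLD(name, expr) typedef char name[(expr) ? 1 : -1]

STATIC_ASSERT_OLD(type_uid_fits_in_32_bits,
                  sizeof(TypeUID) * CHAR_BIT <= 32);

/* C11 form, with a readable message when it fails. */
static_assert(sizeof(TypeUID) * CHAR_BIT <= 32,
              "TypeUID must fit in 32 bits");

/* Run-time view of the same quantity, handy for logging or tests. */
static int type_uid_bits(void)
{
    return (int)(sizeof(TypeUID) * CHAR_BIT);
}
```

As noted above, neither form says anything about padding around a TypeUID field inside a struct; both only constrain the size of the type itself.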
>
> I recall we can do similar in Modula-3.
>
> TYPE assertTypeUIDFitsIn32Bits = ARRAY [0..ORD(BITSIZE(TypeUID) <= 32)] OF INTEGER;
> or to be more certain:
> TYPE assertTypeUIDFitsIn32Bits = ARRAY [0..-ORD(BITSIZE(TypeUID) <= 32)] OF INTEGER;
>
> I'm not sure 0..0 is illegal, but I think 0 .. 1 is.
>
> (I sure do miss macros...this is too much syntax for a compile time assert...)
>
> Can we just put <* ASSERT BITSIZE(TypeUID) <= 32 *> in the .i3 file? Or elsewhere?
>
>
> - Jay
>
> > From: hosking at cs.purdue.edu
> > Date: Wed, 5 Sep 2012 11:36:03 -0400
> > To: jay.krell at cornell.edu
> > CC: m3devel at elegosoft.com
> > Subject: Re: [M3devel] M3CG
> >
> > As I recall you were having trouble with alignment, right?
> >
> > In which case, why not pad your record type out to a reasonably aligned value? As in:
> >
> > CONST PadRange = Word.LeftShift(1, BITSIZE(ADDRESS) - BITSIZE(TypeUID)) - 1;
> > RECORD
> > t: TypeUID;
> > pad: BITS BITSIZE(ADDRESS) - 32 FOR [0..PadRange];
> > END;
> >
> >
> >
> > On Sep 5, 2012, at 11:18 AM, Antony Hosking <hosking at cs.purdue.edu> wrote:
> >
> > > Remind me again why you can't use TypeUID as is?
> > > There is nothing in the language spec that prohibits packed types in records.
> > >
> > > On Sep 5, 2012, at 10:50 AM, Jay K <jay.krell at cornell.edu> wrote:
> > >
> > >> This seems quite wrong from a simple engineering/design/factoring/abstraction point of view.
> > >>
> > >>
> > >> Imagine I have a few of these things. They are subranges or enums. That fit in a smaller size.
> > >> Wouldn't it be nice, to just use the type directly and get the space savings? Opportunistically?
> > >> What if the type is a REAL or a LONGINT?
> > >> I want the type to be opaque/abstract where that is easy and cheap and this certainly seems like an easy/cheap place to have slightly valuable opacity.
> > >> Isn't it really good, more than "just nice" to let the type change and have a lot of code work just as well?
> > >> Because they didn't copy around knowledge of what the type is?
> > >> What if the type changes?
> > >> Anyone just copying it around by exact restated typename is unaffected.
> > >> Anyone who looked at it and decided to restate the type's definition might be broken.
> > >>
> > >>
> > >> And again, really -- you want to sacrifice both performance and abstraction when both are easily kept better?
> > >> That is, I'm all for range checks to keep things "safe", safety is important, but in this case you can easily preserve safety w/o adding range checks.
> > >>
> > >>
> > >> We do have enums at this layer.
> > >> Would you suggest I take the ORD of all of those and store them in INTEGERs too?
> > >>
> > >>
> > >> I am really surprised.
> > >>
> > >>
> > >> - Jay
> > >>
> > >>
> > >> From: hosking at cs.purdue.edu
> > >> Date: Wed, 5 Sep 2012 10:33:58 -0400
> > >> To: dragisha at m3w.org
> > >> CC: m3devel at elegosoft.com; jay.krell at cornell.edu
> > >> Subject: Re: [M3devel] M3CG
> > >>
> > >> Precisely.
> > >>
> > >>
> > >> On Sep 5, 2012, at 3:40 AM, Dragiša Durić <dragisha at m3w.org> wrote:
> > >>
> > >> Why would holding/passing this value in an INTEGER be a problem? As long as range checks are passed on "assignment boundaries", it is all well.
> > >>
> > >> It is how Modula-3 does things.
> > >>
> > >> --
> > >> Divided by a common language
> > >>
> > >> Dragiša Durić
> > >> dragisha at m3w.org
> > >>
> > >>
> > >>
> > >>
> > >> On Sep 4, 2012, at 10:47 PM, Jay K wrote:
> > >>
> > >> RECORD HoldsTypeUID:
> > >> typeuid: [FIRST(TypeUID)..LAST(TypeUID)];
> > >> END?
> > >>
> > >>
> > >> But what if I'm really just a holder/passer of this thing,
> > >> and I never interpret it. Now TypeUID can't be changed
> > >> to LONGREAL or somesuch. Ideally some code wouldn't care
> > >> and still be correct.
> > >>
> > >>
> > >> The idea is to hold the thing, pass it on, without knowing
> > >> what it is. I want a field with the same type.
> > >>
> > >>
> > >> Why does it matter if TypeUID is representable in 32 bits?
> > >> Isn't the range the interface?
> > >> If it wasn't for readability of all the F's, I think
> > >> TypeUID = [-16_7FFFFFFF - 1 .. 16_7FFFFFFF]; is best.
> > >>
> > >>
> > >> Do we really care how few or how many bits that occupies?
> > >>
> > >>
> > >> Cstdint.int32_t I agree is a bit lazy.
> > >> Maybe something like Word.Shift(1, 31) .. Word.Not(Word.Shift(1, 31)) ?
> > >> A bit wordy but ok.
> > >> Maybe not correct. Not clear the start is negative.
> > >> Maybe needs to be more like:
> > >>
> > >> (-Word.Not(Word.Shift(1, 31))) - 1 .. Word.Not(Word.Shift(1, 31))
> > >>
> > >>
> > >> But these bit twiddlings might then confuse people.
> > >> So maybe having to count F's is better.. :(
> > >>
> > >>
> > >> You know, what about being 32bits in size would be part of an interface?
> > >> I don't think much/anything, but maybe.
> > >>
> > >>
> > >> Do we do/allow things like bit operations on it? Index into it a certain number
> > >> of bits? Take the address and assume BYTESIZE == 4?
> > >> I could see those maybe occurring. Maybe.
> > >>
> > >>
> > >> But I'm pretty sure the values are fairly abstract/opaque and the best anyone can do
> > >> is format them as strings and compare them for e.g. sorting purposes, but must
> > >> assume they are fairly sparse.
> > >>
> > >>
> > >> Btw, whenever the matter of "portability to signed-magnitude" or one's complement
> > >> comes up, I admit I get a bit confused. Two's complement is very ingrained
> > >> in my world view.
> > >>
> > >>
> > >> I agree TInt.T suffices.
> > >> Just like CARDINAL isn't needed, INTEGER suffices.
> > >> Cardinal.T just disallows negative numbers earlier, or at lower level, etc.
> > >> If we had to use TInt.T and add checks in a few places that it >= 0, ok.
> > >> It seems a little nice to put the checking in.
> > >>
> > >>
> > >> - Jay
> > >>
> > >>
> > >>
> > >>
> > >> > Subject: Re: M3CG
> > >> > From: hosking at cs.purdue.edu
> > >> > Date: Tue, 4 Sep 2012 13:05:34 -0400
> > >> > CC: m3devel at elegosoft.com
> > >> > To: jay.krell at cornell.edu
> > >> >
> > >> > On Sep 4, 2012, at 12:09 PM, Jay <jay.krell at cornell.edu> wrote:
> > >> >
> > >> > > "BITS" seems to not provide any useful value. It only makes it so you can't put the type into a portable unpacked record, which is what I was doing. I either have to pack my record somehow (counting target-dependent bits??) or lay out the fields carefully in some way that all targets allow. The problem was that some targets didn't like the layout.
> > >> >
> > >> > For M3CG.TypeUID the BITS 32 enforces that it is always representable in 32 bits. This is a fundamental invariant of the compiler and run-time system. Removing it might allow bug creep. You don't need to pack your record. You just need a field sufficiently large to hold the value of a type id. In this instance an INTEGER field would suffice, or you can use the same subrange.
> > >> >
> > >> > > Given the stated range, that you'll get enough bits to store it, what is the point in giving the size too? Isn't providing just the range and no particular size, sufficient & more abstract? Granted, I went the other way.
> > >> >
> > >> > Only if an implementation chooses to store a subrange in as few bits as necessary. Yes, CM3 does this, but it is not guaranteed by the language spec. Here the size is the critical invariant.
> > >> >
> > >> > > I don't remember but guessing:
> > >> > >
> > >> > >
> > >> > > Cardinal.T would be, like CARDINAL vs. INTEGER: only slightly subtly useful: same as TInt.T, but disallows negative numbers.
> > >> >
> > >> > But for what purpose. The target does not have CARDINAL. It only has a target integer.
> > >> >
> > >> > > Most likely I ran into places where a host integer/cardinal did not necessarily suffice, and so a target type was called for.
> > >> > >
> > >> > >
> > >> > > I know we have places that need to use target types but don't yet. Specifically for the sizes of things like records & arrays. Otherwise we have hacks where 64bit systems declare 32bit limits.
> > >> >
> > >> > Again, TInt.T suffices for all of these.
> > >> >
> > >> > >
> > >> > >
> > >> > > - Jay (briefly/pocket-sized-computer-aka-phone)
> > >> > >
> > >> > > On Sep 4, 2012, at 7:11 AM, Antony Hosking <hosking at cs.purdue.edu> wrote:
> > >> > >
> > >> > >> Jay,
> > >> > >>
> > >> > >> I've been looking over some of your changes to M3CG interfaces. You'll notice that I removed your import of Cstdlib.int into M3CG_Ops.i3. It does not belong there. The type as declared:
> > >> > >>
> > >> > >> TypeUID = BITS 32 FOR [-16_7fffffff-1 .. 16_7fffffff];
> > >> > >>
> > >> > >> is correct as is. It is a 32 bit value that encodes a particular subrange, REGARDLESS of target machine. It is improper to change that definition to rely on some C type.
> > >> > >>
> > >> > >> I also have some other questions. Why did you add the type Cardinal.T? This seems entirely unnecessary, since targets don't have a Cardinal type. The only type they have is a target integer. All other Modula-3 ordinal types should be simulated in the compiler using TInt.T. I would advise that it be removed. Also, I don't understand why you changed descriptions of the primitive types to use this Cardinal.T instead of the original CARDINAL to hold information about the bit size, alignment, and byte size. There is no situation in which a host will need to emulate the behavior of CARDINAL for these values (all can be represented using CARDINAL no matter what the host).
> > >> > >>
> > >> > >> I am concerned that these changes reflect a desire on your part to change the world to fit your specific needs (whether justified or not). The interfaces defined in M3CG were carefully designed and inherit from a long code-chain going back to the 1980's and have not seen huge changes since then. I strongly advise against making changes unless you have good reason, because there are a number of tools that rely on M3CG (not just those in the public sources).
> > >> > >>
> > >> > >> I am going to do a pass to revert some of your changes, since they are causing a number of my systems to fail to build. (cf. Niagara on tinderbox).
> > >> > >>
> > >> > >> I strongly advise that you try to use a private branch when developing new functionality. We as a community can then go through a revision process to vet substantive changes before merging them into the trunk.
> > >> > >>
> > >> > >> - Tony
> > >> > >>
> > >> >
> > >
> >