[M3devel] codegen error (from Mika, new test p250)

Jay K jay.krell at cornell.edu
Tue Jan 11 21:48:31 CET 2011


 > The point is that we want to have an opaque type
 
 
 
I find there is a frequent, natural tension between opacity and the ease and efficiency of implementing the functions that do understand and manipulate the type.
 
 
 
If the type is heap allocated and only ever passed around by pointer, the tension is fairly cheap to resolve with one pointer-to-pointer cast/loophole.
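
To make the heap-allocated case concrete, here is a rough C++ sketch. The names (Counter, CounterImpl, reveal, hide) are made up for illustration and are not anything in cm3; the reinterpret_cast stands in for the one cast/loophole mentioned above.

```cpp
#include <cassert>

// ---- What a client sees (hypothetical interface) ----
struct Counter;                          // incomplete type: layout is hidden
Counter* counter_new();
void     counter_increment(Counter* c);
int      counter_value(const Counter* c);
void     counter_delete(Counter* c);

// ---- What the implementation sees ----
struct CounterImpl { int value; };

// The only places that cross the opacity boundary: one pointer cast each way.
static CounterImpl* reveal(Counter* c) {
    return reinterpret_cast<CounterImpl*>(c);
}
static const CounterImpl* reveal(const Counter* c) {
    return reinterpret_cast<const CounterImpl*>(c);
}
static Counter* hide(CounterImpl* p) {
    return reinterpret_cast<Counter*>(p);
}

Counter* counter_new()                   { return hide(new CounterImpl{0}); }
void     counter_increment(Counter* c)   { ++reveal(c)->value; }
int      counter_value(const Counter* c) { return reveal(c)->value; }
void     counter_delete(Counter* c)      { delete reveal(c); }

// Usage: the client never learns the size or the fields of Counter.
void check_counter() {
    Counter* c = counter_new();
    counter_increment(c);
    assert(counter_value(c) == 1);
    counter_delete(c);
}
```

Since the client only ever holds a pointer, the cast costs nothing at run time.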
 
 
 
If the type is potentially stack allocated and passed by value, the tension is less cheap to resolve -- the client still needs to know the size. And the code that does know the type inevitably will introduce an indirection in going through the opacity.
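
A rough C++ sketch of the by-value case, again with made-up names (Point, PointImpl, kPointSize, kPointAlign). The size and alignment have to leak to the client, and the implementation pays a copy (or a cast) each time it goes through the opaque storage. I use memcpy here to stay clear of aliasing problems; that is my choice for the sketch, not the only way to do it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// ---- What a client sees: size and alignment, nothing else ----
constexpr std::size_t kPointSize  = 8;   // assumed size of the hidden type
constexpr std::size_t kPointAlign = 4;   // assumed alignment of the hidden type
struct Point {
    alignas(kPointAlign) unsigned char storage[kPointSize];
};

// ---- What the implementation sees ----
struct PointImpl { std::int32_t x; std::int32_t y; };

// If the real type outgrows what clients were told, fail at compile time.
static_assert(sizeof(PointImpl) == kPointSize, "client-visible size is stale");
static_assert(alignof(PointImpl) <= kPointAlign, "client-visible alignment is too small");

Point point_make(std::int32_t x, std::int32_t y) {
    PointImpl impl{x, y};
    Point p;
    std::memcpy(p.storage, &impl, sizeof impl);   // fills all of storage
    return p;
}

std::int32_t point_sum(const Point& p) {
    PointImpl impl;
    std::memcpy(&impl, p.storage, sizeof impl);   // the extra step that opacity costs
    return impl.x + impl.y;
}

// Usage: Point lives on the stack and is passed by value.
void check_point() {
    Point p = point_make(3, 4);
    assert(point_sum(p) == 7);
}
```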
 
 
 
Fully revealing the type to clients is also not great, as it allows breaking through the abstraction boundary.
 
 
 
I can think of a few "good" examples.
Word.T for example is easily mis-used.
It would be nice if a client could not so easily use operator < on Word.T.
However, Word.T's transparency does have some good properties.
 
 
 
Another example in C++ is std::vector<T>::iterator.
std::vector<T>::iterator is nearly the exact same interface as T*, except that it isn't guaranteed to be assignable/mixable with T*. Some older implementations made it T*, allowing for such accidental mixing. Newer implementations wrap the T* in an arbitrary struct just to make it a different type.
There is a bit of inconvenience in the implementation therefore, but probably no inefficiency, esp. due to heavy inlinability.
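
A minimal sketch of the wrapping idea. WrappedIter is a made-up name, and this is not how any particular standard library actually writes its iterator; it only shows a pointer hidden in a struct so that it no longer mixes with T*.

```cpp
#include <cassert>
#include <type_traits>

template <typename T>
class WrappedIter {
public:
    explicit WrappedIter(T* p) : p_(p) {}         // explicit: no silent T* -> iterator
    T& operator*() const { return *p_; }
    WrappedIter& operator++() { ++p_; return *this; }
    friend bool operator==(WrappedIter a, WrappedIter b) { return a.p_ == b.p_; }
    friend bool operator!=(WrappedIter a, WrappedIter b) { return a.p_ != b.p_; }
private:
    T* p_;                                        // the only data member
};

// Accidental mixing with raw pointers is rejected in both directions.
static_assert(!std::is_convertible<int*, WrappedIter<int>>::value,
              "a raw pointer must not convert implicitly to the iterator");
static_assert(!std::is_convertible<WrappedIter<int>, int*>::value,
              "the iterator must not convert implicitly to a raw pointer");
// On common ABIs the wrapper is exactly the size of the pointer it holds.
static_assert(sizeof(WrappedIter<int>) == sizeof(int*),
              "wrapper should add no storage");

int sum(WrappedIter<int> first, WrappedIter<int> last) {
    int total = 0;
    for (; first != last; ++first) total += *first;
    return total;
}

// Usage: same loop shape as with int*, but a distinct type.
void check_iter() {
    int a[] = {1, 2, 3};
    assert(sum(WrappedIter<int>(a), WrappedIter<int>(a + 3)) == 6);
}
```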
 
 
 
This is probably an area where multiple languages could use improvement.
 
 
 
Another thing to consider is ABI compatibility with C.
Can I pass LONGINT to/from C and map it to "long long" or "__int64"?
And alignment as was pointed out -- should alignment be 32 bits or 64 bits on 32bit targets?
 
 
 
Generally, 32bit is actually ok. But maybe not for atomic operations.
I'd kind of rather the alignment be 64bits, but this is a controversial area, related to how much target-dependent code we have and how we interface with C -- ie: cloning third party headers or cloning our own headers.
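
To show what is at stake for C interop, here is a small C++ sketch of how the alignment choice changes record layout. The struct names are made up. As far as I know, on some 32bit ABIs (i386 System V, for example) the long long member in the first struct lands at offset 4, while on 64bit targets it lands at offset 8; the second struct forces 64bit alignment everywhere.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

static_assert(sizeof(long long) * CHAR_BIT == 64,
              "this sketch assumes a 64-bit long long");

// Alignment of 'value' is whatever the target's C ABI says.
struct NaturalLayout {
    char      tag;
    long long value;
};

// Alignment of 'value' is forced to 64 bits regardless of target.
struct ForcedLayout {
    char                 tag;
    alignas(8) long long value;
};

static_assert(alignof(ForcedLayout) == 8, "forced layout is 64-bit aligned");
static_assert(offsetof(ForcedLayout, value) == 8, "7 bytes of padding after tag");

std::size_t natural_offset() { return offsetof(NaturalLayout, value); }
std::size_t forced_offset()  { return offsetof(ForcedLayout, value); }

// Usage: the two layouts agree only where the ABI already aligns to 64 bits.
void check_layout() {
    assert(forced_offset() == 8);
    assert(natural_offset() == 4 || natural_offset() == 8);
}
```

If Modula-3 picks 64bit alignment and the C ABI picks 32bit, records containing a LONGINT would not match the corresponding C structs without extra care.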
 
 
Can the frontend be better structured to allow for an arbitrary number of integer and floating point types? So that adding "int128" and "float256" won't be difficult?
 
 
You know -- replace a series of ifs or case statements with loops over arrays?
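
Something along these lines, sketched in C++ rather than in the frontend's own Modula-3. The table, the field names and the type names are all hypothetical; none of this is taken from the actual cm3 sources.

```cpp
#include <cassert>
#include <cstring>

struct TypeInfo {
    const char* name;
    unsigned    bits;
    bool        is_float;
    bool        is_signed;
};

// One row per machine type, ordered by size within each kind.
// Adding "int_128" or "float_256" would be one more row, not one more
// branch in every if/case that currently enumerates the types.
static const TypeInfo kTypes[] = {
    {"word_8",    8, false, false},
    {"int_8",     8, false, true },
    {"word_16",  16, false, false},
    {"int_16",   16, false, true },
    {"word_32",  32, false, false},
    {"int_32",   32, false, true },
    {"word_64",  64, false, false},
    {"int_64",   64, false, true },
    {"float_32", 32, true,  true },
    {"float_64", 64, true,  true },
};

// Look a type up by name; returns nullptr if the table has no such row.
const TypeInfo* find_type(const char* name) {
    for (const TypeInfo& t : kTypes)
        if (std::strcmp(t.name, name) == 0) return &t;
    return nullptr;
}

// Smallest signed integer type with at least bits_needed bits, or nullptr.
const TypeInfo* smallest_signed_int(unsigned bits_needed) {
    for (const TypeInfo& t : kTypes)
        if (!t.is_float && t.is_signed && t.bits >= bits_needed) return &t;
    return nullptr;
}

// Usage: a [0L..0L] style subrange that needs more than 32 bits gets int_64.
void check_types() {
    assert(std::strcmp(smallest_signed_int(33)->name, "int_64") == 0);
    assert(find_type("int_128") == nullptr);
}
```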
 
 
 
 - Jay




----------------------------------------
> From: hosking at cs.purdue.edu
> Date: Tue, 11 Jan 2011 15:18:09 -0500
> To: jay.krell at cornell.edu
> CC: m3devel at elegosoft.com
> Subject: Re: [M3devel] codegen error (from Mika, new test p250)
>
> No, that would be too late. I certainly don't want automatic loopholing in the code generator. I want explicit, checked types, as we currently have, to avoid the sorts of errors it has now thrown up.
>
> I do know what the problem is.
>
> On Jan 11, 2011, at 2:36 AM, Jay K wrote:
>
> >
> > ps: can M3CG_Check be pushed into service to do the work?
> > You know, it is already tracking everything and where the detection/error is currently.
> > Can we make it, like, required, and have it cast/loophole as needed?
> > Or is this too late and it might miss problems, such as incorrect/unchecked narrowing?
> >
> >
> > - Jay
> >
> >
> > ----------------------------------------
> >> From: jay.krell at cornell.edu
> >> To: hosking at cs.purdue.edu
> >> CC: m3devel at elegosoft.com
> >> Subject: RE: [M3devel] codegen error (from Mika, new test p250)
> >> Date: Tue, 11 Jan 2011 07:33:57 +0000
> >>
> >>
> >> Quibbling with the details:
> >>
> >> Is this legal, on a 32bit system?
> >>
> >> Longint.T = Longword.T = BITS 64 FOR ARRAY [0..1] OF [16_00000000..16_FFFFFFFF]
> >>
> >> And even if it is legal, it is of dubious correctness, eh?
> >> For implementation of this in Modula-3, you want the low word to be unsigned and the high word to be signed.
> >> Granted, you want full range unsigned, so, like, Word.T, but you want 32 bits.
> >>
> >> There should therefore be, like, interfaces Word32 and Word64 (ok, you already have Long).
> >>
> >> As well, the implementation on 64bit targets will perhaps suffer.
> >>
> >> As well, does it work for 64bit targets? Isn't [00000...FFFFFFFF] 64 bits?
> >> Maybe you'd say:
> >> Longint.T = Longword.T = BITS 64 FOR ARRAY [0..1] OF BITS 32 FOR [16_00000000..16_FFFFFFFF]
> >>
> >>
> >> And then, *really*, you want the order/significance of the words to be endian-specific.
> >>
> >> So, you'd want perhaps 3 implementations:
> >>
> >> 64bit: Longint.T = INTEGER
> >> 32bit big endian: Longint.T = RECORD high: INTEGER; low: Word.T; END;
> >> 32bit little endian: Longint.T = RECORD low: Word.T; high: INTEGER; END;
> >>
> >>
> >>
> >> or maybe just two:
> >>
> >> big endian: Longint.T = RECORD high: some-32bit-signed-type; low: some-32bit-pseudo-unsigned-type; END;
> >> little endian: Longint.T = RECORD low: some-32bit-pseudo-unsigned-type; high: some-32bit-signed-type; END;
> >>
> >>
> >>
> >> Ultimately, the very very very general true point is:
> >> extension via library is probably generally easier
> >> BUT only makes for good results in an adequate language, e.g. one with operator overloading!
> >>
> >>
> >> Surely surely the compiler isn't so unmalleable?
> >> i.e. we aren't stuck with the language as-is because the compiler is too hard to change?
> >>
> >>
> >> I can't argue too strongly in favor of LONGINT.
> >>
> >>
> >> But..definitely there should be some reasonable convenient efficient way for dealing with 64bit integers.
> >> 32bit C implementations have had very good mechanisms for 25+ years.
> >> It does seem a shame we seemingly can't/won't compete.
> >>
> >>
> >> And still, interfaces Rd/Wr, I believe, still need work.
> >> Probably to add a parallel set of functions ending in "L".
> >>
> >> - Jay
> >>
> >>
> >> ----------------------------------------
> >>> From: hosking at cs.purdue.edu
> >>> Date: Tue, 11 Jan 2011 00:58:35 -0500
> >>> To: jay.krell at cornell.edu
> >>> CC: m3devel at elegosoft.com
> >>> Subject: Re: [M3devel] codegen error (from Mika, new test p250)
> >>>
> >>> I know what the problem is. The fix is not particularly pretty, and will entail tracking the stack types for integers (Int32 or Int64) throughout code generation.
> >>>
> >>> This all leads me to wonder why we don't simply back LONGINT out of the language.
> >>> [I had mentioned my increasing unease with LONGINT in a prior e-mail a long time ago.]
> >>>
> >>> We can replace LONGINT with Longint and Longword:
> >>>
> >>> Longint.T = Longword.T = BITS 64 FOR ARRAY [0..1] OF [16_00000000..16_FFFFFFFF]
> >>>
> >>> and define signed operations in Longint and unsigned operations in Longword.
> >>> These can be implemented efficiently as wrappers to appropriate C routines operating on "long long" or inlined if performance is a particular concern. We can provide conversion routines to/from INTEGER as needed.
> >>>
> >>> Other than handling 64-bit file offsets, etc., does anyone really make use of LONGINT that argues convincingly for it to be retained?
> >>>
> >>> On Jan 8, 2011, at 6:55 PM, Jay K wrote:
> >>>
> >>>>
> >>>> Thank you much. Please notice there are 2 or 3 similar problems. This shows only 1.
> >>>> See test p250.
> >>>>
> >>>>
> >>>> MODULE Main;
> >>>>
> >>>> PROCEDURE F1(<*UNUSED*>x: LONGINT) = BEGIN END F1;
> >>>>
> >>>> PROCEDURE F2() =
> >>>> <*UNUSED*>VAR x: [0L..0L];
> >>>> BEGIN
> >>>> END F2;
> >>>>
> >>>> PROCEDURE F3() =
> >>>> VAR x: [0L..0L];
> >>>> BEGIN
> >>>> F1(x);
> >>>> END F3;
> >>>>
> >>>> BEGIN
> >>>> F1(0L);
> >>>> F2();
> >>>> F3();
> >>>> END Main.
> >>>>
> >>>> i.e. initializing local 0L..0L.
> >>>> Passing 0L..0L parameter.
> >>>>
> >>>> - Jay
> >>>>
> >>>>
> >>>> ----------------------------------------
> >>>>> From: hosking at cs.purdue.edu
> >>>>> Date: Sat, 8 Jan 2011 16:59:33 -0500
> >>>>> To: jay.krell at cornell.edu
> >>>>> CC: m3devel at elegosoft.com
> >>>>> Subject: Re: [M3devel] codegen error (from Mika, new test p250)
> >>>>>
> >>>>> I'll look into this one.
> >>>>>
> >>>>> Antony Hosking | Associate Professor | Computer Science | Purdue University
> >>>>> 305 N. University Street | West Lafayette | IN 47907 | USA
> >>>>> Office +1 765 494 6001 | Mobile +1 765 427 5484
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Jan 8, 2011, at 12:17 AM, Jay K wrote:
> >>>>>
> >>>>>>
> >>>>>> fyi, small repro:
> >>>>>>
> >>>>>>
> >>>>>> MODULE Main;
> >>>>>>
> >>>>>> VAR x: [0L..0L];
> >>>>>>
> >>>>>> PROCEDURE F2(<*UNUSED*>x: LONGINT) = BEGIN END F2;
> >>>>>>
> >>>>>> BEGIN
> >>>>>> F2(x);
> >>>>>> END Main.
> >>>>>>
> >>>>>> (32) start_call_direct procedure:0x4 level:0
> >>>>>> (33) load var:0x2 offset:0x1A0(416) src_t:word_8 dst_t:int_32
> >>>>>> (34) comment comment:********* M3CG_Check ERROR *********** bad stack: expected [ Int64 ] got [ Int32 ]
> >>>>>> (35) pop_param type:int_64
> >>>>>>
> >>>>>>
> >>>>>> - Jay
> >>>>>>
> >>>>>>
> >>>>>> ________________________________
> >>>>>>> From: jay.krell at cornell.edu
> >>>>>>> To: m3devel at elegosoft.com; mika at async.caltech.edu
> >>>>>>> Subject: RE: codegen error (from Mika, new test p250)
> >>>>>>> Date: Thu, 6 Jan 2011 01:21:00 +0000
> >>>>>>>
> >>>>>>> fyi:
> >>>>>>>
> >>>>>>> jbook2:p250 jay$ rm -rf I386_DARWIN/
> >>>>>>> jbook2:p250 jay$ cm3 -keep
> >>>>>>> --- building in I386_DARWIN ---
> >>>>>>>
> >>>>>>> new source -> compiling Main.m3
> >>>>>>> "../Main.m3", line 1: 1 code generation error
> >>>>>>> 1 error encountered
> >>>>>>> compilation failed => not building program "pgm"
> >>>>>>> Fatal Error: package build failed
> >>>>>>> jbook2:p250 jay$ cm3cg -y I386_DARWIN/Main.mc 2>&1 | grep -i comment
> >>>>>>> (4) comment comment:module global constants
> >>>>>>> (6) comment comment:module global data
> >>>>>>> (27) comment comment:F1
> >>>>>>> (34) comment comment:********* M3CG_Check ERROR *********** bad stack: expected [ Int64 ] got [ Int32 ]
> >>>>>>> (43) comment comment:F2
> >>>>>>> (73) comment comment:Main_M3
> >>>>>>> (74) comment comment:module main body Main_M3
> >>>>>>> (83) comment comment:global constant type descriptor
> >>>>>>> (85) comment comment:global data type descriptor
> >>>>>>> (87) comment comment:module global constants
> >>>>>>> (90) comment comment:procedure names
> >>>>>>> (94) comment comment:procedure table
> >>>>>>> (101) comment comment:file name
> >>>>>>> (103) comment comment:type map for _t0174bdf4
> >>>>>>> (106) comment comment:type description for _t0174bdf4
> >>>>>>> (110) comment comment:module global data
> >>>>>>> (120) comment comment:typecell for _t0174bdf4
> >>>>>>> (141) comment comment:load map
> >>>>>>> (4) comment comment:module global constants
> >>>>>>> (6) comment comment:module global data
> >>>>>>> (27) comment comment:F1
> >>>>>>> (34) comment comment:********* M3CG_Check ERROR *********** bad stack: expected [ Int64 ] got [ Int32 ]
> >>>>>>> (43) comment comment:F2
> >>>>>>> (73) comment comment:Main_M3
> >>>>>>> (74) comment comment:module main body Main_M3
> >>>>>>> (83) comment comment:global constant type descriptor
> >>>>>>> (85) comment comment:global data type descriptor
> >>>>>>> (87) comment comment:module global constants
> >>>>>>> (90) comment comment:procedure names
> >>>>>>> (94) comment comment:procedure table
> >>>>>>> (101) comment comment:file name
> >>>>>>> (103) comment comment:type map for _t0174bdf4
> >>>>>>> (106) comment comment:type description for _t0174bdf4
> >>>>>>> (110) comment comment:module global data
> >>>>>>> (120) comment comment:typecell for _t0174bdf4
> >>>>>>> (141) comment comment:load map
> >>>>>>>
> >>>>>>>
> >>>>>>> - Jay
> >>>>>>>
> >>>>>>>
> >>>>>>>> Date: Thu, 6 Jan 2011 01:26:15 +0000
> >>>>>>>> To: m3commit at elegosoft.com
> >>>>>>>> From: jkrell at elego.de
> >>>>>>>> Subject: [M3commit] CVS Update: cm3
> >>>>>>>>
> >>>>>>>> CVSROOT: /usr/cvs
> >>>>>>>> Changes by: jkrell at birch. 11/01/06 01:26:15
> >>>>>>>>
> >>>>>>>> Modified files:
> >>>>>>>> cm3/m3-sys/m3tests/src/p2/p250/: Main.m3
> >>>>>>>>
> >>>>>>>> Log message:
> >>>>>>>> slightly simpler, same error
> >>>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >

