[M3devel] the LONGINT proposal

Mon Jan 11 18:32:13 CET 2010

Quick summary:

I agree, and you seem to be supporting the status quo (other than your discomfort with ORD/VAL) as defined at: http://www.cs.purdue.edu/homes/hosking/m3/reference/

On 11 Jan 2010, at 01:11, Randy Coleburn wrote:

> Tony:
>  
> Sorry, I have been too long-winded here.  To answer your questions succinctly: 
>  
> 1.  I can relax on the requirement for overflow checking, but programmers should never count on it silently wrapping around.

Agreed.

>  
> 2.  I think checked assignability gets us onto the slippery slope (see below).  Using differently named conversion operators would lessen some of the ugliness of ORD/VAL and also prevent confusion with their intended use as enumeration/INTEGER conversions.
>  
> Read on for the long-winded version…
>  
> According to NELSON (SPwM3), ORD and VAL convert between enumerations and INTEGERs, and INTEGER is all integers represented by the implementation.  So, the range of INTEGER is likely different for 8-bit, 16-bit, 32-bit, and 64-bit processors. 
>  
> Today we see 32-bit and 64-bit processors as predominant, but I remember the day when 8-bit and 16-bit were the norm.  Someday we may see 128-bit processors as the norm. 
>  
> (I’ve been cleaning up my basement office and ran across a box of 8-inch floppy disks.  When I showed them to my daughter she understood the meaning of “floppy” as opposed to the rigid 3.5-inch floppies of today.  But, I digress.)
>  
> On a 64-bit processor, this whole idea of LONGINT as 64 bits then becomes moot since INTEGER will be 64 bits also.  But on a 16-bit machine (assuming we had an implementation for one) the native word size would be less than the 32 bits we seem to take for granted now.
>  
> One problem is that one doesn’t really know the range of LONGINT unless we define it as some number of bits.  Rodney’s proposal simply stated that LONGINT was at least as big as INTEGER but could be larger.  So, on a 64-bit machine, are LONGINT and INTEGER really the same in terms of implementation, whereas on a 32-bit machine LONGINT would have 32 bits more than INTEGER?  What about a 128-bit machine?

What's wrong with using FIRST(LONGINT) and LAST(LONGINT) to determine the range?  This is currently implemented.
On a 64-bit machine the types LONGINT and INTEGER are still distinct in the current implementation, so cannot be assigned, though they do happen to have the same underlying representation.
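
As a minimal sketch of the two points above, assuming a compiler that implements LONGINT as described in this message (the module and constant names are made up for illustration):

```modula3
MODULE LongRange EXPORTS Main;

CONST
  (* Hypothetical names; the range comes from the type itself, *)
  (* not from an assumed bit width.                            *)
  LongLo = FIRST(LONGINT);
  LongHi = LAST(LONGINT);

VAR
  int: INTEGER := LAST(INTEGER);
  long: LONGINT := LongHi;

BEGIN
  long := LongLo;
  int := FIRST(INTEGER);
  (* long := int;   rejected at compile time: distinct types *)
  (* int := long;   rejected as well, even on a 64-bit target *)
END LongRange.
```

The commented-out assignments are the ones the current implementation refuses, whatever the underlying representation happens to be.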

>  I say all this to point out the obvious, namely that LONGINT and INTEGER are different types. 

Correct.  The current implementation treats them as completely separate.

>  Therefore, IMO the language must make it clear how these different types interact.
>  
> I would argue that
>    x: LONGINT := 23;
> is wrong!  The programmer should have to write
>    x: LONGINT := 23L;
> instead.

This is what we currently implement.

>  A subrange of LONGINT would be written as [23L..4200L] and would be a different type than the integer subrange [23..4200] even though the ranges are identical.

Also what we currently implement.

>  Likewise, IMO mixed arithmetic with the compiler deciding what to do is wrong.  The programmer should have to explicitly write conversions to a common type for arithmetic.

I agree, and this is the current implementation.

>  I have no problem with extending the existing operators to deal with LONGINT; it’s just that the result should be LONGINT.
> Given x: LONGINT := 49L;
>    INC(x) yields 50L
>    INC(x, 3L) yields 52L
>       note that INC(x, 3) would be a compile-time type error since 3 is an INTEGER and x is a LONGINT
>    (x + 20L) yields 69L
>       note that (x + 20) would be a compile-time type error since 20 is an INTEGER and x is a LONGINT
>    LAST(LONGINT) yields a LONGINT

This is exactly the current implementation.
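
Put together, the rules agreed on so far look roughly like this. It is only a sketch of the behavior described in this thread, with illustrative module, type, and variable names; later compiler versions may differ:

```modula3
MODULE LongArith EXPORTS Main;

TYPE
  IntRange  = [23 .. 4200];     (* subrange of INTEGER *)
  LongRange = [23L .. 4200L];   (* subrange of LONGINT: a different type *)

VAR
  x: LONGINT := 49L;            (* the literal needs the L suffix *)
  y: LONGINT;
  i: IntRange := 23;
  r: LongRange := 23L;

BEGIN
  (* The statements run in sequence, so the values accumulate. *)
  INC(x);          (* x = 50L *)
  INC(x, 3L);      (* x = 53L *)
  y := x + 20L;    (* y = 73L *)
  (* INC(x, 3);    rejected: 3 is an INTEGER literal *)
  (* y := x + 20;  rejected: mixed INTEGER/LONGINT arithmetic *)
  (* r := i;       rejected: the subranges have different base types *)
END LongArith.
```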

>  Now that I think about it more, I have a problem using ORD/VAL for the conversion since NELSON defines these as converting between enumerations and INTEGERs, and since LONGINT is a different type than INTEGER and quite possibly has a greater range than INTEGER.  Is the proposal to also allow enumerations to use the range of LONGINT ?  Enumerations currently are defined as having a range no greater than INTEGER.  To extend them to LONGINT would lose the obvious performance benefits of keeping them same range as native INTEGER.

I'm not sure that the current implementation conflicts with the definition of ORD/VAL.  What we currently permit is ORD(LONGINT) to do a *checked* conversion of a LONGINT to an INTEGER.  The optional type parameter of VAL can be LONGINT, which permits conversion of INTEGER to LONGINT.  I don't see how these conflict with the intention of ORD/VAL.  You can see the language spec for what is currently implemented at: http://www.cs.purdue.edu/~hosking/m3/reference/.

>  Maybe we should invent new names for the conversions between INTEGER and LONGINT.  Perhaps PROMOTE and DEMOTE or some such.  These are probably bad names, but I use them below simply to illustrate (feel free to come up with better names):
>    Given longInt: LONGINT;   and   int: INTEGER;
>    int := DEMOTE(longInt); would perform the conversion from LONGINT to INTEGER and would give a runtime range check error if longInt is too big/small to fit in an INTEGER.
>    longInt := PROMOTE(int) would always succeed in performing the conversion from INTEGER to LONGINT but would make the conversion explicit
>    int + DEMOTE(longInt) would yield an INTEGER result with all arithmetic being done in the range of INTEGER
>    longInt + PROMOTE(int) would yield a LONGINT result with all arithmetic being done in the range of LONGINT

I think ORD/VAL suffice...
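
For illustration, anyone who prefers the other names could define them in terms of ORD/VAL. This is a sketch only: Promote and Demote are hypothetical procedures taken from the suggestion above, not part of any library, and the code assumes ORD and VAL behave as described in this message:

```modula3
MODULE Convert EXPORTS Main;

(* Hypothetical helpers named after the quoted suggestion. *)

PROCEDURE Promote(i: INTEGER): LONGINT =
  BEGIN
    RETURN VAL(i, LONGINT);  (* widening: always succeeds *)
  END Promote;

PROCEDURE Demote(l: LONGINT): INTEGER =
  BEGIN
    RETURN ORD(l);  (* checked: run-time error if l does not fit in INTEGER *)
  END Demote;

VAR
  int: INTEGER := 57;
  longInt: LONGINT := 23L;

BEGIN
  longInt := longInt + Promote(int);  (* LONGINT arithmetic: 80L *)
  int := int + Demote(longInt);       (* INTEGER arithmetic: 137 *)
END Convert.
```

Either way the conversion stays visible at the point of use, which is the property we both care about.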

>  Now if we were to allow checked assignability (as Tony is leaning toward), I think we begin to get on the slippery slope.  How far do we extend this to the point that it is not clear in the expression of the code what is happening?  If I can write “int := longInt;” why not “int := 23L;” and why not “int := longInt + 57;” and is this different than “int := longInt + 57L;”? etc. etc.
>  
> I agree the ORD/VAL syntax is ugly, so that is another reason (besides them applying to enumerations only) we should use different names for the INTEGER/LONGINT conversions.
>  
> Sorry, I have been too long-winded here.  To answer your questions succinctly: 
>  
> 1.  I can relax on the requirement for overflow checking, but programmers should never count on it silently wrapping around.
>  
> 2.  I think checked assignability gets us onto the slippery slope.  Using differently named conversion operators would lessen some of the ugliness of ORD/VAL and also prevent confusion with their intended use as enumeration/INTEGER conversions.
>  
> Regards,
> Randy Coleburn
>  
> From: Tony Hosking [mailto:hosking at cs.purdue.edu] 
> Sent: Sunday, January 10, 2010 3:43 PM
> To: Randy Coleburn
> Cc: m3devel
> Subject: Re: [M3devel] the LONGINT proposal
>  
> Hi Randy,
>  
> As someone who has actually written Modula-3 programs for a living your opinions are always highly valued.  I agree with you in principle and aims, except for requiring overflow to be a checked run-time error.  The language definition already has a mechanism for handling this in the required FloatMode interface.  It is not something that the compiler should be involved in.  I also just now raised a question about perhaps having integer literals adapt their type to the context in which they are used.
>  
> I should point out that the current mainline implementation does exactly what you propose (except overflow checking).  It captures the fundamental spirit of Rodney's proposal but does not permit mixed arithmetic or assignment.  Can I ask what your issue is w.r.t. checked assignability?  I am still leaning in favor.  It is not much different from assignment from an INTEGER to a subrange, which requires no explicit conversion, though of course there is a run-time range check.  Having programmers explicitly write:
>  
> x: INTEGER := ORD(longint);
>  
> seems unnecessary when they could just write
>  
> x: INTEGER := longint;
>  
> This is similar in spirit to:
>  
> x: [lo..hi] := integer;
>  
>  
> On 10 Jan 2010, at 15:00, Randy Coleburn wrote:
> 
> 
> I've been trying to follow along on this topic.
> 
> Here are my thoughts:
> 
> 1.  LONGINT should be a distinct type different from INTEGER.
> 
> 2.  There should be no mixed arithmetic between the types.  The programmer must code conversions using ORD/VAL to make explicit the intention.  Don't rely on some ill-remembered built-in conversion rule.
> 
> 3.  Overflow should be a checked run-time error, not silently wrapped around.
> 
> 4.  WRT assignability, I think explicit conversions should be used.
> 
> These statements may make me unpopular with some who don't like to type much, but I've always hated the tradeoff of understandability for brevity in expression.
> 
> The important thing is not how fast we can type up a program; it is rather how hard it is to make a mistake.  I think the spirit of Modula-3 is that the language makes you a better programmer by forcing you to make your intentions explicit rather than relying on the compiler to infer your intentions.  We need correct and maintainable software, especially at the systems level.  Whatever is decided about LONGINT, we need to keep to the original design tenets of the language.  
> 
> And yes, I do think we need a LONGINT type, not just to deal with large file sizes.
> 
> But even for long-lived readers/writers, whatever type you choose for the index will eventually be insufficient, so you have to code for the possibility that the range of the long lived reader/writer exceeds the range of your index type.  That is just good programming.
> 
> I think sometimes that the new generation of programmers has been warped by what I call the "Microsoft Mentality" where you must expect that you need to reboot/restart every so often to maintain proper performance.  Programs should be written to run forever or until their job is completed or they are commanded to stop.  
> 
> As "we" begin to converge on the design changes, I like having something concrete to look at, à la Rodney's proposal.  Can we take that and keep tweaking it in these emails until we reach a final version acceptable to all?  To me this keeps the discussion focused, rather than spread across many different emails.  Thus, what I am trying to say is: put forth a numbered proposal, and have each subsequent email show an adjustment to that proposal rather than just discussing various aspects.  Perhaps we should vote on each proposed change, then decide to call a final vote on the whole thing.  Who should be involved in such votes?  Right now the main persons on the thread are Tony, Jay, Rodney, Mika, Hendrik, Olaf, John, and me.
> 
> My two cents.
> 
> Regards,
> Randy Coleburn
>  
