[M3devel] unsigned integers?

Rodney M. Bates rodney.bates at wichita.edu
Tue Jun 3 04:43:20 CEST 2008


It's already available, but takes a bit of care.

Modula-2 had an INTEGER and a CARDINAL, the latter unsigned and having
full unsigned range for its word size.  I think it was not fully
defined in the original language report.  It turned out to be a
semantic nightmare, and a nightmare to code too.  Been there, done that.

Modula-3's CARDINAL, as I'm sure everybody knows, is just the nonnegative
half range of INTEGER and behaves just like a subrange of INTEGER.
This solves a lot of problems, at the cost of taking away half the
unsigned range.

For full-range unsigned, you use the type INTEGER, but use the
operations in interface Word.  It's the same type, but different
interpretations applied to the same bit pattern.  It would be a
good practice to declare things as INTEGER when they are signed
and Word.T when unsigned, but there is nothing in the language to
require it.

In the case of addition and subtraction, builtin operators "+" and
"-" will produce the same results as Word.Plus and Word.Minus,
respectively, except that the conditions under which overflow is
detected will differ.  Not that that matters much, as, AFAIK, none
of the implementations detect integer overflows anyway.

So use INTEGER and the unary and binary operators if you want it to
be signed, and use Word.T and Word.<function> if you want it unsigned.

Of course, you are free to assign between a variable that is being
treated as signed and one treated as unsigned any time, without any
overflow checks, unless you program them yourself.  This certainly
violates the intuition about what safety means.  However, on closer
look, it is not at all the same degree of unsafety as, say, arrays
going out of bounds.

My definition of safe is that nothing can happen that cannot be
explained using only the concepts of the language.  For example,
to understand what an array bounds error actually does, you would
need to know that an array actually shares memory with other variables,
code, etc., and then a huge amount about how the compiler, linker,
loader, heap allocator, etc. lay things out, along with what is
declared in all the code, including libraries you are not working on.
None of this is part of a so-called high-level language.

The INTEGER/Word.T thing can be explained knowing about binary
twos-complement representation, which is definitely a machine-level
concept.  The Word interface defines the unsigned side of it as
a language concept too, but only hints at the signed representation.
But most importantly, neither the operators nor the functions in
Word ever produce any values that are not meaningful in the value
set of the type.

There is no really good linguistic solution to this problem, but
I think Modula-3's is definitely the least painful I have seen.


Jay wrote:
> What is the status and chance of a 32 bit integer with the range 
> 0..0xFFFFFFFF and of a 64 bit integer type with range 0 .. 
> 0xFFFFFFFFFFFFFFFF?
>  
>  
> Already available?
> Impossible to provide?
> Only available in unsafe modules?
> Only available with restricted operations in safe modules, and more 
> operations in unsafe modules?
>  
>  
> Specifically, I think looping from 0 to N is safe -- no overflow.
> Subtracting a CARDINAL from an "UNSIGNED" is safe -- cannot overflow.
> Adding "UNSIGNED" to "UNSIGNED" is not safe -- can overflow.
> Adding or subtracting an INTEGER to "UNSIGNED" is not safe.
> Subtracting "UNSIGNED" from "UNSIGNED" is not safe -- can overflow.
> Comparing UNSIGNED to UNSIGNED for any of =, <, >, !=, is safe.
> Comparing UNSIGNED to CARDINAL or INTEGER is safe, but must be done 
> carefully.
>   Specifically, UNSIGNED > LAST(CARDINAL) is not equal to any CARDINAL 
> or UNSIGNED.
> The unsafe operations above could be runtime checked perhaps.
>   I guess that's a different larger point/dilemma -- when to allow 
> potentially unsafe operations but with runtime checks vs. the compiler 
> just disallowing them entirely. e.g. adding an integer to an integer is 
> not even safe, but checked maybe at runtime (ok, at least assignment to 
> subrange types is checked). Yes, I know I know, the runtime checks on at 
> least many integer operations is yet lacking.
>  
> Is there any, um, value in such a type?
> Is it just me blindly trying to cast Modula-3 as C (yes), but there's no 
> actual value (uncertain)?
> 
>  
> Btw, I agree there's no point in this type in representing file sizes or 
> offsets. They should be at least 63 bit integers. One bit doesn't buy 
> much for file sizes. It might be something for address spaces though?
>  
> It bugs me to define types like uintptr_t = CARDINAL or uintptr_t = 
> INTEGER. It seems quite wrong.
> Perhaps the unsigned types larger than 16 bits just should not be 
> defined in Cstdint. ??
> But there is already Ctypes.unsigned_int, unsigned_long_long, whose 
> underlying type I think is signed, but which convention says you just 
> don't do signed operations on, but which the compiler doesn't enforce, 
> right?
>  
> You know, maybe Word.T should not be INTEGER but this mythological 
> UNSIGNED/UINT??????
>  
>  - Jay

-- 
-------------------------------------------------------------
Rodney M. Bates, retired assistant professor
Dept. of Computer Science, Wichita State University
Wichita, KS 67260-0083
316-978-3922
rodney.bates at wichita.edu


