[M3devel] long vs. INTEGER? ranges vs. word/integer?

Jay jay.krell at cornell.edu
Sat Jan 3 13:14:02 CET 2009


Sorry, big mistakes, below:
 
NOT sizeof(short) >= 2
NOT sizeof(long) >= 4
but rather
sizeof(short) * CHAR_BIT >= 16
sizeof(long) * CHAR_BIT >= 32
 
A conforming implementation, as I understand, could have:
#define CHAR_BIT 32
sizeof(char) = sizeof(short) = sizeof(int) = sizeof(long) = 1
 
Though /practically/ speaking, CHAR_BIT is always 8 and what I said is true.
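
A minimal sketch in C of the corrected rule, assuming only <limits.h>. The function names width_in_bits and meets_minimum_widths are made up for illustration. They take CHAR_BIT and the sizes as parameters so the hypothetical CHAR_BIT == 32 implementation can be checked alongside the host:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Width of a type in bits: its sizeof (counted in chars) times the
   number of bits per char. Hypothetical helper, not a standard API. */
size_t width_in_bits(size_t size_in_chars, size_t char_bit)
{
    assert(char_bit > 0);
    return size_in_chars * char_bit;
}

/* Returns 1 when the minimum widths hold: char >= 8 bits,
   short >= 16 bits, int >= 16 bits, long >= 32 bits. */
int meets_minimum_widths(size_t char_bit, size_t short_size,
                         size_t int_size, size_t long_size)
{
    return char_bit >= 8
        && width_in_bits(short_size, char_bit) >= 16
        && width_in_bits(int_size, char_bit) >= 16
        && width_in_bits(long_size, char_bit) >= 32;
}

/* The same check applied to whatever compiler builds this file. */
int host_meets_minimum_widths(void)
{
    return meets_minimum_widths(CHAR_BIT, sizeof(short),
                                sizeof(int), sizeof(long));
}
```

With these, meets_minimum_widths(32, 1, 1, 1) is true even though every sizeof is 1, while meets_minimum_widths(8, 1, 1, 1) is false.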
 
You know..what is it..the Matlab model? Where everything is a 64 bit
double with >= 32 bit mantissa, therefore everything could be a double?
 
 - Jay



From: jay.krell at cornell.edu
To: eiserlohpp at yahoo.com; m3devel at elegosoft.com
Date: Sat, 3 Jan 2009 08:04:37 +0000
Subject: Re: [M3devel] long vs. INTEGER? ranges vs. word/integer?

Oh I should have read more closely to correct more..

 > In "C" a
 > long is specified by the language spec to be an integral
 > type big enough to hold a pointer. Longs in "C" may be
 > bigger than a pointer, but they must be at least that size.

No. This is very false.
There are the standards, and there is the practical reality,
and neither agrees with your assertion.

In "C89", ANSI C prior to C9x, the integral types are:
  char -- of weird/uncertain signedness
  unsigned char
  signed char
  short (aka short int aka signed short ? aka signed short int)
  unsigned short (aka unsigned short int)
  int (aka signed int)
  unsigned (aka unsigned int)
  long (aka long int aka signed long int)
  unsigned long (aka unsigned long int)

limits.h #defines CHAR_BIT, the number of bits in a char (or signed char or unsigned char).
CHAR_BIT must be at least 8.
Practically speaking it is always exactly 8, but..perhaps not on a Cray.

sizeof(char) == 1, by definition
  if char is more than 8 bits, then CHAR_BIT must be adjusted, not sizeof(char)
sizeof(short) >= 2
sizeof(int) >= sizeof(short)
sizeof(long) >= sizeof(int)
sizeof(long) >= 4

There are some types size_t and ptrdiff_t that I'm not sure what the standard says.
Something like, size_t can hold the index for any array.
However, the standard doesn't really speak of address spaces and their sizes, so just what is the maximum index for an array, I don't know.

Practically speaking, size_t and ptrdiff_t are the exact same size as a pointer.
Practically speaking, all pointers are of the same size. Though that isn't clear in the standard as I understand.
Practically speaking, all pointers roundtrip through any pointer type, as well as size_t or ptrdiff_t.

There is no relationship between long and pointers, in the standard.
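
A small C sketch of the roundtrip claims, under stated assumptions. The roundtrip through void* is what the standard guarantees for object pointers. The roundtrip through an integer uses uintptr_t from C99's <stdint.h> (an optional type) rather than size_t or ptrdiff_t, since those are only pointer sized "practically speaking". All function names here are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Object pointer -> void* -> object pointer: guaranteed to compare equal. */
int *roundtrip_via_void(int *p)
{
    void *v = p;
    return (int *)v;
}

/* Pointer -> uintptr_t -> pointer: guaranteed for void* where uintptr_t
   exists (it is optional in C99, but present on common platforms). */
int *roundtrip_via_uintptr(int *p)
{
    uintptr_t bits = (uintptr_t)(void *)p;
    return (int *)(void *)bits;
}

/* Reports whether size_t and ptrdiff_t happen to be pointer sized on the
   host. This is an observation about one platform, not a guarantee. */
int size_types_are_pointer_sized(void)
{
    return sizeof(size_t) == sizeof(void *)
        && sizeof(ptrdiff_t) == sizeof(void *);
}

/* Returns 1 when both roundtrips hand back the original pointer. */
int roundtrips_ok(void)
{
    int x = 42;
    assert(*roundtrip_via_void(&x) == 42);
    return roundtrip_via_void(&x) == &x
        && roundtrip_via_uintptr(&x) == &x;
}
```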
Practically speaking:
CHAR_BIT == 8
sizeof(short) == 2
sizeof(int) == 4
sizeof(long) == 4 in all 32bit and 64bit (and 16bit) Windows programming environments
sizeof(long) == sizeof(void*) in most non-Windows programming environments

C9x introduces several new types and typedefs.
As I understand, many of the typedefs are in stdint.h, and many are optional.

long long I presume is specced as:
sizeof(long long) >= sizeof(long)
sizeof(long long) >= 8

Practically speaking, sizeof(long long) == 8.

stdint.h goes nuts with the obvious options.
It defines things like:
  int_fastN_t -- signed type that is "fast" and at least N bits in size
  uint_fastN_t -- unsigned ditto
  int_leastN_t -- smallest signed integer that is at least N bits in size
  uint_leastN_t -- unsigned ditto
  intN_t -- signed integer exactly N bits in size
  uintN_t -- unsigned ditto
  intptr_t -- signed integer big enough to hold a pointer
  uintptr_t -- unsigned ditto

Not every type is mandatory though, and "fast" is vague.

Then, they go even further, they have #defines for the printf and scanf strings for each type..I think. At least they have a bunch of #defines to abstract printf/scanf, not sure they have every single one.
Boom goes the combinatorial explosion.

So, personally, I think this is overkill. I think you should have just the exact sized types, plus the pointer-sized types.
Throw out the "least" and "fast" types.

Printf/scanf is thornier.
It does seem a real problem, with no pretty solution.
When I am just printing for debugging purposes, I often just hardcode %u or %lu and cast my data. I usually don't care if I lose the upper 32 bits, if they are even there, when I am just debugging.
Microsoft long ago introduced %I64u and %I64d, but everyone else uses something else, I guess %llu and %lld for long long, or maybe %Lu and %Ld.

"long long" and %ll and "least"/"fast" are all nice and abstract, but again, I think it is all overkill in terms of abstraction.
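
For reference, the printf/scanf #defines live in C99's <inttypes.h>. The sketch below shows PRIu64 and SCNu64 for uint64_t, assuming a C99 library. The helper names format_u64, parse_u64 and format_roundtrip_ok are hypothetical:

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Formats a uint64_t with the macro from <inttypes.h>, so the same source
   works whether uint64_t is unsigned long or unsigned long long.
   Returns what snprintf returns. */
int format_u64(char *buf, size_t buf_size, uint64_t value)
{
    assert(buf != NULL && buf_size > 0);
    return snprintf(buf, buf_size, "%" PRIu64, value);
}

/* Parses a decimal uint64_t with the matching scanf macro.
   Returns 1 on success, 0 otherwise. */
int parse_u64(const char *text, uint64_t *out)
{
    return sscanf(text, "%" SCNu64, out) == 1;
}

/* Formats and re-parses a value; returns 1 when nothing was lost. */
int format_roundtrip_ok(uint64_t value)
{
    char buf[32]; /* 20 digits plus terminator fits easily */
    uint64_t parsed = 0;
    if (format_u64(buf, sizeof buf, value) < 0)
        return 0;
    return parse_u64(buf, &parsed) && parsed == value;
}
```

The macros expand to string literals, which is why they are glued to "%" by literal concatenation instead of being part of the format string itself.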
As you said, persistence formats are a large concern.
Persistence formats frequently guide in-memory representations.
Furthermore, I theorize that the abstractions aren't all that useful for in-memory data either.
If you want your capacity to grow as address space grows, use size_t.
Otherwise, pick a "reasonable" type.
uint32_t is almost always reasonably efficient and offers adequate capacity.
If you aren't sure, well go with uint64_t, it is still /fairly/ efficient on current 32 bit systems, offers nearly infinite capacity, and is perfectly efficient on 64bit systems..and 32bit should be trending downward..maybe.
64bit is still memory-inefficient, so if you have large structs/arrays and certainly are comfy with fewer bits, use small.

So..the decision isn't always trivial, but I still suspect using exactly sized types is plenty adequate, and that worrying about the other types is too much.

gcc internally uses pairs of longs to portably represent 64 bits.
And they also have "wide" integers -- the host's widest integral type.
I forget what they use them for, but I'm sure they are compatible with a 32 bit "wide" -- gcc is after all portable to a C compiler without long long, but probably not to a compiler with 16 bit int (unless maybe they use long everywhere in that situation? I don't know.)

 - Jay
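
To illustrate the general idea of carrying 64 bits in a pair of narrower integers, here is a rough C sketch. It is not gcc's actual code: the u64_pair type and its functions are invented for this example, it uses uint32_t halves instead of longs, and it leans on uint64_t only to convert and check results:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical representation: a 64-bit unsigned value split in two. */
typedef struct {
    uint32_t low;  /* bits 0..31 */
    uint32_t high; /* bits 32..63 */
} u64_pair;

/* Joins the halves back into a native 64-bit value (for checking only). */
uint64_t pair_to_u64(u64_pair p)
{
    return ((uint64_t)p.high << 32) | p.low;
}

/* Splits a native 64-bit value into halves. */
u64_pair pair_from_u64(uint64_t v)
{
    u64_pair p;
    p.low = (uint32_t)(v & 0xFFFFFFFFu);
    p.high = (uint32_t)(v >> 32);
    assert(pair_to_u64(p) == v);
    return p;
}

/* Adds two pairs using only 32-bit arithmetic, modulo 2^64. */
u64_pair pair_add(u64_pair a, u64_pair b)
{
    u64_pair r;
    r.low = a.low + b.low;                      /* wraps modulo 2^32 */
    r.high = a.high + b.high + (r.low < a.low); /* add carry from low half */
    return r;
}

/* Returns 1 when the pair addition agrees with native 64-bit addition. */
int pair_add_matches_native(uint64_t a, uint64_t b)
{
    u64_pair sum = pair_add(pair_from_u64(a), pair_from_u64(b));
    return pair_to_u64(sum) == (uint64_t)(a + b);
}
```

The carry test works because an unsigned sum that wrapped is always smaller than either operand.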



From: jay.krell at cornell.edu
To: eiserlohpp at yahoo.com; m3devel at elegosoft.com
Subject: RE: [M3devel] long vs. INTEGER? ranges vs. word/integer?
Date: Sat, 3 Jan 2009 07:45:26 +0000

Peter,

Huh? Of course not.

"long long" is always exactly 64 bits, except on systems where it doesn't exist at all.
There isn't really any 128 bit integer type in widespread use, aside from multi precision arithmetic libraries and such.

long on Windows is always 32bits, as I said.

The pointer sized types are: size_t, ptrdiff_t, ssize_t, INTEGER, and any pointer.

Surely sizeof(long) == sizeof(void*) was considered for Win64, but rejected due to the wider spread source incompatibility it would have caused. Yes, I know, every other 64 bit system took a different route -- Alpha, AIX, Solaris, IRIX, Darwin -- but they all probably had far less code to deal with.

I have also programmed with a 16bit int, but nobody seems to care about portability to that these days/decades/centuries. If folks here want to imagine Modula-3 is portable to such a system, with 16 bit INTEGER, I am willing to entertain discussion and maybe even code as to the ramifications..maybe..

 - Jay

> Date: Fri, 2 Jan 2009 22:52:26 -0800
> From: eiserlohpp@
> To: m3devel@
> Subject: Re: [M3devel] long vs. INTEGER? ranges vs. word/integer?
>
> Jay,
>
> Please don't make size_t 128 bits in size. In "C" a
> long is specified by the language spec to be an integral
> type big enough to hold a pointer. Longs in "C" may be
> bigger than a pointer, but they must be at least that size.
>
> NOTE: an int (integer) does not have that guarantee.
>
> In fact, I once programmed on a platform (Amiga) that had
> two different compilers one used 16-bits and other 32 for
> their integers. You can imagine the difficulties.
>
> Windows machines, as you have been using, give an address
> space of 32 bits to user space. This is regardless of how
> much RAM it may actually have. I would expect that on a
> Win64 platform, pointers are 64 bits, and "long" is also
> 64 bits.
>
> On my AMD64_LINUX box, pointers are 64 bits, and therefore
> a long is 64, and a long long is 128. The system type
> size_t is 64 bits.
> If you attempt to map Cstddef.size_t
> to 128 bits it will not only break the syscall interfaces
> to the kernel, but also be a waste of space.
>
> From that standpoint the type definitions
> size_t = Ctypes.unsigned_long;
> ..., etc.
>
> are correct.
>
> Please don't define
> size_t = Ctypes.unsigned_long_long;
>
>
> The "C" include files
> /usr/includes/bits/types.h
> /usr/includes/bits/typesizes.h
>
> play preprocessor games to ensure that the ssize_t are
> defined to use the "natural" word size.
>
> Other types get defined to fixed sizes (ie, _int32 must
> always be 32 bits) otherwise external interfaces (file
> formats, inodes, ..., etc) would be corrupted. For that
> reason explicitly sized types need to be defined.
>
> Actually, GNU/Modula-2 recently defined explicitly sized
> types.
>
>
> Peter Eiserloh.
>
>
>
> Date: Fri, 2 Jan 2009 23:05:24 +0000
> From: Jay <jay.krell at cornell.edu>
> Subject: [M3devel] long vs. INTEGER? ranges vs. word/integer?
> To: m3devel <m3devel at elegosoft.com>, Tony <hosking at cs.purdue.edu>
> Message-ID: <COL101-W1612F9835414ED341527DBE6E20 at phx.gbl>
> Content-Type: text/plain; charset="iso-8859-1"
>
>
> I'd like to avoid using "long" and "ulong" anywhere.
> On Unix, they are always pointer sized.
> On Windows, they are always 32 bits.
>
> This divergence of meaning, I think, renders them useless.
>
> I believe for pointer-sized integers, the right types are any of:
> unsigned: size_t, Word.T
> signed: INTEGER, ssize_t, ptrdiff_t
> For 32bit integers: int32_t and uint32_t, perhaps int.
>
> There is arguably some ambiguity if you consider 16bit platforms.
>
> Now, I noticed we have:
> INTERFACE Cstddef;
>
> size_t = Ctypes.unsigned_long;
> ssize_t = Ctypes.long;
> ptrdiff_t = Ctypes.long;
>
> I would like to change this, either to:
>
> 32bits:
> size_t = Ctypes.unsigned_int;
> ssize_t = Ctypes.int;
> ptrdiff_t = Ctypes.int;
>
> 64bits:
> size_t = Ctypes.unsigned_long_long;
> ssize_t = Ctypes.long_long;
> ptrdiff_t = Ctypes.long_long;
>
> or portable:
> size_t = Word.T;
> ssize_t = INTEGER;
> ptrdiff_t = INTEGER;
>
> but, my question then is, why isn't the portable version already in use?
> Especially for the signed types.
>
> I mean, you know, we have:
>
> 32bits/BasicCtypes:
>
> INTERFACE BasicCtypes;
> IMPORT Word, Long;
> TYPE (* the four signed integer types *)
>   signed_char = [-16_7f-1 .. 16_7f];
>   short_int = [-16_7fff-1 .. 16_7fff];
>   int = [-16_7fffffff-1 .. 16_7fffffff];
>   long_int = [-16_7fffffff-1 .. 16_7fffffff];
>
> question is, why aren't int and long_int INTEGER?
>
> 64bits/BasicCtypes:
>
>   long_int = [-16_7fffffffffffffff -1 .. 16_7fffffffffffffff ];
>   long_long = [-16_7fffffffffffffffL-1L .. 16_7fffffffffffffffL];
>
> why not INTEGER?
>
>
>
> +--------------------------------------------------------+
> | Peter P. Eiserloh |
> +--------------------------------------------------------+
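
The Windows versus Unix split over long discussed in this thread is commonly described with the data model names ILP32, LP64 and LLP64. Those names are not used in the messages above and are brought in here only as labels. A minimal C sketch, assuming CHAR_BIT == 8 and with hypothetical function names, that classifies a platform from its sizeof values:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Names the data model for the given sizes in 8-bit bytes.
   ILP32: int, long and pointers are 32 bits (typical 32-bit systems).
   LP64:  long and pointers are 64 bits (typical 64-bit Unix).
   LLP64: long stays 32 bits, pointers are 64 bits (Win64). */
const char *classify_data_model(size_t int_size, size_t long_size,
                                size_t pointer_size)
{
    assert(int_size > 0 && long_size > 0 && pointer_size > 0);
    if (int_size == 4 && long_size == 4 && pointer_size == 4)
        return "ILP32";
    if (int_size == 4 && long_size == 8 && pointer_size == 8)
        return "LP64";
    if (int_size == 4 && long_size == 4 && pointer_size == 8)
        return "LLP64";
    return "other";
}

/* Classifies whatever platform compiles this file. */
const char *host_data_model(void)
{
    if (CHAR_BIT != 8)
        return "other"; /* the size table above assumes 8-bit bytes */
    return classify_data_model(sizeof(int), sizeof(long), sizeof(void *));
}
```

Under LLP64 a long cannot hold a pointer, which is why a pointer-sized definition of size_t cannot be tied to long on every platform.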