[M3devel] FW: proposal/insistence for fixed size integer types in Ctypes.i3

Jay jayk123 at hotmail.com
Mon Jun 2 12:13:41 CEST 2008


They are not from the C standard, nor are they necessarily from a particular platform.
Some platforms define some of them.
But all platforms could define all of them, at least 8/16/32, and could define them identically.
I want to add these somewhere in m3core.
They are portable enough.
 
 - Jay


CC: m3devel at elegosoft.com
From: hosking at cs.purdue.edu
To: jayk123 at hotmail.com
Subject: Re: [M3devel] FW: proposal/insistence for fixed size integer types in Ctypes.i3
Date: Mon, 2 Jun 2008 10:23:44 +0100


Are these types defined by the C standard?  If not, then they don't belong in Ctypes.  If they are defined only by their particular platform, then they do belong in Utypes.

On Jun 1, 2008, at 3:35 AM, Jay wrote:

So much for trying plain text to avoid truncation, darnit.

> From: jayk123 at hotmail.com
> To: m3devel at elegosoft.com
> Subject: proposal/insistence for fixed size integer types in Ctypes.i3
> Date: Sun, 1 Jun 2008 02:32:27 +0000
>
> Currently the various Utypes.i3 introduce various types LIKE
>
> uint8_t = unsigned_char;
> uint16_t = unsigned_short;
> uint32_t = unsigned_int;
> uint64_t = unsigned_long_long;
>
> int8_t = signed_char;
> int16_t = short;
> int32_t = int;
> int64_t = long_long;
>
> sometimes there is an underscore after the u.
>
> There is quite some variation in which, if any, of these types are provided.
> When they are provided, they are always the same, with one exception I will detail.
>
> Arguably they are provided only for defining other types and function signatures
> within m3-libs/m3core/src/unix.
>
> I strongly strongly strongly propose that at least the above 8 types go in
> Ctypes, and the definitions in Utypes be removed.
>
> If there was more commonality in Utypes, I'd "forward" them for compatibility,
> but there is little commonality. Code depending on these types would have to
> be forked a lot. As I said, the types are always the same, if they are defined,
> but they are often not defined.
>
> One variation I am open to is introducing a new .i3 file.
> But in general I like to colocate stuff rather than pick apart everything
> and decide an ideal location. There are tradeoffs either way,
> though most people only see the tradeoffs in the way I do it.
> The tradeoffs the other way are having to track down module after module,
> interface after interface, where to get stuff from, rather than having
> a "one stop shop", or "fewer shops to stop".
>
> I am also willing to have u_* types and CAPITALIZED types:
>
> uint8_t = unsigned_char;
> uint16_t = unsigned_short;
> uint32_t = unsigned_int;
> uint64_t = unsigned_long_long;
>
> int8_t = signed_char;
> int16_t = short;
> int32_t = int;
> int64_t = long_long;
>
> u_int8_t = uint8_t;
> u_int16_t = uint16_t;
> u_int32_t = uint32_t;
> u_int64_t = uint64_t;
>
> UINT8 = uint8_t;
> UINT16 = uint16_t;
> UINT32 = uint32_t;
> UINT64 = uint64_t;
>
> INT8 = int8_t;
> INT16 = int16_t;
> INT32 = int32_t;
> INT64 = int64_t;
>
> All built-in Modula-3 types are capitalized, as all Modula-3 keywords are.
> And capitalized types is a style widely used in the Windows headers.
> (Windows and Modula-3 share a common heritage -- Digital -- though I don't know
> from where the style of capitalized types originates.)
>
> The names "int8", "int16" are also obvious candidates, but I feel that some
> amount of typographical convention should be used to demarcate types.
> Some amount of "Hungarian", if you will.
> Obviously there are vehement opposing opinions on this.
> "Hungarian" is often too precise and precludes changing types without
> changing names, as well as producing unpronounceable names.
> A "weak" form however seems reasonable and useful.
>
> These types represent a certain point of view.
> It is a common point of view, but not universal.
>
> There are roughly three or four perspectives here:
>
> 1)
> char, short, int, long are abstractly defined and all code should live with it.
> char is at least 8 bits, and of unspecified signedness
> (limits.h defines CHAR_BIT, the number of bits in char;
> for specified signedness, use signed char or unsigned char;
> I think char has actually three options for its signedness --
> signed, unsigned, or "half unsigned")
> short is at least 16 bits, signed
> int is at least 16 bits, signed
> long is at least 32 bits, signed
>
> There are not necessarily integral types that can hold pointers.
> size_t and ptrdiff_t perhaps, but unclear.
> size_t can hold the size of anything, but I think "anything" is "any variable"
> and not necessarily "the entire address space".
>
> ptrdiff_t can hold the result of subtracting pointers, but it is only
> valid to subtract pointers that point into the same array or just past it.
>
> It is common, for example, but not universal, for the "address space"
> to be divided between "user mode" and "kernel mode", often with a 50/50 split,
> so therefore size_t could be one bit smaller than a pointer, at least.
> Of course that's an "unnatural" size, but theoretically possible.
> (This kernel/user 50/50 split is usually exactly how 32 bit and I assume
> 64 bit Windows works, though 32 bit Windows can also have a 3 gig / 1 gig split,
> and 32 bit Windows code running on 64 bit Windows kernel can get a
> full 4 gig address space.)
>
> As well, the representation of signed integers is left unspecified.
> The range of "int" need only go down to -32767, not necessarily -32768.
> Signed magnitude and one's complement are valid representations.
> Overflowing a signed integer causes undefined behavior.
> Unsigned numbers do not have this abstraction.
>
> While this is the "most correct" view, according to (my understanding of) the C standard,
> implementations do nail down details way beyond this, and a lot of
> code depends on these details.
>
> While I may have some of those details slightly wrong, you get the point.
> You CAN write code within this interface, but a lot of code violates it, sometimes
> by accident, sometimes for important practical reasons.
> Some amount of code assumes an int is at least or exactly 32 bits.
> Some amount of code assumes int or long can hold a pointer, though
> int probably not so much, and long probably in rapidly decreasing
> proportion due to Win64.
>
> 2)
> char, short, int, long are somewhat abstractly defined
> char is exactly 8 bits
> varying perspectives on its presumed signedness
> short is exactly 16 bits
> int is exactly 32 bits
> long there are a few perspectives on; it is exactly 32 bits ("Windows"), or
> it is exactly the size of a pointer ("Unix"), or it is at least
> the size of a pointer
>
> As well, two's complement is the only representation of signed numbers
> in use, and code depends on this.
>
> (I recently read that we can thank the IBM S/360 or such, in the 1960's,
> for introducing such modern-day architectural features that everyone
> takes for granted as an 8 bit byte and two's complement signed numbers.)
>
> If you need an integer with a particular exact size, either use char/short/int directly,
> or run them through "autoconf", or sniff "limits.h".
>
> 3) This is my recently acquired perspective, but it isn't new.
>
> Given that #1 is "correct but rare", and that #2 is
> full of "exact":
>
> char, short, int, long are funny names with not particularly
> useful specifications. #2 is a little sleazy (less so if autoconfed/limits.h).
> Unless you are really adhering to the strict spec, don't use them.
> If you are in fact indexing a "small" array, they might suffice,
> but is it worth it? worth having these types?
>
> Theory: 16 bit machines are irrelevant and 32 bit integers
> are perfectly efficient on 64 bit machines, and 64 bit integers
> are universally available (?) and reasonably efficient (?),
> so feel free to use them if there is a need.
>
> As well, 4gig remains a large capacity in most contexts, so feel
> free to use explicitly 32 bit integers.
>
> However file sizes and offsets should really always be 64 bits.
> Any code still requiring 32 bit file offsets/sizes is unfortunate.
> That includes PE32+ imho, the file format for .exes/.dlls on Win64.
>
> Be clear and unsleazy and adopt new names that represent well
> their specification and actual use.
>
> int<n>_t is exactly n bits in size and signed
> uint<n>_t is exactly n bits in size and unsigned
> some names are chosen for unsigned and signed integers with
> the exact size of a pointer
> For n=8,16,32 all four types exist, and probably 64.
> And pointer-sized types exist.
>
> If you really feel your capacity limits should scale with address space size, or need
> to store a pointer in an integer, use size_t or uintptr_t or intptr_t, etc.
>
> Modula-3's position here adds that INTEGER is the exact
> size of a pointer and signed. It is identical to ptrdiff_t
> or intptr_t. CARDINAL is the same size but omits the bottom "half"
> of the range, and does not, I believe, extend the top "half".
>
> Now, I also realize, that m3-libs/m3core/src/unix is a fairly mechanical
> translation of /usr/include, and /usr/include does not necessarily
> take perspective #3. So the "funny" names are useful for a human
> mechanical translation. But the precise names can still be used instead.
>
> Here is an exception I said I would detail:
>
> irix-5.2/utypes.i3:
> int64_t = RECORD val := ARRAY[0..1] OF int32_t {0,0}; END;
> uint64_t = int64_t;
>
> This is different in at least two ways that I see.
> - default initialization to zero
> - 32 bit alignment instead of 64 bit alignment
>
> I tend to assume that the alignment is actually wrong,
> however all the uses in Usignal appear unaffected, as they are always preceded
> by a mix of int64_t and an even number of int32.
> Either way, it is easy enough to preserve this for compatibility.
>
> I would like to continue, where easy and clear, to reduce the "size" of m3-libs/m3core/src/unix.
> Making these types portably available helps that.
> For example -- Uin.m3 need not be duplicated at all.
> But then it either must use the presently more portable unsigned_short and unsigned,
> or uint16_t and uint32_t should be made always available, either by adding them
> to all the various Utypes.i3, or the one Ctypes.i3, or a new place.
>
> Darwin currently has four Upthread.i3 files (one is dead), but needs either only two, or one
> with the sizes abstracted out. I don't know if PPC64_DARWIN will need its own yet,
> I don't have one of these machines yet.
>
> I would like to go ahead with this stuff *today*.
> It takes some exertion of patience for me to stop and send this first. :)
>
> - Jay
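
For illustration only, the assumption behind perspectives #2 and #3 (that unsigned char / short / int / long long are exactly 8 / 16 / 32 / 64 bits, so the proposed Ctypes aliases mean the same thing everywhere) can be checked on the C side. This is a minimal sketch, assuming a C11 compiler and a platform that provides the optional exact-width types in <stdint.h>. The names BITS_OF and mapping_matches_stdint are made up for this example and are not part of m3core or of the proposal.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: storage width of a type in bits. */
#define BITS_OF(T) (sizeof(T) * CHAR_BIT)

/* Perspective #2: these hold on mainstream platforms, but the C standard
   alone does not promise them. A platform where one fails stops compiling
   here instead of silently mis-sizing a record. */
_Static_assert(CHAR_BIT == 8, "char is not 8 bits");
_Static_assert(BITS_OF(unsigned short) == 16, "short is not 16 bits");
_Static_assert(BITS_OF(unsigned int) == 32, "int is not 32 bits");
_Static_assert(BITS_OF(unsigned long long) == 64, "long long is not 64 bits");

/* Returns 1 if the mapping listed in the message above
   (uint8_t = unsigned char, ..., int64_t = long long) agrees in width and
   signedness with this platform's <stdint.h>, and 0 otherwise. */
int mapping_matches_stdint(void)
{
    return BITS_OF(unsigned char) == BITS_OF(uint8_t)
        && BITS_OF(unsigned short) == BITS_OF(uint16_t)
        && BITS_OF(unsigned int) == BITS_OF(uint32_t)
        && BITS_OF(unsigned long long) == BITS_OF(uint64_t)
        && BITS_OF(signed char) == BITS_OF(int8_t)
        && BITS_OF(short) == BITS_OF(int16_t)
        && BITS_OF(int) == BITS_OF(int32_t)
        && BITS_OF(long long) == BITS_OF(int64_t)
        && (uint8_t)-1 > 0    /* unsigned: -1 wraps to the maximum value */
        && (int8_t)-1 < 0;    /* signed: -1 stays negative */
}
```

The compile-time assertions cover the "exactly n bits" claims; the function only compares the plain C types against the <stdint.h> names, so it says nothing about long, which is the one type the message notes has no agreed size.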