[M3devel] 64bit INTEGERs, WIDECHAR: language specified or configuration/target dependent?
Jay K
jay.krell at cornell.edu
Thu May 28 07:52:51 CEST 2015
I believe the proposal is that INTEGER be more like int in C and C++ -- always 32 bits.
> I do have the impression that some Windows targets are not currently
All targets have a 64bit LONGINT. I implemented this years ago in the NT/x86 backend.
The C backend depends on the underlying C compiler having __int64 or long long or a 64bit long.
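As a rough illustration of that dependency, a C backend could select its 64-bit type with preprocessor checks along these lines. This is only a sketch under assumptions, not the actual cm3 code: the names sketch_int64, sketch_int64_bits and sketch_pow2 are made up here, and the _MSC_VER branch assumes Microsoft's compiler.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT, LONG_MAX, LLONG_MAX */

/* Hypothetical selection of a 64-bit signed type, in the order the
   message lists the options: __int64, long long, or a 64-bit long. */
#if defined(_MSC_VER)
typedef __int64 sketch_int64;
#elif defined(LLONG_MAX)
typedef long long sketch_int64;
#elif LONG_MAX > 2147483647L
typedef long sketch_int64;
#else
#error "no 64-bit integer type found"
#endif

/* Fail at compile time if the chosen type is not 64 bits wide. */
static_assert(sizeof(sketch_int64) * CHAR_BIT == 64,
              "sketch_int64 must be 64 bits wide");

/* Width of the chosen type in bits. */
int sketch_int64_bits(void) {
    return (int)(sizeof(sketch_int64) * CHAR_BIT);
}

/* 2 to the power n, for 0 <= n < 63; needs more than 32 bits once n >= 31. */
sketch_int64 sketch_pow2(int n) {
    return (sketch_int64)1 << n;
}
```

With a type like this available, a LONGINT can be 64 bits even where INTEGER and ADDRESS are 32 bits.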
All targets have BYTESIZE(INTEGER) == BYTESIZE(ADDRESS).
In C and C++, long is smaller than a pointer on Win64.
On Windows, int and long are always 32 bits.
This might be the case at least sort of on VMS and HP-UX, at least when the
upper bits of 64bit pointers are guaranteed all zeros or all ones or ignored.
They have such environments -- 32bit-ish pointers on Alpha, 32bit-ish pointers on IA64.
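To make the C side concrete: the closest C analogue to an INTEGER that always matches the size of ADDRESS is intptr_t, not long. A small sketch, assuming a C11 compiler that provides the optional intptr_t; the function names are invented for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* intptr_t */

/* intptr_t can hold any valid void* value. On the flat-address targets
   discussed here it is at least as wide as a pointer. */
static_assert(sizeof(intptr_t) >= sizeof(void *),
              "intptr_t is narrower than a pointer");

/* 1 if long is wide enough to hold a pointer: expected to be true on
   LP64 Unix and on 32-bit targets, false on Win64 where long stays 32 bits. */
int long_can_hold_pointer(void) {
    return sizeof(long) >= sizeof(void *);
}

/* Convert a pointer to intptr_t and back; the result should compare
   equal to the original pointer. */
int roundtrips_via_intptr(void *p) {
    intptr_t as_int = (intptr_t)p;
    return (void *)as_int == p;
}
```

Code that stores pointers in long works on most Unix systems and breaks on Win64, which is why the pointer-sized names matter.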
I started writing up more detail.
There are obviously 8 underlying integer types: unsigned/signed 8/16/32/64
A typical C99 environment has around 40 names for them:
stdint.h: 8+16+32+64 x signed/unsigned x fast+least+exact: 24
short/int/long/long long; unsigned: 8
char/unsigned char/signed char: 3
[u]intptr_t: 2
size_t/ptrdiff_t: 2
wchar_t: 1
=> 40
In C++, some of them are distinct for overloading/mangling purposes.
Windows has far more, almost double, for various historical, typographical, and stylistic reasons.
e.g. INT, UINT, LONG, ULONG, WCHAR, CHAR, UCHAR, BYTE, WORD, DWORD, DWORD_PTR, INT_PTR, UINT_PTR, SIZE_T, HALF_PTR, etc.
But still the same underlying 8 types.
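A rough way to check this in C is to compare names by width and signedness only. The macros below are a sketch and the names IS_SIGNED, SAME_UNDERLYING and common_names_agree are made up for illustration. Two names that compare equal here may still be distinct types to the compiler, as with long and long long on 64-bit Unix.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* exact-width integer types */

/* 1 if integer type T is signed: (T)-1 is negative only for signed types. */
#define IS_SIGNED(T) ((T)-1 < (T)0)

/* Two names share an underlying type, for this purpose, when both
   their width and their signedness agree. */
#define SAME_UNDERLYING(A, B) \
    (sizeof(A) == sizeof(B) && IS_SIGNED(A) == IS_SIGNED(B))

/* These hold wherever the exact-width 8-bit types exist at all. */
static_assert(SAME_UNDERLYING(int8_t, signed char),
              "int8_t and signed char differ");
static_assert(SAME_UNDERLYING(uint8_t, unsigned char),
              "uint8_t and unsigned char differ");

/* Typical but not guaranteed: int is 32 bits and long long is 64 bits
   on the mainstream targets, so a runtime answer is returned instead
   of asserting it. */
int common_names_agree(void) {
    return SAME_UNDERLYING(int, int32_t) &&
           SAME_UNDERLYING(long long, int64_t);
}
```

The same comparison could be applied to the Windows names, given the Windows headers.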
It can be difficult to choose the right type.
It is tempting to discount many of them, and many can be discounted, but it is hard
to really narrow the list down to a small one.
Should we add some more builtins to the frontend?
Like [U]INT[8,16,32,64]?
Or lowercase and _t?
- Jay
> Subject: Re: [M3devel] 64bit INTEGERs, WIDECHAR: language specified or configuration/target dependent?
> From: hosking at purdue.edu
> Date: Thu, 28 May 2015 05:03:39 +0300
> CC: rodney.m.bates at acm.org; jay.krell at cornell.edu; m3devel at elegosoft.com
> To: estellnb at elstel.org
>
> BYTESIZE(ADDRESS) = BYTESIZE(INTEGER) in cm3 on all target platforms. I don't really understand what you are proposing.
>
> Sent from my iPhone
>
> > On May 27, 2015, at 7:32 PM, Elmar Stellnberger <estellnb at elstel.org> wrote:
> >
> > Enough words about the history, now let us see how we can profit
> > from both kinds of types when we wanna step on virgin soil:
> >
> > However we turn things, there needs to be
> > a target-sized type which is usually unsigned: the pointer.
> > However there needs to be a way to do certain address
> > calculations manually, apart from array indexing:
> > multiply, add, subtract & evtl. shift.
> > I would also believe that it would be handy to have such a type
> > signed.
> > i.e. offset = address1 - address2
> >
> > Naturally such a type will profit from extending its value range
> > to the bit size of pointers.
> > Up to now converting everything to an int has sufficed. However
> > it will no more for a 64bit arch.
> > Will we need to convert to a LONGINT then? - but that will be
> > incompatible as LONGINT currently takes the 'l'-suffix and LONGINT
> > is not even supported for the 32bit arch as far as I know.
> >
> > Having a separate type for this and other purposes like optimized
> > numeric code would, I believe, be beneficial.
> > Call it OFFSET, TARGETINT, TargetInt.T or Offset.T
> > Whether to just support such a type by a Word.T like interface
> > or by a built-in type would likely be worth another discussion.
> >
> > So what for now? As I recall things we have introduced
> > a LONGINT which takes the 0l - suffix for AMD64 only.
> >
> > The first thing would be to introduce a 64bit LONGINT for x86/32bit.
> >
> > and then?
> > TYPE Offset.T = BITS BITSIZE(ADDRESS) FOR LONGINT ?
> >
> > We will have to rewrite some code that assumed offsets to be
> > integers, then.
> >
> > The other possibility we have would be to make an offset a built-in
> > type and assignment compatible to both int and longint which will
> > save us from rewriting too much old code. I would claim this not to
> > be a too big problem as converting back and forth between an
> > OFFSET and an [LONG]INT should rarely happen. It would only
> > be used in unsafe interfaces as all address arithmetics
> > i.e. we should at least make that require an explicit conversion
> > outside of unsafe interfaces. That way all expressions remained
> > 100% compatible while only having to declare certain variables
> > as OFFSET rather than INTEGER.
> >
> >
> >>
> >> Am 22.05.2015 um 19:55 schrieb Rodney M. Bates:
> >
> >>> The evolving nature of first UCS and then Unicode standards has left
> >>> many language designers knocked off balance. Critical Mass first
> >>> introduced WIDECHAR as 16-bit when that was what everybody thought
> >>> was enough. Then things changed, and it wasn't anymore. Right now,
> >>> it's a configuration parameter (must be the same for the entire link
> >>> closure) in Modula-3. I personally favor making it full Unicode
> >>> by default, in the next release, as this is where the world is now.
> >>> This is hopefully a simpler problem than INTEGER, etc., because, as of
> >>> now, the Unicode committee has emphatically assured us that the range will
> >>> *never* increase. We can hope.
> >
> > By now I welcome your decision to make the WIDECHAR 32bit!
> > I believe it should become the default for the upcoming release.
> > Pure Modula-3 code will take advantage of the new value range.
> > Just interfacing with certain external toolkits is not enough
> > justification to freeze things as they are - interfaces need to be
> > adapted anyway while supporting all three types is just too much
> > unnecessary work.
> >