[M3devel] cm3 does not support Scan.LongInt

Jay K jay.krell at cornell.edu
Mon Dec 16 10:31:57 CET 2013


 > like f.i. UCHAR in order not to break code that does rely on the current 16bit character width.
 
 
It might as well be UINT32 or UInt32?
 
 
In C++, std::string is std::basic_string<char> and std::wstring is std::basic_string<wchar_t>.
Could/should we use generics similarly?
 
 
CharText, WIDECHAR=UINT16, WideCharText, UInt32Text?
 
 
There is the problem of text literals.
The current "text" can change between "char" and "widechar".
 
 
Can "widechar" vary per-target?
In particular, I think C/C++ wchar_t is 32 bits on some platforms, so it might be reasonable for Modula-3 WIDECHAR to match it?
(I just checked -- Linux/amd64 does have a 32-bit wchar_t.)
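
For anyone who wants to repeat that check on another target, here is a minimal sketch in Python. It only reports what the local C library uses for wchar_t, as seen through ctypes; the helper name wchar_bits is made up for illustration and says nothing about what cm3 does.

```python
import ctypes

def wchar_bits():
    """Width in bits of the platform's C wchar_t, as reported by ctypes."""
    # ctypes.c_wchar maps to the C wchar_t type of the running platform.
    return ctypes.sizeof(ctypes.c_wchar) * 8

if __name__ == "__main__":
    print("wchar_t is", wchar_bits(), "bits on this platform")
```

If WIDECHAR were to track wchar_t per target, this is the value that would differ between platforms.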
 
 
It is a thorny issue though; there are pluses and minuses either way.
An alternative would be to have WIDECHAR be the same for all targets.

 
 - Jay
 
> Date: Sun, 15 Dec 2013 09:40:26 -0600
> From: rodney_bates at lcwb.coop
> To: m3devel at elegosoft.com
> Subject: Re: [M3devel] cm3 does not support Scan.LongInt
> 
> 
> 
> On 12/14/2013 04:45 AM, Elmar Stellnberger wrote:
> > Converting my automaton simulator I have just discovered that there is no Scan.LongInt though there is a Fmt.LongInt.
> > Would anyone mind fixing this in the main trunk?
> >
> > Also I hope that there will be a ready-to-use GUI by the time I get to implementing the GUI part of the simulator.
> > I had a superficial look at Modula-3 Qt but I am not yet sure how the signal concept was mapped onto Modula-3.
> > How do I, e.g., listen to a 'pressed' signal on a QPushButton?
> > There is no 'pressed' method or procedure variable in the QAbstractButton interface which I could override.
> > Daniel, could you have a look at it? At worst the Qt port could be nonfunctional (sorry, I haven't tried it yet).
> > Also I believe a full integration of Randy's Trestle port could give us an additional backup if something did not work with the Qt port.
> >
> > And concerning the widechar support: 16-bit characters will be just fine for Qt, as it internally uses UTF-16 and thus UTF-16 character arrays can be converted directly into a QString. So if we chose to introduce a 32-bit character type, I would give it another name like f.i. UCHAR in order not to break code that does rely on the current 16bit character width.
> >
> > Elmar
> >
> 
> Just to be sure you understand, Modula-3 arrays of current-sized WIDECHAR
> are not UTF-16 character arrays.  The former can only represent characters
> whose code points are <= 16_FFFF.  UTF-16 encodes up to 16_10FFFF, by using
> two 16-bit code units for one character in the upper part of the range.
> This also means that the codes in the two code units are surrogates
> (16_D800..16_DFFF) and cannot be used as unencoded characters.
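
The surrogate-pair arithmetic described in the paragraph above can be sketched as follows. This is an illustration in Python, not Modula-3 library code; the function name to_utf16_units is hypothetical, and hex literals such as 0xD800 correspond to 16_D800 in Modula-3 notation.

```python
def to_utf16_units(cp):
    """Encode one Unicode code point as a list of 16-bit UTF-16 code units."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point out of Unicode range")
    if 0xD800 <= cp <= 0xDFFF:
        # Surrogate values are reserved for encoding and are not characters.
        raise ValueError("surrogate value is not a valid character")
    if cp <= 0xFFFF:
        # Fits in one code unit; this is all a 16-bit WIDECHAR can hold.
        return [cp]
    # Above 0xFFFF: split the 20-bit offset into a high and a low surrogate.
    v = cp - 0x10000
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]
```

For example, to_utf16_units(0x1F600) gives the two code units 0xD83D and 0xDE00, neither of which is a character on its own.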
> 
> For output, you could deal with this by just avoiding putting any surrogate
> values into a WIDECHAR (and, of course, no values beyond 16_FFFF, since they
> won't fit).  But for input, any correctly implemented library that gives you
> UTF-16 strings could hand you strings containing these, and you probably can't
> prevent that, because they can come all the way from a human user.
> 
> So you would have to treat the WIDECHARs as code units, not code points, and
> write your own decoder.  But then you would need a type to hold the decoded
> values, and WIDECHAR is not big enough, so it would have to be INTEGER or a
> subrange, and now you can't use the literals without conversions.  Moreover,
> you can't use TEXT, with its easy-to-use functional style implementation of
> various string operations.  I suppose you could write it so it just rejects
> or replaces high-valued code points at the decode stage and tell your users
> they can't use these characters.
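
A hand-written decoder of the kind described above might look roughly like this. It is a sketch in Python under stated assumptions: the input is a sequence of 16-bit code units, the output is a list of plain integers (the role INTEGER would play in Modula-3), and the names decode_utf16_units, bmp_only and replacement are invented for illustration. Replacing unpaired surrogates with U+FFFD is one common policy, not the only possible one.

```python
def decode_utf16_units(units, bmp_only=False, replacement=0xFFFD):
    """Decode 16-bit UTF-16 code units into a list of integer code points.

    With bmp_only=True, code points above 0xFFFF are replaced, which models
    the "reject or replace high-valued code points" option.
    """
    out = []
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        has_low = i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Valid surrogate pair: recombine into one code point.
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            out.append(replacement if bmp_only else cp)
            i += 2
        elif 0xD800 <= u <= 0xDFFF:
            # Unpaired surrogate: malformed input, substitute.
            out.append(replacement)
            i += 1
        else:
            # Ordinary code unit: the value is the code point.
            out.append(u)
            i += 1
    return out
```

Note that the result no longer fits a 16-bit WIDECHAR array unless bmp_only is set, which is exactly the trade-off the paragraph describes.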
> 
> Or, you could just assume neither your application nor your users will ever
> need to use codes where 16-bit WIDECHAR and UTF-16 differ, and let it be buggy,
> if the assumption is ever violated.
> 
> Also, does your preferred GUI allow you to specify that the UTF-16 strings
> you give it and get from it always have little endian code units, regardless
> of the native endianness of the machine?  This is the way Modula-3 WIDECHARs
> work.  If not, your application would only work on little-endian machines.
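
To illustrate the endianness concern: the same bytes yield different 16-bit code units depending on the byte order assumed when reading them. A minimal Python sketch; units_from_bytes is a hypothetical helper and does not correspond to any Modula-3 or GUI toolkit API.

```python
import struct

def units_from_bytes(data, byteorder):
    """Interpret raw bytes as 16-bit code units in the given byte order.

    byteorder is "little" or "big".
    """
    if byteorder not in ("little", "big"):
        raise ValueError("byteorder must be 'little' or 'big'")
    if len(data) % 2 != 0:
        raise ValueError("UTF-16 data must have an even number of bytes")
    prefix = "<" if byteorder == "little" else ">"
    # "H" is an unsigned 16-bit integer; one per code unit.
    return list(struct.unpack(prefix + "H" * (len(data) // 2), data))
```

The two bytes 0x41 0x00 read as the code unit 0x0041 ('A') in little-endian order but as 0x4100 in big-endian order, so both sides have to agree on the order.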
> 
 		 	   		  