[M3devel] cm3 does not support Scan.LongInt

Sun Dec 15 16:40:26 CET 2013

On 12/14/2013 04:45 AM, Elmar Stellnberger wrote:
> While converting my automaton simulator I have just discovered that there is no Scan.LongInt, though there is a Fmt.LongInt.
> Would anyone mind fixing this in the main trunk?
>
> Also I hope that there will be a ready-to-use GUI by the time I get to implementing the GUI part of the simulator.
> I had a superficial look at Modula-3 Qt but I am not yet sure how the signal concept was mapped onto Modula-3.
> How do I, for instance, listen to a 'pressed' signal on a QPushButton?
> There is no 'pressed' method or procedure variable in the QAbstractButton interface which I could override.
> Daniel, could you have a look at it? At worst the Qt port could be nonfunctional (sorry, I haven't tried that yet).
> Also I believe a full integration of Randy's Trestle port could give us an additional backup if something did not work with the Qt port.
>
> And concerning the widechar support: 16-bit characters will be just fine for Qt, since it uses UTF-16 internally and thus UTF-16 character arrays can be directly converted into a QString. So if we chose to introduce a 32-bit character type, I would give it another name, for instance UCHAR, in order not to break code that relies on the current 16-bit character width.
>
> Elmar
>

Just to be sure you understand, Modula-3 arrays of current-sized WIDECHAR
are not UTF-16 character arrays.  The former can only represent characters
whose code points are <= 16_FFFF.  UTF-16 encodes up to 16_10FFFF, by using
two 16-bit code units (a surrogate pair) for each character above 16_FFFF.
This also means that the values used for those code units, the surrogates
(16_D800..16_DFFF), cannot be used as unencoded characters.
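
To make the surrogate arithmetic concrete, here is a minimal sketch in Python
(not Modula-3, chosen only for brevity).  The function name to_utf16_units is
made up for this illustration; it follows the standard UTF-16 encoding rule
and is not part of any Modula-3 or Qt library.

```python
def to_utf16_units(cp):
    """Return the list of 16-bit code units UTF-16 uses for code point cp."""
    # Surrogate values and anything above 16_10FFFF are not encodable characters.
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not an encodable code point: %#x" % cp)
    if cp <= 0xFFFF:
        # Fits in one code unit; this is the only case a 16-bit WIDECHAR covers.
        return [cp]
    # Above 16_FFFF: split the 20-bit offset into a high and a low surrogate.
    v = cp - 0x10000
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]
```

For example, to_utf16_units(0x1F600) gives the pair [0xD83D, 0xDE00], while
to_utf16_units(0x41) gives the single unit [0x41].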

For output, you could deal with this by just avoiding putting any surrogate
values into a WIDECHAR (and, of course, no values beyond 16_FFFF, since they
won't fit).  But for input, the UTF-16 strings that any correctly implemented
library gives you could contain surrogates, and you probably can't prevent
that, because they can come all the way from a human user.

So you would have to treat the WIDECHARs as code units, not code points, and
write your own decoder.  But then you would need a type to hold the decoded
values, and WIDECHAR is not big enough, so it would have to be INTEGER or a
subrange, and now you can't use the literals without conversions.  Moreover,
you can't use TEXT, with its easy-to-use functional style implementation of
various string operations.  I suppose you could write it so it just rejects
or replaces high-valued code points at the decode stage and tell your users
they can't use these characters.
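
A rough sketch of such a decoder, again in Python rather than Modula-3 and
under my own assumptions: the input is a list of 16-bit integers standing in
for WIDECHAR code units, the output is a list of integer code points, and the
names decode_utf16_units, bmp_only and REPLACEMENT are hypothetical.  Replacing
bad or rejected values with 16_FFFD is one possible policy, not the only one.

```python
REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER

def decode_utf16_units(units, bmp_only=False):
    """Decode 16-bit code units into code points.

    With bmp_only=True, code points above 16_FFFF are replaced, so every
    result still fits in a 16-bit character type.
    """
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        i += 1
        is_high = 0xD800 <= u <= 0xDBFF
        if is_high and i < len(units) and 0xDC00 <= units[i] <= 0xDFFF:
            # Well-formed surrogate pair: combine into one code point.
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
            out.append(REPLACEMENT if bmp_only else cp)
        elif 0xD800 <= u <= 0xDFFF:
            # Unpaired surrogate: malformed input.
            out.append(REPLACEMENT)
        else:
            out.append(u)
    return out
```

Note that without bmp_only the results need more than 16 bits, which is
exactly the INTEGER-or-subrange problem described above.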

Or, you could just assume neither your application nor your users will ever
need to use codes where 16-bit WIDECHAR and UTF-16 differ, and let it be buggy,
if the assumption is ever violated.

Also, does your preferred GUI allow you to specify that the UTF-16 strings
you give it and get from it always have little-endian code units, regardless
of the native endianness of the machine?  This is the way Modula-3 WIDECHARs
work.  If not, your application would only work on little-endian machines.
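
To illustrate what is at stake, this small Python sketch lays out the same
code units in both byte orders.  The helper units_to_bytes is hypothetical and
says nothing about how any particular GUI library behaves; it only shows that
the two layouts differ byte for byte.

```python
import struct

def units_to_bytes(units, little_endian=True):
    """Pack 16-bit code units into bytes in the requested byte order."""
    # "<" selects little-endian, ">" big-endian; "H" is an unsigned 16-bit value.
    fmt = ("<" if little_endian else ">") + "%dH" % len(units)
    return struct.pack(fmt, *units)
```

For the two units [16_0041, 16_20AC], the little-endian layout is the bytes
41 00 AC 20 and the big-endian layout is 00 41 20 AC, so a consumer that
assumes the wrong order would read 16_4100 and 16_AC20 instead.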