[M3devel] Simple change to WIDECHAR type

Dragiša Durić dragisha at m3w.org
Sat Jun 30 20:12:45 CEST 2012


I don't see where WIDECHAR, such as it is, can be useful, especially since TEXT in cm3 is a non-flat structure and almost always needs additional processing before it can be passed even as a Windows API argument.

That additional processing of a dendriform (tree-shaped) cm3 TEXT is in no way more efficient when some of its nodes are already stored just like Windows texts.

Also, cm3 TEXT is overengineered - I hope I don't have to argue this. Everything comes second to an efficient concat operation.

IMO, we must keep TEXT simple and CHAR-based, just as you need for your VLSI tools. Then use something like UText.i3/m3 to represent Unicode (UTF-8 encoded) strings in any language, and WText.* for communicating with wchar APIs like Windows'.

BTW, WIDECHAR literals are not sufficiently defined in cm3. There is a hole the size of the Moon: what is the input encoding for source files containing WIDECHAR literals? For example:

CONST
  Me = W"Dragiša Durić";

Jay, please explain this to me. My editor creates UTF-8 files, for example. What does cm3 expect after W" ?
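
To make the ambiguity concrete, here is a small sketch in Python (not Modula-3, and purely an illustration; it says nothing about what cm3 actually does). It takes the bytes a UTF-8 editor would write for the literal in Me above and decodes them two ways: as UTF-8, and one byte per character, which is what a compiler would effectively do if it assumed an ISO-8859-1 style input. The variable names are made up for the example.

```python
# Illustration only: the same source bytes give different character
# sequences depending on the input encoding the reader assumes.
# This does not model cm3's actual behaviour.

# The name from the example, written with escapes so that this snippet
# does not itself depend on the encoding of the file it is saved in.
name = "Dragi\u0161a Duri\u0107"

# Bytes that a UTF-8 editor would write between the quotes of W"...".
source_bytes = name.encode("utf-8")

# Interpretation 1: the source file is decoded as UTF-8.
as_utf8 = source_bytes.decode("utf-8")

# Interpretation 2: every byte is taken as one character
# (equivalent to decoding the file as ISO-8859-1).
as_latin1 = source_bytes.decode("latin-1")

print(len(source_bytes))   # bytes on disk
print(len(as_utf8))        # characters under interpretation 1
print(len(as_latin1))      # characters under interpretation 2
print(as_utf8 == name, as_latin1 == name)
```

Under the second interpretation each of the two non-ASCII letters turns into two unrelated characters, so the resulting constant is longer than the name and no longer equal to it. Without a defined source encoding, both readings are defensible.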

On Jun 30, 2012, at 7:24 PM, Mika Nystrom wrote:

> 
> Dragiša Durić writes:
> ...
>> 
>> Solution:
>> ======
>> 
>> * Redefine WIDECHAR to hold at least 20 bit values, or create UNICHAR or
>> GLYPH (and leave WIDECHAR as it is for vertical compatibility) so we can
>> hold unencoded Unicode characters in scalar values in our Modula-3
>> programs, while preserving their properties.
>> * Implement properties, relations and methods defined for Unicode. With
>> ASCII, numeric order is everything. With Unicode - it is not. This is
>> probably very big project but we can start somewhere, and let interested
>> parties build on it. Dirk Muysers did work in this regard already.
>> * Whoever thinks we don't need this and our "tradition" and "legacy" are
>> important, please read this:
>> http://unicode.org/standard/WhatIsUnicode.html .
>> 
>> dd
> 
> Given what you have said about the near-uselessness of WIDECHAR, does anything
> actually use it much?  What breaks if it is redefined to be the same as, say,
> INTEGER?  (Or Word.T)
> 
> CHAR is quite useful for processing 7-bit ASCII, and it would be lovely if
> that could go back to using the SRC data structures.  For people who do stuff
> like write VLSI design tools... (probably many other large-scale applications
> would like it too).
> 
>   Mika
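
A side note on the "at least 20 bit" figure in the quoted proposal: the highest Unicode code point is U+10FFFF, and a quick check of how wide a scalar has to be to hold it can be sketched as follows. This is Python used only as a calculator, not a statement about any cm3 type, and the names are made up for the example.

```python
# Illustration only: how many bits a scalar needs in order to hold
# any unencoded Unicode code point.

MAX_CODE_POINT = 0x10FFFF  # highest code point defined by Unicode

# Minimum number of bits needed to represent MAX_CODE_POINT.
bits_needed = MAX_CODE_POINT.bit_length()

# Does the full range fit in an unsigned scalar of the given width?
fits_in_16 = MAX_CODE_POINT < (1 << 16)
fits_in_20 = MAX_CODE_POINT < (1 << 20)
fits_in_21 = MAX_CODE_POINT < (1 << 21)

print(bits_needed, fits_in_16, fits_in_20, fits_in_21)
```

So 20 bits fall just short of the full range and 21 bits are the practical minimum, which is presumably what "at least" would have to cover.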



