[M3devel] Fwd: Re: Fw: UTF-16: Greek alphabet with CM3

Rodney M. Bates rodney_bates at lcwb.coop
Sat Nov 30 18:02:30 CET 2013


The first post of this seems to have disappeared into cyberspace.


-------- Original Message --------
Subject: Re: [M3devel] Fw: UTF-16: Greek alphabet with CM3
Date: Fri, 29 Nov 2013 13:01:01 -0600
From: Rodney M. Bates <rodney_bates at lcwb.coop>
To: m3devel at elegosoft.com, estellnb at elstel.org

I have created a CVS branch (devel_unicode) that widens the value range
of WIDECHAR enough to hold all 16_110000 Unicode code points.
I think this may be mostly orthogonal to what you need, as I am sure
the Greek letters are in the BMP (the first 16_10000 code points of
Unicode), and thus representable by the current WIDECHAR.
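
As a quick sanity check of the BMP claim, here is a minimal sketch in
Python rather than Modula-3. The helper name in_bmp is hypothetical and
used only for illustration; the code point values are the standard
Unicode ones (alpha is 945, as in the original question).

```python
# Hypothetical helper for illustration; not part of any Modula-3 library.
BMP_LIMIT = 0x10000  # 16_10000 in Modula-3 notation


def in_bmp(code_point: int) -> bool:
    """True if the code point fits in a 16-bit WIDECHAR."""
    return 0 <= code_point < BMP_LIMIT


# Lowercase Greek letters: alpha (945, U+03B1) through omega (969, U+03C9).
greek_lower = range(0x3B1, 0x3CA)
all_in_bmp = all(in_bmp(cp) for cp in greek_lower)
```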

What might be useful to you, though, in interfacing to Trestle or any
other GUI system, is that the branch contains encoders/decoders for
all of the Unicode-specified encodings (UTF-8, UTF-16-BE/LE, and
UTF-32-BE/LE), as well as what I think are the older UCS-2-BE/LE,
and Modula-3's current ISO-Latin-1 and WIDECHAR encodings.

The way these work is that WIDECHARs in memory, including within a TEXT,
are fixed-size and are pure Unicode code points. This preserves the way
the original CHAR works and, in particular, allows efficient random access
by character number, not byte number. The en/decoding is done while
reading/writing streams (Wr.T, Rd.T), which are, as originally, sequential,
and thus suited to the variable-length encodings.
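
To illustrate the distinction, here is a small sketch in Python (not
the branch's Modula-3 code): a list of integers stands in for
fixed-size code points in memory, and the standard codecs stand in for
the stream encodings. The variable names are made up for the example.

```python
text = "aβ€😀"  # characters needing 1, 2, 3, and 4 bytes in UTF-8

# In memory: one fixed-size code point per character, so indexing by
# character number is a direct lookup.
code_points = [ord(ch) for ch in text]
third = code_points[2]  # the euro sign, found without scanning

# On a stream: the encoded forms are variable-length, so finding the
# n-th character requires reading sequentially from the start.
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-be")
utf8_lengths = [len(ch.encode("utf-8")) for ch in text]
```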

Of course, you could do unusual things, for example, write with UTF-8
encoding to a pipe and decode with Ascii, thus getting the individual
characters of UTF-8 into a TEXT or ARRAY OF CHAR, where you could mess
with the encoded bytes in memory, rather than treating them as code points.
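
A rough sketch of that trick, in Python rather than Modula-3. Note the
assumption: it decodes with Latin-1, not strict ASCII, because the
bytes of a multi-byte UTF-8 sequence are all above 16_7F and a
one-char-per-byte decoding needs the full 8-bit range.

```python
text = "αβ"

# Encode to UTF-8: each Greek letter becomes two bytes.
encoded = text.encode("utf-8")

# Decode one 8-bit character per byte (Latin-1 here, by assumption), so
# the result holds the encoded bytes rather than the code points.
as_chars = encoded.decode("latin-1")
byte_values = [ord(c) for c in as_chars]

# The process is reversible as long as the bytes are left intact.
round_trip = as_chars.encode("latin-1").decode("utf-8")
```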

The en/decoders are available in one-code-point-per-call forms, as well
as in filters that are semantically almost identical to Wr and Rd, which
work on whole streams, with a constant encoding.  The latter, in front of
TextWr/TextRd would probably make it easy to interface to any GUI library
that uses in-memory UTF-8, etc.
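
The filter-in-front-of-an-in-memory-stream idea looks roughly like this
in Python, where io.TextIOWrapper plays the part of the encoding filter
and io.BytesIO the part of TextWr/TextRd. This is only an analogy; the
actual Modula-3 interfaces on the branch are not shown here.

```python
import io

# Writing: the wrapper encodes code points to UTF-8 as they pass through.
raw = io.BytesIO()
writer = io.TextIOWrapper(raw, encoding="utf-8", newline="")
writer.write("αβγ")
writer.flush()
utf8_bytes = raw.getvalue()  # in-memory UTF-8, ready to hand to a GUI library

# Reading: the wrapper decodes UTF-8 back into code points.
reader = io.TextIOWrapper(io.BytesIO(utf8_bytes), encoding="utf-8", newline="")
decoded = reader.read()
```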

It also would not be hard to make these work with 16-bit WIDECHAR, with
any input code point beyond U+FFFF converted to the standard Unicode
replacement character, U+FFFD.  It sounds like you would be unlikely to
use such values.
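
A sketch of that substitution, in Python, with a hypothetical function
name (clamp_to_16_bit); it only illustrates the rule, not how the
decoders would actually implement it.

```python
REPLACEMENT = 0xFFFD  # standard Unicode replacement character
MAX_16_BIT = 0xFFFF   # largest value a 16-bit WIDECHAR can hold


def clamp_to_16_bit(code_point: int) -> int:
    """Map any code point beyond U+FFFF to U+FFFD; pass others through."""
    return code_point if code_point <= MAX_16_BIT else REPLACEMENT


decoded = [ord(ch) for ch in "α😀β"]  # the emoji lies outside the BMP
clamped = [clamp_to_16_bit(cp) for cp in decoded]
```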

How soon do you need to start?

On 11/28/2013 12:47 AM, Olaf Wagner wrote:
> I know that there have been controversial discussions about the right
> way to do Unicode support.
>
> Is there any way to help Elmar Stellnberger with a quick solution?
>
> Olaf
>
> Begin forwarded message:
>
> Date: Wed, 27 Nov 2013 20:13:54 +0100
> From: Elmar Stellnberger <estellnb at elstel.org>
> To: m3-support at elego.de
> Subject: UTF-16: Greek alphabet with CM3
>
>
> Dear Supporters of CM3, Elegosoft
>
>     Wanting to make use of CM3's new UTF-16 feature, I have tried to use
> Greek letters via VAL(945+code,WIDECHAR), where 945 is alpha and 946 is
> beta in the Unicode table. However, I soon noticed that outputting
> WIDECHARs is not supported by Trestle, which still seems to be based on
> X11R4. Some analysis of the problem has shown that TextVBT calls
> VBT.PaintText, which uses the CHAR type internally rather than
> WIDECHAR and calls PaintSub to output text instead of the X11R6
> function XDrawString16. Is it true that you plan real Unicode support
> for CM3, or is there any way to hack the Trestle toolkit into allowing
> X11R6 functions? Either way, how could I convert current X11R6 includes
> for use with Modula-3? Nonetheless, I would really welcome true
> native UTF-16 support (as nowadays provided by the iso10646-1 X core
> fonts, including 'fixed'), because that would be the only way to make
> use of UTF-16 for Windows ports. Otherwise I will have to consider
> whether I can use Modula-3 at all for my current project, which is about
> converting the automaton simulator www.elstel.org/coan from Object Pascal
> into a currently supported language and publishing it as open source.
> Thanks a lot for your effort!
>
> Yours,
> Elmar Stellnberger
>





