[M3devel] (draft draft proposal) Unicode TEXT via BITSIZE(CHAR) = 32

Mika Nystrom mika at async.caltech.edu
Fri Dec 26 09:56:24 CET 2008


I think you may find "SET OF CHAR" hidden all over the place.

Also a lot of code may depend on the fact that CHAR and C's "char"
types match on a given platform.  This is your UNSAFE objection,
but I think it goes further than just buffers.
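
To put rough numbers on both points, here is a small sketch in C. It
assumes SET OF CHAR is stored as a plain bit vector with one bit per
possible CHAR value; that is an assumption about the implementation, not
something stated in this thread. The helper names charset_bytes and
matches_c_char are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Bytes needed for a one-bit-per-member set over a character type that is
   char_bits wide.  Assumption: the set is a flat bit vector. */
static uint64_t charset_bytes(unsigned char_bits)
{
    /* Keep the shift well defined and the result a whole number of bytes. */
    assert(char_bits >= 3 && char_bits < 64);
    /* 2^char_bits members, one bit each, 8 bits per byte. */
    return (UINT64_C(1) << char_bits) / 8;
}

/* Nonzero if a CHAR of the given width is the same size as C's char on
   the platform this is compiled for. */
static int matches_c_char(unsigned char_bits)
{
    return char_bits == CHAR_BIT * sizeof(char);
}
```

Under that assumption, charset_bytes(8) is 32 bytes while charset_bytes(32)
is 536870912 bytes (512 MiB) per set value, and matches_c_char(32) is false
on any platform where C's char is 8 bits.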

    Mika


Dragiša Durić writes:
>The basic building element of TEXT is CHAR. So, if we extend TEXT to
>Unicode (via UTF-8 as internal rep) then we must also extend CHAR so it
>can represent any single Unicode glyph – in fact CHAR becomes a 32-bit
>value instead of the current 8-bit one.
>
>If we insist on preserving BITSIZE(CHAR) = 8 (and I don't see why) then
>we are on the UNICHAR route – as proposed by Darko (IIRC). But – down
>that road we have variations in the very traditional interface Text.i3 –
>Text.GetChar, Text.SetChars, Text.FromChar, Text.FromChars must each have
>two variants – somehow. I see no elegant way to handle that if we insist
>on BITSIZE(CHAR) = 8.
>
>The UNICHAR route also contains other branches, most of them analogous
>to the current Text8/Text16 mess.
>
>UNSAFE code written with various ARRAY OF CHAR which are, in fact, byte
>buffers is one problem. Not too hard to spot and fix, though.
>
>The current TEXT/WIDETEXT was implemented because CMASS JVM needed it
>that way. If a similar need arises in the future, i.e. some runtime-level
>data communication, I think we can handle it at the “connection” level.
>Some marshalling would most probably always take place – so why not add
>TEXT I/O to the list of tasks needed?
>
>-- 
>Dragiša Durić <dragisha at m3w.org>


