[M3devel] UI libraries (also WIDECHAR/cm3 TEXT)

Dirk Muysers dmuysers at hotmail.com
Fri Apr 26 11:17:59 CEST 2013


>> Dirk, are you covering Unicode collation in your Unicode implementation? Apart from the Unicode
>> tables (your earlier implementation of these is very useful) and UTF handling/operations
>> (which I have, very complete), Unicode collation remains the biggest obstacle to full incorporation
>> of Unicode/UTF-8 into cm3.

Full Unicode collation implies normalization and many more tables (plus tools to extract these tables
from the Unicode database). The only libraries I know of that do it are the (rather monstrous)
IBM library and, to a certain extent, glib. Even Go (golang) doesn't offer it.

Ordinary comparison combined with simple case folding (which is part of my library) is a first step
in that direction. Simple case folding folds only 1:1 UC/LC pairs. The emblematic special case
is the German eszett (ß), which folds to SS, but even that case is now covered by the inclusion of a
dedicated upper-case eszett glyph in recent Unicode releases, so most European languages are now
covered. The languages that still need special processing are the Turkic ones (Turkish and Azeri).
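
To make the 1:1 restriction concrete, here is a rough Python sketch (not my library's code, and not
Modula-3). The helper names simple_fold and equal_ignoring_case are made up for illustration, and
the fold is only approximated by taking a character's lower-case mapping when that mapping is a
single code point; a real implementation would use the simple entries of the Unicode case folding
table instead.

```python
def simple_fold(text: str) -> str:
    """Approximate simple (1:1) case folding.

    Assumption: a character is folded only when its lower-case mapping is
    exactly one code point; anything else is left untouched. This is an
    approximation of the simple folding table, not a replacement for it.
    """
    out = []
    for ch in text:
        low = ch.lower()
        out.append(low if len(low) == 1 else ch)
    return "".join(out)


def equal_ignoring_case(a: str, b: str) -> bool:
    """Compare two strings after the approximate 1:1 fold."""
    return simple_fold(a) == simple_fold(b)


# The 1:1 fold cannot turn "ß" into "ss", so these two spellings differ ...
print(equal_ignoring_case("STRASSE", "Straße"))
# ... whereas the capital eszett (U+1E9E) folds 1:1 to "ß" and matches.
print(equal_ignoring_case("STRA\u1e9eE", "Straße"))
# Full case folding (Python's str.casefold) does the 1:2 expansion.
print("Straße".casefold() == "STRASSE".casefold())
# Turkic caveat: "I" folds to "i" here, while Turkish expects dotless "ı".
print(simple_fold("I"))
```

The Turkic case cannot be fixed inside such a locale-independent fold; it needs a language-specific
mapping applied before or instead of it.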

So, as long as one compares only one's own text files (which do not mix precomposed accented
characters with decomposed sequences), one gets an acceptable collation.
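
The restriction to unmixed text matters because a precomposed character and its decomposed
equivalent are different code point sequences. The following Python sketch (again only an
illustration; nfc_key and nfc_equal are hypothetical helper names) shows how plain code point
comparison treats the two forms, and how normalizing to NFC first makes them agree. It uses the
standard unicodedata module.

```python
import unicodedata

precomposed = "caf\u00e9"    # "café" with U+00E9 (e with acute)
decomposed = "cafe\u0301"    # "café" as "e" followed by U+0301 (combining acute)


def nfc_key(text: str) -> str:
    """Sort/compare key: the NFC-normalized form of the text."""
    return unicodedata.normalize("NFC", text)


def nfc_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return nfc_key(a) == nfc_key(b)


# Without normalization the two spellings are unequal ...
print(precomposed == decomposed)
# ... and they even sort on different sides of "cafz".
print(sorted([precomposed, "cafz"]), sorted([decomposed, "cafz"]))
# After NFC normalization they compare equal and sort consistently.
print(nfc_equal(precomposed, decomposed))
print(sorted([decomposed, "cafz"], key=nfc_key))
```

This only removes the mixed-form problem; the resulting order is still code point order, not a
language-aware collation.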

