[M3devel] Windows, Unicode file names

Dragiša Durić dragisha at m3w.org
Thu Jun 28 19:19:48 CEST 2012


My language (Serbian) is written with two alphabets. Before ISO-8859-2 we used ten (yes, 10) different encodings to represent our alphabet(s) in 8 bits. With ISO-8859-2 we got a solution for the Latin alphabet, but we had to use ISO-8859-5 for Cyrillic. One of our ten encodings (a national standard that came late) covered both Latin and Cyrillic in 8 bits.

Back in 1991-2 I implemented a system for handling the above-mentioned ten encodings. After that experience, and after a decade or so of using/fighting ten encodings, you can trust me - even the notion of having a single encoding for all language needs is a lifesaver :).

That is where my oversensitivity to the idea of having two ways to interpret strings comes from. Two ways, just because we can? OK, we can use two, we can use ten, we can use fifty encodings!!

But the sensible way is to use one, if possible. And it is possible! It is called UTF-8.
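
To make the point concrete, here is a minimal sketch in Python (not M3; the sample strings and the helper name can_encode are made up for illustration). It checks which of the encodings mentioned above can represent Serbian text in each alphabet:

```python
# Sample Serbian text in both alphabets (illustrative strings, not from any real system).
latin = "srpski jezik, čćžšđ"
cyrillic = "српски језик"


def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Report, per encoding, whether the Latin and the Cyrillic sample fit.
for enc in ("iso-8859-2", "iso-8859-5", "utf-8"):
    print(enc, "Latin:", can_encode(latin, enc), "Cyrillic:", can_encode(cyrillic, enc))
```

Each of the two 8-bit encodings handles only one of the alphabets, while UTF-8 handles both, even mixed in the same string.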

On Jun 28, 2012, at 2:51 PM, Hendrik Boom wrote:

> On Wed, Jun 27, 2012 at 01:14:22PM +0200, Dragiša Durić wrote:
>> 
>> On Jun 27, 2012, at 12:19 PM, Jay K wrote:
>> 
>>>> It is more and more obvious what the ideal structure would be: ARRAY OF CHAR, UTF8 encoded, using SRC M3 Text.Hash().
>>> 
>>> I don't quite agree.
>>> There are two ideal approaches.
>>> 1)
>>>  TEXT is like ARRAY OF CHAR and no values over 0xFF (or maybe even 0x7F)
>>>  "WIDETEXT" is like ARRAY OF WIDECHAR, for 16bit or 32bit WIDECHAR
>> 
>> So we can have two representations for a single thing: a variable holding some text. And the representation depends on the question "do you need non-basic-english-characters"?
> 
> I'm starting to discover that a lot of my English documents have
> non-ASCII characters in them.  In particular, the separate open and close
> quotation marks around quoted speech take more than one byte in
> Unicode.  True, in a starvation-level character set, they are both
> represented as " , but that's really not what they are.
> 
> -- hendrik
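
Hendrik's observation about quotation marks is easy to check. A small Python sketch (the sample sentence and the helper name utf8_len are assumptions made for illustration):

```python
def utf8_len(ch):
    """Number of bytes needed to store the string ch in UTF-8."""
    return len(ch.encode("utf-8"))


# ASCII straight quote versus the typographic open and close quotes.
straight = '"'
open_quote = "\u201c"   # LEFT DOUBLE QUOTATION MARK
close_quote = "\u201d"  # RIGHT DOUBLE QUOTATION MARK

for ch in (straight, open_quote, close_quote):
    print(f"U+{ord(ch):04X}: {utf8_len(ch)} byte(s) in UTF-8")

# An ordinary English sentence stops being pure ASCII once it uses them.
sentence = open_quote + "Hello," + close_quote + " she said."
print("pure ASCII:", sentence.isascii())
```

The straight quote fits in one byte, while each typographic quote needs three bytes in UTF-8, so even plain English text can fall outside a 7-bit or 8-bit TEXT.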



