[M3devel] Windows, Unicode file names

Jay K jay.krell at cornell.edu
Mon Jun 25 23:30:08 CEST 2012


 > Why would you narrow it to 16bit? You need to convert to UTF-16 and make it ready for Windows API calls?

Yes.

 > WinNLS does that.

I doubt that. There is a 32bit to 16bit conversion?
Ok, I guess there is. "Surrogate pairs" and all that?
Maybe not in WinNLS, but easy enough for us to write, in portable C or Modula-3. :)
Part of Text.i3 perhaps.

So then, I guess I can sign up for WIDECHAR being 32bits across the board.

 - Jay

Subject: Re: [M3devel] Windows, Unicode file names
From: dragisha at m3w.org
Date: Mon, 25 Jun 2012 23:09:37 +0200
CC: dabenavidesd at yahoo.es; m3devel at elegosoft.com
To: jay.krell at cornell.edu


On Jun 25, 2012, at 10:17 PM, Jay K wrote:

I don't care if WIDECHAR is 16 bits or 32 bits, as long as I can convert from TEXT to a flat array of either, and if 32 bits, walk the array, checking for > 0xFFFF, throw an exception or return some error if any found, narrow to 16 bits, call some "W" function, free the flat array.
The size can, I guess, vary between Win32 and non-Win32 platforms.
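
The check-then-narrow step described above could look roughly like this in portable C. This is only a sketch: the name narrow32to16 and its error convention are made up for illustration, and it assumes the TEXT has already been flattened into an array of 32-bit values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of m3core): copy n 32-bit values into a
   16-bit buffer. Returns 0 on success, or -1 as soon as a value above
   0xFFFF is found. out must have room for n elements. */
int narrow32to16(const uint32_t *in, size_t n, uint16_t *out)
{
    for (size_t i = 0; i < n; i++) {
        if (in[i] > 0xFFFF)
            return -1;              /* does not fit in 16 bits: report an error */
        out[i] = (uint16_t)in[i];
    }
    return 0;
}
```

A caller would pass the buffer on to a "W" function only when the result is 0. Note that this rejects every character outside the Basic Multilingual Plane instead of encoding it.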

a) If you like to make it as unportable as possible then yes - 16 or 32 is not important.
b) invalid value would be over 0xFFFFF, not 0xFFFF
c) Why would you narrow it to 16bit? You need to convert to UTF-16 and make it ready for Windows API calls? WinNLS does that. Simple narrowing (similar to what is commented in Text.i3) to 16 bits and recoding from UTF-32 to UTF-16 are very different things.
d) Size varies, yes.
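
To illustrate point c), here is a rough sketch of recoding UTF-32 to UTF-16 in portable C, as opposed to simple narrowing. The function name utf32_to_utf16 and its return convention are assumptions made for this example. It uses U+10FFFF as the upper bound, which is the limit defined by the Unicode standard, and it also rejects the surrogate range, which is not valid in UTF-32.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode n code points as UTF-16.
   Returns the number of 16-bit units required, or (size_t)-1 if a code
   point is invalid. When out is NULL nothing is written, so a caller can
   ask for the size first and allocate afterwards. */
size_t utf32_to_utf16(const uint32_t *in, size_t n, uint16_t *out)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t c = in[i];
        if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
            return (size_t)-1;      /* outside Unicode, or a lone surrogate */
        if (c < 0x10000) {
            if (out) out[k] = (uint16_t)c;
            k += 1;                 /* BMP: one unit, same as narrowing */
        } else {
            c -= 0x10000;           /* 20 bits left, split 10/10 */
            if (out) {
                out[k]     = (uint16_t)(0xD800 | (c >> 10));    /* high surrogate */
                out[k + 1] = (uint16_t)(0xDC00 | (c & 0x3FF));  /* low surrogate */
            }
            k += 2;
        }
    }
    return k;
}
```

For values up to 0xFFFF the result is identical to narrowing; the two approaches differ only for characters that need a surrogate pair.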
Its size should be stored in a global to communicate between Modula-3 and C.
 
 
I'd also quite like it if TEXT was internally represented as a nul terminated flat array of 8 and/or 16 and/or 32bit quantities, materializing some of them on demand. But I suspect that flat and readonly and exposing a concat operation are in conflict. I'm not sure. MFC uses a flat reference counted nul terminated representation and it works pretty well. It doesn't materialize-on-demand other widths.
 
 - Jay 
Subject: Re: [M3devel] Windows, Unicode file names
From: dragisha at m3w.org
Date: Mon, 25 Jun 2012 21:48:09 +0200
CC: dabenavidesd at yahoo.es; m3devel at elegosoft.com
To: jay.krell at cornell.edu

It can be what cm3 people had in mind when they created WIDECHAR as a catchall for Unicode.
At first glance it looked like no solution to me, but after counting to ten - I think it is. We can have an UTF-8 layer and use it when and where needed, to recode our strings to catchall WIDECHAR/WIDETEXT.
As long as we agree on what exactly WIDECHAR is :)

=== From Wikipedia
The Microsoft Windows application programming interfaces Win32 and Win64, as well as the Java and .Net Framework platforms, require that wide character variables be defined as 16-bit values, and that characters be encoded using UTF-16 (due to former use of UCS-2), while modern Unix-like systems generally require 32-bit values encoded using UTF-32[citation needed].
===
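
A UTF-8 layer of the kind suggested above might, if WIDECHAR ends up holding 32-bit code points, decode along these lines. This is a sketch in portable C under that assumption; the name utf8_to_utf32 and its interface are invented for illustration and are not an existing m3core API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a NUL-terminated UTF-8 string into 32-bit
   code points. Returns the number of code points, or (size_t)-1 on
   malformed input. out may be NULL to only count. */
size_t utf8_to_utf32(const char *s, uint32_t *out)
{
    /* smallest code point that legitimately needs 1, 2, 3 or 4 bytes */
    static const uint32_t min_for[4] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *)s;
    size_t k = 0;

    while (*p) {
        uint32_t c;
        int extra;                  /* number of continuation bytes expected */
        if (*p < 0x80)                { c = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { c = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { c = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { c = *p & 0x07; extra = 3; }
        else return (size_t)-1;     /* stray continuation or invalid lead byte */
        p++;
        for (int i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)
                return (size_t)-1;  /* truncated sequence (also stops at NUL) */
            c = (c << 6) | (*p & 0x3F);
        }
        /* reject overlong forms, surrogates and values beyond Unicode */
        if (c < min_for[extra] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
            return (size_t)-1;
        if (out) out[k] = c;
        k++;
    }
    return k;
}
```

Calling it once with out set to NULL gives the array size to allocate; a second call fills the array.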

On Jun 25, 2012, at 9:39 PM, Jay K wrote:

I think I know what to do here and will look into it..later..
 
We have TEXT. We should just always get WIDECHARs out of it and call CreateFileW.
Assuming UTF8 is the wrong solution at this level, and passing in UTF8 won't work with the correct solution.
A layer above this needs to decode UTF8, if that is the encoding.
 
Unless someone has declared and implemented that TEXT is in fact always UTF8-encoded, which I doubt.
 
 - Jay 
From: dragisha at m3w.org
Date: Mon, 25 Jun 2012 21:05:59 +0200
To: dabenavidesd at yahoo.es
CC: m3devel at elegosoft.com
Subject: Re: [M3devel] Windows, Unicode file names

If you cared enough to check FSWin32.m3, answer would be obvious :).
Whatever I do with pathname before I call FS.OpenFile(Readonly)? - FSWin32.m3 will call CreateFileA. My solution is:

PROCEDURE OpenFileReadonly(p: Pathname.T): File.T RAISES {OSError.E}=
  VAR
    handle: WinNT.HANDLE;
    fname := M3toC.SharedTtoS(p);
    dwNum := WinNLS.MultiByteToWideChar (WinNLS.CP_UTF8, 0, fname, -1, NIL, 0);
    pwText: WinBaseTypes.PCWSTR;
  BEGIN
    IF dwNum = 0 OR dwNum = Text.Length(p) + 1 THEN
      (* dwNum includes terminating null character. that's +1 above. *)
      handle := WinBase.CreateFile(
                    lpFileName := fname,
                    dwDesiredAccess := WinNT.GENERIC_READ,
                    dwShareMode := WinNT.FILE_SHARE_READ,
                    lpSecurityAttributes := NIL,
                    dwCreationDisposition := WinBase.OPEN_EXISTING,
                    dwFlagsAndAttributes := 0,
                    hTemplateFile := NIL);
    ELSE
      pwText := LOOPHOLE(NEW(UNTRACED REF ARRAY OF CHAR, dwNum*2), WinBaseTypes.PCWSTR);
      EVAL WinNLS.MultiByteToWideChar (WinNLS.CP_UTF8, 0, fname, -1, pwText, dwNum);
      handle := WinBase.CreateFileW(
                    lpFileName := pwText,
                    dwDesiredAccess := WinNT.GENERIC_READ,
                    dwShareMode := WinNT.FILE_SHARE_READ,
                    lpSecurityAttributes := NIL,
                    dwCreationDisposition := WinBase.OPEN_EXISTING,
                    dwFlagsAndAttributes := 0,
                    hTemplateFile := NIL);
      DISPOSE(pwText);
    END;

    IF LOOPHOLE(handle, INTEGER) = WinBase.INVALID_HANDLE_VALUE THEN
      Fail(p, fname);
    END;
    M3toC.FreeSharedS(p, fname);
    RETURN FileWin32.New(handle, FileWin32.Read)
  END OpenFileReadonly;

And similar in OpenFile. Not nice :).
Also, I've added CP_UTF8 constant to WinNLS.i3.
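
The first MultiByteToWideChar call above only asks how many 16-bit units the name needs (including the terminator, because the length argument is -1). The comparison against Text.Length(p) + 1 then works as an "is it plain ASCII?" test, assuming Text.Length counts bytes here: for valid UTF-8 the two counts agree only when every character is a single byte. A portable C sketch of the same count, excluding the terminator, with a made-up name and assuming well-formed input:

```c
#include <stddef.h>

/* Hypothetical helper: number of UTF-16 units needed for a NUL-terminated
   UTF-8 string, not counting the terminator. Assumes the input is valid
   UTF-8; it does no validation. */
size_t utf16_units_for_utf8(const char *s)
{
    size_t units = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p; p++) {
        if ((*p & 0xC0) == 0x80)
            continue;                   /* continuation byte, already counted */
        units += (*p >= 0xF0) ? 2 : 1;  /* 4-byte sequence needs a surrogate pair */
    }
    return units;
}
```

When the result equals the byte length, the name is pure ASCII and the "A" function is sufficient; otherwise the name has to be converted and passed to the "W" function.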

On Jun 25, 2012, at 9:01 PM, Daniel Alejandro Benavides D. wrote:

Hi all:
So do you need a double-byte character string module, like what the TEXT types currently provide? But you can do that already, couldn't you?
Thanks in advance

--- On Mon, 25/6/12, Dragiša Durić <dragisha at m3w.org> wrote:

From: Dragiša Durić <dragisha at m3w.org>
Subject: Re: [M3devel] Windows, Unicode file names
To: "Daniel Alejandro Benavides D." <dabenavidesd at yahoo.es>
CC: "m3devel" <m3devel at elegosoft.com>
Date: Monday, June 25, 2012 13:20

Yes, they exposed parts of NLS. That's how problem can be, albeit partially, solved. By using methods exposed there.
What we don't have is how to communicate actual encoding of string to FS module so FS methods can handle filenames accordingly.

On Jun 25, 2012, at 8:06 PM, Daniel Alejandro Benavides D. wrote:

Hi all:
OK, good: the Win32 API handles conversion between ASCII and other encodings through its NLS (National Language Support) API.
But it appears not to have been used in the DEC-SRC WinNT port of Modula-3 (it is in CM3, though it isn't compiled on the elego servers, only here):
http://www.cs.purdue.edu/homes/hosking/m3/help/gen_html/m3core/src/win32/WinNLS.i3.html

Thanks in advance

--- On Mon, 25/6/12, Dragiša Durić <dragisha at m3w.org> wrote:

From: Dragiša Durić <dragisha at m3w.org>
Subject: Re: [M3devel] Windows, Unicode file names
To: "Daniel Alejandro Benavides D." <dabenavidesd at yahoo.es>
CC: "m3devel" <m3devel at elegosoft.com>
Date: Monday, June 25, 2012 12:36

Daniel,
I can talk about many things, and most things Modula-3 are of interest to me. Once you start a topic, and I can understand what is it about, and it meets my interests - I'll be there.
Problem I met with filenames is nothing old. Windows can open files with filenames in ASCII and UTF-16. Everything else - you must check twice and do a workaround.
I've written here in the hope that I can get into some fruitful discussion with people who understand this problem. My solution is a workaround and assumes the filename is UTF-8 or ASCII. I would like to start a discussion on this and work from there to a more general solution.
dd

On Jun 25, 2012, at 7:27 PM, Daniel Alejandro Benavides D. wrote:

Hi all:
As I understood it, you didn't want to talk about a Windows 95 / NT compatible distribution of Modula-3.
But in turn you want to keep compatibility with older file name encodings.
I don't care about that, but if it's useful anyway (newer Windows versions don't care about it either), I don't know what your problem was, because it won't be solvable!
Thanks in advance
 		 	   		  