<html>
<head>
<style><!--
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 10pt;
font-family:Tahoma
}
--></style></head>
<body class='hmmessage'><div dir='ltr'>
&gt; Why would you narrow it to 16bit? You need to convert to UTF-16 and make it ready for Windows API calls?<BR> <BR>Yes.<BR> <BR> &gt; WinNLS does that.<BR> <BR> <BR>I doubt that. There is a 32bit to 16bit conversion?<BR>OK, I guess there is. "Surrogate pairs" and all that?<BR>Maybe not in WinNLS, but easy enough for us to write, in portable C or Modula-3. :)<BR>Part of Text.i3 perhaps.<BR> <BR> <BR>So then, I guess I can sign up for WIDECHAR being 32 bits across the board.<BR> <BR> - Jay<br><BR><div><div id="SkyDrivePlaceholder"></div><hr id="stopSpelling">Subject: Re: [M3devel] Windows, Unicode file names<br>From: dragisha@m3w.org<br>Date: Mon, 25 Jun 2012 23:09:37 +0200<br>CC: dabenavidesd@yahoo.es; m3devel@elegosoft.com<br>To: jay.krell@cornell.edu<br><br><br><div><div>On Jun 25, 2012, at 10:17 PM, Jay K wrote:</div><br class="ecxApple-interchange-newline"><blockquote><span class="ecxApple-style-span" style="font:/normal Helvetica; text-transform: none; text-indent: 0px; letter-spacing: normal; word-spacing: 0px; white-space: normal; border-collapse: separate; orphans: 2; widows: 2;"><div class="ecxhmmessage" style="font-family: Tahoma; font-size: 10pt;"><div dir="ltr">I don't care if WIDECHAR is 16 bits or 32 bits, as long as I can convert from<br>TEXT to a flat array of either, and if 32 bits, walk the array, checking for &gt; 0xFFFF, throw an exception or return some error if any found, narrow to 16 bits, call some "W" function, free the flat array.<br>The size can, I guess, vary between Win32 and non-Win32 platforms.<br></div></div></span></blockquote><div><br></div>a) If you want to make it as unportable as possible then yes - 16 or 32 is not important.</div><div>b) An invalid value would be over 0x10FFFF, not 0xFFFF</div><div>c) Why would you narrow it to 16bit? You need to convert to UTF-16 and make it ready for Windows API calls? WinNLS does that. 
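<div>As a minimal sketch of the difference between the two operations, assuming the input is a plain list of Unicode code points: the function names below are made up for this illustration and are not part of WinNLS or Text.i3, and Python is used only to keep the example short. Recoding to UTF-16 splits anything above 0xFFFF into a surrogate pair, while narrowing can only reject it.</div>
<pre>
```python
def utf32_to_utf16(code_points):
    """Recode Unicode code points (UTF-32 values) into UTF-16 code units."""
    units = []
    for cp in code_points:
        # Surrogate values and anything past U+10FFFF are not valid input.
        if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError("not a Unicode scalar value: %#x" % cp)
        if cp <= 0xFFFF:
            units.append(cp)                     # fits in one 16-bit unit
        else:
            v = cp - 0x10000                     # 20 bits left to split
            units.append(0xD800 + (v >> 10))     # high (lead) surrogate
            units.append(0xDC00 + (v & 0x3FF))   # low (trail) surrogate
    return units


def narrow_to_16(code_points):
    """Plain narrowing: reject anything that does not fit in 16 bits."""
    units = []
    for cp in code_points:
        if cp > 0xFFFF:
            raise ValueError("does not fit in 16 bits: %#x" % cp)
        units.append(cp)
    return units
```
</pre>
<div>With these definitions, U+1F600 recodes to the two units 0xD83D and 0xDE00, whereas narrow_to_16 raises an error for the same input.</div>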
Simple narrowing to 16 bits (similar to what is commented in Text.i3) and recoding from UTF-32 to UTF-16 are very different things.</div><div>d) Size varies, yes.</div><div><br><blockquote><span class="ecxApple-style-span" style="font:/normal Helvetica; text-transform: none; text-indent: 0px; letter-spacing: normal; word-spacing: 0px; white-space: normal; border-collapse: separate; orphans: 2; widows: 2;"><div class="ecxhmmessage" style="font-family: Tahoma; font-size: 10pt;"><div dir="ltr">Its size should be stored in a global to communicate between Modula-3 and C.<br> <br> <br>I'd also quite like it if TEXT was internally represented as a nul terminated flat array of 8 and/or 16 and/or 32bit quantities, materializing some of them on demand. But I suspect that flat and readonly and exposing a concat operation are in conflict. I'm not sure. MFC uses a flat reference counted nul terminated representation and it works pretty well. It doesn't materialize-on-demand other widths.<br> <br> - Jay <br><div><div id="ecxSkyDrivePlaceholder"></div><hr id="ecxstopSpelling">Subject: Re: [M3devel] Windows, Unicode file names<br>From: <a href="mailto:dragisha@m3w.org">dragisha@m3w.org</a><br>Date: Mon, 25 Jun 2012 21:48:09 +0200<br>CC: <a href="mailto:dabenavidesd@yahoo.es">dabenavidesd@yahoo.es</a>; <a href="mailto:m3devel@elegosoft.com">m3devel@elegosoft.com</a><br>To: <a href="mailto:jay.krell@cornell.edu">jay.krell@cornell.edu</a><br><br><div>It can be what cm3 people had in mind when they created WIDECHAR as a catchall for Unicode.</div><div><br></div><div>At first glance it looked like no solution to me, but after counting to ten - I think it is. 
We can have a UTF-8 layer and use it when and where needed, to recode our strings to the catchall WIDECHAR/WIDETEXT.</div><div><br></div><div>As long as we agree on what exactly WIDECHAR is :)</div><div>=== From Wikipedia</div><div>The Microsoft Windows<span class="ecxApple-converted-space"> </span><a title="Application programming interface" href="http://en.wikipedia.org/wiki/Application_programming_interface" target="_blank">application programming interfaces</a><span class="ecxApple-converted-space"> </span><a title="Win32" class="ecxmw-redirect" href="http://en.wikipedia.org/wiki/Win32" target="_blank">Win32</a><span class="ecxApple-converted-space"> </span>and<span class="ecxApple-converted-space"> </span><a title="Win64" class="ecxmw-redirect" href="http://en.wikipedia.org/wiki/Win64" target="_blank">Win64</a>, as well as the<span class="ecxApple-converted-space"> </span><a title="Java (software platform)" href="http://en.wikipedia.org/wiki/Java_%28software_platform%29" target="_blank">Java</a><span class="ecxApple-converted-space"> </span>and<span class="ecxApple-converted-space"> </span><a title=".Net Framework" class="ecxmw-redirect" href="http://en.wikipedia.org/wiki/.Net_Framework" target="_blank">.Net Framework</a><span class="ecxApple-converted-space"> </span>platforms, require that wide character variables be defined as 16-bit values, and that characters be encoded using<span class="ecxApple-converted-space"> </span><a title="UTF-16" href="http://en.wikipedia.org/wiki/UTF-16" target="_blank">UTF-16</a><span class="ecxApple-converted-space"> </span>(due to former use of UCS-2), while modern<span class="ecxApple-converted-space"> </span><a title="Unix" href="http://en.wikipedia.org/wiki/Unix" target="_blank">Unix</a>-like systems generally require 32-bit values encoded using<span class="ecxApple-converted-space"> </span><a title="UTF-32" href="http://en.wikipedia.org/wiki/UTF-32" target="_blank">UTF-32</a><sup class="ecxTemplate-Fact" style="white-space: 
nowrap;">[<i><a title="Wikipedia:Citation needed" href="http://en.wikipedia.org/wiki/Wikipedia:Citation_needed" target="_blank"><span title="This claim needs references to reliable sources from May 2012">citation needed</span></a></i>]</sup>.</div><div>===</div><div><br></div><div><br><div><div>On Jun 25, 2012, at 9:39 PM, Jay K wrote:</div><br class="ecxApple-interchange-newline"><blockquote><div class="ecxhmmessage" style="font-family: Tahoma; font-size: 10pt;"><div dir="ltr">I think I know what to do here and will look into it..later..<br> <br>We have TEXT. We should just always get WIDECHARs out of it and call CreateFileW.<br>Assuming UTF8 is the wrong solution at this level, and passing in UTF8 won't work with the correct solution.<br>A layer above this needs to decode UTF8, if that is the encoding.<br> <br>Unless someone has declared and implemented that TEXT is in fact always UTF8-encoded, which I doubt.<br> <br> - Jay <br><div><div id="ecxSkyDrivePlaceholder"></div><hr id="ecxstopSpelling">From:<span class="ecxApple-converted-space"> </span><a href="mailto:dragisha@m3w.org">dragisha@m3w.org</a><br>Date: Mon, 25 Jun 2012 21:05:59 +0200<br>To:<span class="ecxApple-converted-space"> </span><a href="mailto:dabenavidesd@yahoo.es">dabenavidesd@yahoo.es</a><br>CC:<span class="ecxApple-converted-space"> </span><a href="mailto:m3devel@elegosoft.com">m3devel@elegosoft.com</a><br>Subject: Re: [M3devel] Windows, Unicode file names<br><br>If you cared enough to check FSWin32.m3, answer would be obvious :).<div><br></div><div>Whatever I do with pathname before I call FS.OpenFile(Readonly)? - FSWin32.m3 will call CreateFileA. 
My solution is:</div><div><br></div><div><div>PROCEDURE OpenFileReadonly(p: Pathname.T): File.T RAISES {OSError.E}=</div><div> VAR</div><div> handle: WinNT.HANDLE;</div><div> fname := M3toC.SharedTtoS(p);</div><div> dwNum := WinNLS.MultiByteToWideChar (WinNLS.CP_UTF8, 0, fname, -1, NIL, 0);</div><div> pwText: WinBaseTypes.PCWSTR; </div><div> BEGIN</div><div> IF dwNum = 0 OR dwNum = Text.Length(p) + 1 THEN</div><div> (* dwNum includes terminating null character. that's +1 above.</div><div> *)</div><div> handle := WinBase.CreateFile(</div><div> lpFileName := fname,</div><div> dwDesiredAccess := WinNT.GENERIC_READ,</div><div> dwShareMode := WinNT.FILE_SHARE_READ,</div><div> lpSecurityAttributes := NIL,</div><div> dwCreationDisposition := WinBase.OPEN_EXISTING,</div><div> dwFlagsAndAttributes := 0,</div><div> hTemplateFile := NIL);</div><div> ELSE</div><div> pwText := LOOPHOLE(NEW(UNTRACED REF ARRAY OF CHAR, dwNum*2), WinBaseTypes.PCWSTR);</div><div> EVAL WinNLS.MultiByteToWideChar (WinNLS.CP_UTF8, 0, fname, -1, pwText, dwNum);</div><div> handle := WinBase.CreateFileW(</div><div> lpFileName := pwText,</div><div> dwDesiredAccess := WinNT.GENERIC_READ,</div><div> dwShareMode := WinNT.FILE_SHARE_READ,</div><div> lpSecurityAttributes := NIL,</div><div> dwCreationDisposition := WinBase.OPEN_EXISTING,</div><div> dwFlagsAndAttributes := 0,</div><div> hTemplateFile := NIL);</div><div> DISPOSE(pwText);</div><div> END;</div><div><br></div><div> IF LOOPHOLE(handle, INTEGER) = WinBase.INVALID_HANDLE_VALUE THEN</div><div> Fail(p, fname);</div><div> END;</div><div> M3toC.FreeSharedS(p, fname);</div><div> RETURN FileWin32.New(handle, FileWin32.Read)</div><div> END OpenFileReadonly;</div><div><br></div><div>And similar in OpenFile. Not nice :).</div><div><br></div><div>Also, I've added CP_UTF8 constant to WinNLS.i3.</div><div><br><div><div>On Jun 25, 2012, at 9:01 PM, Daniel Alejandro Benavides D. 
wrote:</div><br class="ecxApple-interchange-newline"><blockquote><table border="0" cellSpacing="0" cellPadding="0"><tbody><tr><td vAlign="top" style="font: inherit; font-size-adjust: inherit; font-stretch: inherit;">Hi all:<br>So do you need a Double-Byte Character String module, as currently in the TEXT types? But you can do that already, can't you?<br>Thanks in advance<br><br>--- On<span class="ecxApple-converted-space"> </span><b>Mon, 25/6/12, Dragiša Durić<span class="ecxApple-converted-space"> </span><i>&lt;<a href="mailto:dragisha@m3w.org">dragisha@m3w.org</a>&gt;</i></b><span class="ecxApple-converted-space"> </span>wrote:<br><blockquote style="padding-left: 5px; margin-left: 5px;"><br>From: Dragiša Durić &lt;<a href="mailto:dragisha@m3w.org">dragisha@m3w.org</a>&gt;<br>Subject: Re: [M3devel] Windows, Unicode file names<br>To: "Daniel Alejandro Benavides D." &lt;<a href="mailto:dabenavidesd@yahoo.es">dabenavidesd@yahoo.es</a>&gt;<br>CC: "m3devel" &lt;<a href="mailto:m3devel@elegosoft.com">m3devel@elegosoft.com</a>&gt;<br>Date: Monday, June 25, 2012 13:20<br><br><div id="ecxyiv395665588"><div>Yes, they exposed parts of NLS. That's how the problem can be solved, albeit partially: by using the methods exposed there.<div><br></div><div>What we don't have is a way to communicate the actual encoding of a string to the FS module, so that FS methods can handle filenames accordingly.</div><div><br></div><div><div><div>On Jun 25, 2012, at 8:06 PM, Daniel Alejandro Benavides D. 
wrote:</div><br class="ecxyiv395665588Apple-interchange-newline"><blockquote><table border="0" cellSpacing="0" cellPadding="0"><tbody><tr><td vAlign="top" style="font: inherit; font-size-adjust: inherit; font-stretch: inherit;">Hi all:<br>OK, good: the Win32 API dealt with inter-NLS (National Language Support) issues, at the level of ASCII and other formats, with the NLS API.<br>But it appears not to have been used for the DEC-SRC WinNT port of Modula-3 (but for CM3, though it isn't compiled on the elego servers, but here):<br><a href="http://www.cs.purdue.edu/homes/hosking/m3/help/gen_html/m3core/src/win32/WinNLS.i3.html" target="_blank" rel="nofollow">http://www.cs.purdue.edu/homes/hosking/m3/help/gen_html/m3core/src/win32/WinNLS.i3.html</a><br><br>Thanks in advance<br><br>--- On<span class="ecxApple-converted-space"> </span><b>Mon, 25/6/12, Dragiša Durić<span class="ecxApple-converted-space"> </span><i>&lt;<a href="mailto:dragisha@m3w.org">dragisha@m3w.org</a>&gt;</i></b><span class="ecxApple-converted-space"> </span>wrote:<br><blockquote style="padding-left: 5px; margin-left: 5px;"><br>From: Dragiša Durić &lt;<a href="mailto:dragisha@m3w.org">dragisha@m3w.org</a>&gt;<br>Subject: Re: [M3devel] Windows, Unicode file names<br>To: "Daniel Alejandro Benavides D." &lt;<a href="mailto:dabenavidesd@yahoo.es">dabenavidesd@yahoo.es</a>&gt;<br>CC: "m3devel" &lt;<a href="mailto:m3devel@elegosoft.com">m3devel@elegosoft.com</a>&gt;<br>Date: Monday, June 25, 2012 12:36<br><br><div id="ecxyiv395665588"><div>Daniel,<div><br></div><div>I can talk about many things, and most things Modula-3 are of interest to me. Once you start a topic, and I can understand what it is about, and it meets my interests - I'll be there.</div><div><br></div><div>The problem I met with filenames is nothing old. Windows can open files with filenames in ASCII and UTF-16. 
For everything else, you must check twice and do a workaround.</div><div><br></div><div>I've written here in the hope that I can get into some fruitful discussion with people who understand this problem. My solution is a workaround and assumes the filename is UTF-8 or ASCII. I would like to start a discussion on this and work from there to a more general solution.</div><div><br></div><div>dd</div><div><br><div><div>On Jun 25, 2012, at 7:27 PM, Daniel Alejandro Benavides D. wrote:</div><br class="ecxyiv395665588Apple-interchange-newline"><blockquote><span class="ecxyiv395665588Apple-style-span" style="text-transform: none; text-indent: 0px; letter-spacing: normal; word-spacing: 0px; white-space: normal; border-collapse: separate; orphans: 2; widows: 2;"><table border="0" cellSpacing="0" cellPadding="0"><tbody><tr><td vAlign="top" style="font: inherit; font-size-adjust: inherit; font-stretch: inherit;">Hi all:<br>As I understood it, I thought you didn't want to talk about a W 95 / NT compatible distro of Modula-3.<br>But in turn you want to keep compatibility with older file name encodings.<br>I don't care about that, but if it's useful anyway (because newer Windows versions don't care at all either), I don't know what your problem was, because it won't be able to be solved!<br>Thanks in advance</td></tr></tbody></table></span></blockquote></div></div></div></div></blockquote></td></tr></tbody></table></blockquote></div></div></div></div></blockquote></td></tr></tbody></table></blockquote></div></div></div></div></div></div></blockquote></div></div></div></div></div></span></blockquote></div><br></div> </div></body>
</html>