[M3devel] FW: more path stuff, sorry

Tony Hosking hosking at cs.purdue.edu
Wed Apr 23 15:06:12 CEST 2008


Jay, Daniel was referring to PM3, not CM3.

Antony Hosking | Associate Professor | Computer Science | Purdue  
University
305 N. University Street | West Lafayette | IN 47907 | USA
Office +1 765 494 6001 | Mobile +1 765 427 5484



On Apr 23, 2008, at 12:44 AM, Jay wrote:

> Daniel, this should address SOME but not necessarily all of your  
> concerns and questions:
>
> The underlying libraries are mostly Cygwin.
> Underlying THEM is another library, but the way it works, there is  
> approximately zero "break through".
> Except for threads.
> Also, I intend, somehow, to expose FilePosix.T and FileWin32.T,  
> besides the regular File.T.
> This would be part of enabling the serial package to build, using  
> strictly "break through" and the Win32 code.
> However FilePosix.m3 and FileWin32.m3 both reveal File.T. I need to  
> split the revelation out into a separate interface/module I think,  
> shouldn't be hard.
>
> From SOME point of view, forward slashes are perfectly acceptable  
> in Win32 paths and have the same meaning as backward slashes. It is  
> not 100%, but largely. This isn't Modula-3 code doing any  
> "translation", but Modula-3 does treat \ and / as equivalent upon  
> read.
>
> The lowest level Win32 file functions in kernel32.dll -- CreateFile,  
> DeleteFile, MoveFile, CreateDirectory, RemoveDirectory, CopyFile,  
> GetCurrentDirectory, SetCurrentDirectory, GetFileAttributes[Ex],  
> FindFirst[Next]File[Ex], GetFullPathName, GetShortPathName, maybe  
> CreateProcess, what else am I forgetting? -- usually consider \  
> and / the same, unless the path starts with \\?\. The / are all  
> converted to \ before getting to the underlying system, NtCreateFile,  
> where strictly \ is the separator and there is no working directory.  
> If a path starts with \\?\, it is passed on directly. You don't have to  
> pass full paths. You can pass a handle to a parent directory.
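
A minimal sketch of the slash rule described above, in Python. It only models the behaviour as stated in this message (forward slashes become backslashes unless the path carries the `\\?\` prefix); the function name is hypothetical and this is not the actual kernel32 logic.

```python
def normalize_win32_path(path: str) -> str:
    """Rough model of the rule above; not the real kernel32 implementation."""
    # Paths that start with \\?\ are passed on directly, untouched.
    if path.startswith("\\\\?\\"):
        return path
    # Otherwise every forward slash is treated as a backslash.
    return path.replace("/", "\\")
```

Under this model a mix of slashes such as `c:/foo\bar/baz` comes out with backslashes only, while a `\\?\`-prefixed path keeps any forward slash it contains.
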
>
> Now, I recently read, regarding NTFS-3G for Linux, about allowing :  
> and \ in file names.
> The claim is that if you install SFU (Services For Unix, previously  
> Interix, now SUA), you can access such names on NT.
> I tried it out. In fact this claim seems wrong. I created those  
> paths, and then enumerated and printed them from Win32. Their lower 8  
> bits were : and \. But their upper 8 bits were 0xF0 or 0xFF or  
> somesuch. SFU has some ability to access unicode file names, but none  
> that I could find that would reveal this trickery. In particular,  
> there is something like wcs_opendir that opens a directory with a  
> unicode path, but there is only the ascii/ansi/utf8/7bit/8bit/whatever  
> readdir/readdir_r. So the 0xF000 cannot show through.
> And proper unicode Win32 code still doesn't see ':' or '\' per se in  
> file names (in paths, but not individual names).
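
One way to picture the names observed above, as a Python sketch. It assumes the stored character is the original code ORed with 0xF000; the message only says the upper bits were "0xF0 or 0xFF or somesuch", so the offset, the function names, and the set of remapped characters are all assumptions made for illustration.

```python
PUA_BASE = 0xF000  # assumed offset; the message does not pin it down


def to_private_use(name: str, reserved: str = ":\\") -> str:
    """Move the reserved characters into the assumed 0xF0xx range."""
    return "".join(
        chr(PUA_BASE | ord(c)) if c in reserved else c for c in name
    )


def low_byte_view(name: str) -> str:
    """What a reader sees if it keeps only the lower 8 bits of each unit."""
    return "".join(chr(ord(c) & 0xFF) for c in name)
```

With this mapping a Unicode-aware reader never sees a literal `:` or `\` in the stored name, while a reader that looks only at the low byte does.
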
>
> I think I happened to accidentally copy this stuff to a Linux  
> machine, and they didn't come across as ':' or '\'.
> Maybe I used wine cmd copy though. I should try with NTFS-3G. I do  
> suspect I'll get ':' and '\'.
>
> Speaking of portability -- should Modula-3 on Posix systems allow  
> file names with ':' and '\'? There are pluses and minuses. It's a  
> rhetorical question -- in that, I don't really want to discuss it  
> that much. I don't think there's a good answer, so I'd just as soon  
> leave the code alone vs. trying in vain to come up with what is  
> "correct".
>
> Yes there are Cygwin specific functions for converting paths.
> It would be reasonable for some NT386GNU-specific paths to use them.
> You can write up the *.i3 files and call them where you deem  
> appropriate.
>   Where deemed appropriate is not clear.
> It would not be great for the NT386 tools to use them -- to link to  
> cygwin1.dll, for any reason.
> Maybe via LoadLibrary/GetProcAddress -- an optional dependency. Maybe.
> <I wrote up a bunch of stuff about licensing, and deleted it all.>
> I believe "linkage" should mean "static linkage in the same file"  
> and not "dynamic linkage in the same process" but it isn't up to me  
> and it very well might be this way. I think the Linux kernel grants  
> an exception, but I'm skeptical that non-GPL code doesn't  
> dynamically link to in-proc GPL code that hasn't granted an  
> exception. Anyway, Modula-3 except for NT386GNU is not in any  
> uniquely worrying boat, for sure.
>
> You see, like my little speech about emulation layers and "break  
> through", Cygwin is an emulation layer.
> But its users may or may not consider it to have much "break through".
> Its users might be very accustomed to Posix paths and a full set of  
> Posix-path-using tools, and have no need to give any of them a Win32  
> path, and if they do, might be prepared to stop and think about it  
> and do their own conversion.
>
> I think it's gray, because gcc as a compiler is arguably way more  
> valuable than a Posix porting layer for other code.
> Or really, they are both useful. Some people will want one, or the  
> other, or both, or neither.
>
> And again I am confusing Cygwin, NT386GNU, and the backend vs. the  
> runtime.
>
> You can very well combine the integrated backend with a Posixish  
> runtime -- remember all the systems now have the integrated backend,  
> you should be able to output NT386 .obj files on any platform just  
> by saying TARGET= "NT386" in cm3.cfg.
>
> And you can combine the gcc backend with a Win32 runtime -- that is  
> what NT386MINGNU is.
>
> Also, while kernel32.dll treats / like \, other Win32  
> functions/libraries do not, such as comdlg32.dll (File.Open,  
> File.Save), and shlwapi.dll, I think. You'd have to read the docs  
> and/or experiment.
> Win32 has a lot of nice internal consistencies, but it isn't 100%  
> consistent. Probably nothing is.
>
> A mix of slashes works fine on NT386.
>
> Does some of this make sense?
>
> Again again again, "NT386GNU" is different things to different  
> people, and some of them are in fact independent and you can pick  
> and choose which parts you compose. There is the runtime the compiler  
> uses. There is the backend. There is the library the output uses,  
> both the "C library" (fopen), the threading library (pthreads vs.  
> Win32), the windowing library (X Windows vs. Win32), the naming  
> conventions (libfoo.a vs. foo.lib). Nearly every one of these  
> factors is independent of the others. The config files are written  
> as such and you should be able to experiment and make your own other  
> combinations. Three such combinations are provided, and there's sort  
> of a tendency for the host and the target to be the same, but they  
> don't have to be. There are other factors, like which C compiler do  
> you use, which linker. The compiler, backend, and linker conspire to  
> determine which debugger you can use. If you use the integrated  
> backend and MS linker, you can use MS debuggers. If you use the gcc  
> backend and GNU linker, you can use gdb. Oh, well you can always use  
> either debugger, but the point here is if you have symbols or not.  
> Currently there is some problem mixing the gcc backend with the MS  
> linker and/or the integrated backend with the GNU linker. I think  
> gcc backend with MS linker, and only sometimes the other. There is a  
> symbol always injected into the .o files that the GNU linker knows  
> about but that the MS linker gives as unresolved. The MS linker has  
> a different name for it, but I don't believe it is always injected  
> by the integrated backend. This is the symbol that lets you point to  
> the start of your image, it is __ImageBase in MS. It isn't used  
> much. But for some reason gcc always generates references to it, and  
> gives it a different name. Therefore some of the supposedly  
> independent factors are not independent as they should be.
>
> For network paths, Cygwin allows //machine/server.
>
> Current Cygwin issues warnings when you use Win32 paths, and the  
> warning says you can quash it by setting CYGWIN=nodospathwarning or  
> somesuch.
>
> Does this make some sense?
>
> At some point maybe I'll put together distributions for other  
> combinations.
>
> Oh, one more thing, the gcc backend can use __int64.
> I was feeling guilty about taking so long adding that to the  
> integrated backend, thus I fast-forwarded and got NT386GNU to work.  
> I should still go back and add it though.
>
> Also, currently the compiler assumes some linkage of these factors.  
> It assumes OS_TYPE=POSIX means targeting Cygwin. Eh that's pretty  
> much correct. At some point maybe UWin and SFU will be options, but  
> not right now. The compiler's main concern is the jumpbuf size,  
> which is much larger for Cygwin. You can to some extent maybe link  
> NT386 and NT386GNU together, but this jumpbuf thing could really  
> hurt in a hard to diagnose way. (Maybe it's safe due to setjmp vs.  
> _setjmp?)?
>
> Sorry this is too long. As Olaf said, what I say might be very true  
> and an accurate model, but it confuses everyone.
> People want very very few variables.
>
> Sorry I have to run, no proofreading this "masterpiece". <Some  
> self-deprecating humor inserted here.>
>
>
>  - Jay
>
> Date: Wed, 23 Apr 2008 06:00:43 +0200
> From: dabenavidesd at yahoo.es
> To: m3devel at elegosoft.com
> Subject: Re: [M3devel] FW: more path stuff, sorry
>
> Hi all:
> Does pm3 treat NT386GNU paths as POSIX class? I think that, if one  
> takes the definition in the Pathname interface, the NT386GNU target  
> should not use POSIX paths directly but Win32 ones, because of:
> "A Pathname.T (or just a pathname) is a text conforming to the  
> syntax of the underlying operating system" in
> http://www.opencm3.net/doc/help/gen_html/libm3/src/os/Common/Pathname.i3.html
> But if we think of "the underlying operating system" as the cygwin  
> api, it should naturally be POSIX paths.
> I agree Pathname is a common interface, but what happens to the  
> sense of portability we want? Maybe the answer is in the cygwin api:  
> the non-standard functions for converting one style to the other:
> http://cygwin.com/cygwin-api/cygwin-functions.html
> See http://www.cygwin.com/cygwin-ug-net/using.html#using-pathnames
> for the main notes on compatibility. It seems clear that cygwin can  
> use both kinds of path style, so it's not a problem to use just one  
> of POSIX or Win32 paths. So, by defining Pathname as POSIX style  
> for NT386GNU, are we not giving up the more capable functions of  
> cygwin?
> I mean, if Pathname accepts both forward and back slashes in  
> different Pathnames, a new kind of Pathname syntax should be defined.  
> Oh, I remember seeing a mix of both kinds of slashes in the pm3  
> quake/config files of NT386GNU :) Is this really confusing or  
> unthinkable?
> I think this is clear for some of you (NT386GNU should just accept  
> POSIX paths, not even a Win32 one, and obviously not a mix of the  
> two syntaxes in the same path), but it is not for me.
> In my opinion both styles should be acceptable without defining a  
> new kind of path syntax, converting only with the cygwin api  
> functions. For instance, when using netbios resources but also trying  
> to mount disk partitions in the cygwin mount table file, we need  
> both kinds of syntax. Is that even logical?
> Thanks in advance.
> Thanks in advance.
>
> Tony Hosking <hosking at cs.purdue.edu> wrote:
> Jay, there is an underlying principle here that you seem to be  
> missing so I will make it explicit.  Cross-product systems tend to  
> acquire unmanageable complexity, especially when it comes to  
> testing.  By making each target one personality we are able to test  
> machine-dependent code in isolation from machine-independent code.   
> Often, when something breaks on one target I will test on another  
> just to isolate the problem.  This is a very powerful approach and I  
> am very leery of destroying any ability to do this -- it allows us  
> to maintain the high-level portability of most Modula-3 code while  
> isolating the small fraction of machine-/target-dependent stuff.
>
>
>
> On Apr 22, 2008, at 12:28 AM, Jay wrote:
>
> [truncated]...
>
>
>
>
> From: jayk123 at hotmail.com
> To: rcoleburn at scires.com; m3devel at elegosoft.com
> Subject: RE: [M3devel] more path stuff, sorry
> Date: Tue, 22 Apr 2008 04:19:52 +0000
>
> There's no dependency, no linkage. Just a few simple string  
> operations.
> I'll probably remove it tonight.
>
> Modula-3 has a split personality no matter what, in that it calls  
> into very varying underlying layers, often trafficking in their  
> specific data formats.
> It's just that it can strive to aid portability between them or not,  
> by accepting either input and massaging it to work, vs. passing it  
> along "unchanged" (well, that's not what happens actually). If there  
> were no ambiguous cases and the Posix systems on Windows were  
> consistent in their conventions, I'd be more for it. But the  
> ambiguity and varying Posix conventions weaken the case tremendously.
>
>  - Jay
>
> Date: Mon, 21 Apr 2008 23:14:00 -0400
> From: rcoleburn at scires.com
> To: m3devel at elegosoft.com
> Subject: Re: [M3devel] more path stuff, sorry
>
> I concur wholeheartedly.  I absolutely DO NOT want native NT386 to  
> have any knowledge of or dependency on Cygwin.
> Regards,
> Randy
>
> >>> Tony Hosking <hosking at cs.purdue.edu> 4/21/2008 9:44 PM >>>
> Why would native NT386 know anything at all about Cygwin?  I say  
> just avoid split personalities like the plague.  Similarly, I'd be  
> happy for NT386GNU (i.e., Cygwin?) to simply behave like a POSIX  
> build (modulo native threads perhaps).
>
> On Apr 21, 2008, at 7:15 PM, Jay wrote:
>
> Maybe this is dubious.
>
> The question is, like, should native NT386 cm3 accept  
> /cygdrive/c/foo and translate to c:\foo?
> Or trickier, /usr/bin/foo and translate to c:\cygwin\usr\bin\foo?
> And vice versa, should NT386GNU accept c:\foo?
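
A minimal Python sketch of the translation being asked about, using plain string operations only. The function name is hypothetical, and the install root `c:\cygwin` is taken from the example above; a real installation could live elsewhere, and Cygwin's own mount table is not consulted here.

```python
CYGWIN_ROOT = "c:\\cygwin"  # from the example in the message; real installs vary


def cygwin_to_win32(path: str, root: str = CYGWIN_ROOT) -> str:
    """Translate a Cygwin-style path to a Win32-style one (illustrative only)."""
    prefix = "/cygdrive/"
    if path.startswith(prefix) and len(path) > len(prefix):
        # /cygdrive/<letter>/rest  ->  <letter>:\rest
        drive, _, rest = path[len(prefix):].partition("/")
        if len(drive) == 1 and drive.isalpha():
            return drive + ":\\" + rest.replace("/", "\\")
    if path.startswith("/"):
        # Any other absolute path is placed under the assumed install root.
        return root + path.replace("/", "\\")
    # Relative paths and paths that already look like Win32 are left alone.
    return path
```

The second case is where the ambiguity shows up: on native NT386 a path such as `/usr/bin/foo` is already a valid Win32 path relative to the current drive, so a translator cannot tell which meaning the caller intended.
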
>
> ...
>
>
>
