[M3devel] FW: more path stuff, sorry

Jay jayk123 at hotmail.com
Wed Apr 23 07:17:43 CEST 2008


 > Types are important. I think much more so than modules.
But user input, especially in the form of a command line, is ambiguous and not strongly typed, and therefore sometimes open to "sniffing": loose interpretation, guessing what was intended. There IS a place for guessing, but not everywhere.
 
For example maybe?: 
 cp foo bar 
 
What does that do? Copy one file or a directory full of files? 
 
A really bad example, what does this command line do:
 
c:\program files\microsoft visual studio\cl.exe
 
Does it run "cl.exe" in the directory "c:\program files\microsoft visual studio"?
Or does it run the "program" in c:\ with parameters "files\microsoft", "visual", "studio\cl.exe"?
 
I BELIEVE the answer is that it DEPENDS.
 
Both can exist at the same time -- "c:\program" and "C:\program files"
 
I BELIEVE if "c:\program" exists, it is run.
I'm not sure.
 
CreateProcess has two parameters here, the path to the .exe and the command line, so there is a way to disambiguate, and execve is probably unambiguous. But system() is ambiguous, and CreateProcess has an ambiguous mode, since you don't have to use both parameters: you can leave the first NULL, or something like that. It's considered poor practice.
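To make the "it DEPENDS" concrete, here is a minimal Python sketch of the order in which an unquoted command line gets tried when no separate application name is given. It only mimics the behaviour Microsoft documents for CreateProcess with a NULL first parameter; the function name candidate_programs is made up for illustration, and the real API would also check whether each candidate exists on disk.

```python
def candidate_programs(command_line):
    """List the program paths that would be tried, in order.

    Assumption: this imitates the documented CreateProcess behaviour for an
    unquoted command line with no application name; it is not the real parser.
    """
    tokens = command_line.split(" ")
    candidates = []
    for count in range(1, len(tokens) + 1):
        # Treat the first `count` space-separated tokens as the program path.
        prefix = " ".join(tokens[:count])
        last_component = prefix.rsplit("\\", 1)[-1]
        # A name without an extension gets ".exe" appended.
        if "." not in last_component:
            prefix += ".exe"
        candidates.append(prefix)
    return candidates


for path in candidate_programs(r"c:\program files\microsoft visual studio\cl.exe"):
    print(path)
```

The first candidate is c:\program.exe, which is why a stray "c:\program" can hijack the command; quoting the path or passing the application name separately avoids the guessing.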
 
Command lines are awfully weakly typed.
You know -- you need to hit those keys harder. :)
 
Or watch this, sometimes you don't need spaces between command line parameters:
 
D:\>dir cd foo => nothing 
 
D:\>mkdir foo
D:\>cd\foo
 
=> changes into the foo directory.
 
let's go back up:
cd \
 
and now...
enable running extensionless files:
 
set PATHEXT=.;%PATHEXT%
 
create one..
 mkdir \cd 
 copy %windir%\notepad.exe \cd\foo 
 
and..:
 
cd\foo
 
now cd\foo runs the copy of notepad.
 
ambiguity abounds...
 
 
 - Jay


From: jayk123 at hotmail.com
To: dabenavidesd at yahoo.es; m3devel at elegosoft.com
Date: Wed, 23 Apr 2008 04:59:51 +0000
Subject: Re: [M3devel] FW: more path stuff, sorry


I forgot some points.

Daniel, in order to convert paths between the two, you need context to know what type the provider of the path intended. There are paths that are valid in both systems but that have different meanings.

The path /usr/bin in Win32 is the same as \usr\bin, on the drive of the current working directory (the working drive, if you will).
The path /usr/bin in Cygwin is \usr\bin under the Cygwin root, so often something like c:\cygwin\usr\bin.

The path /cygdrive/c/foo is also valid in Win32, and has a different meaning there than in Cygwin.

Get it?

The path c:/foo is unambiguous. The "string" c:/foo is ambiguous, depending on point of view.
Let's say I want a string that contains colon delimited paths.
Then this is the path c and the path /foo.
But if paths themselves can have colons then it is also the single element list with the element c:/foo.

You could get fancy shmancy and escape the colon, c\:/foo, but that's insane.
Quoting is already a big problem in existing systems and usage, and I just made that garbage up.

One of the points maybe the Modula-3 folks are alluding to is that Pathname.T is a TYPE with an INTERFACE. It is not a "string".
In many other systems a "path" is "just" a "string"; it just gets passed around, and both sides party on it and hope they give it the same meaning.
If your notion of type is not clear, between Win32 and Posix, then neither is the meaning.

Strings are a terribly loose overused type....
This does save people from coming up with those fancy complicated difficult to learn INTERFACES, and it aids interoperability, since everyone can use a string... it's good and it's bad, the glass is half full and half empty...

c:\ "looks" like a win32 path, but hey, be careful there, I'm just smiling big and wearing a crooked hat, you have interpreted it completely incorrectly, eh?

Types are important. I think much more so than modules.

And /foo/ looks like a Posix path, but actually it's italicized text. :)
Or then again, this whole email is a valid Posix path, newlines and all, since it has no embedded nuls. :)

 - Jay
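The "same string, different meaning" point can be tried out with Python's standard ntpath and posixpath modules, which apply Win32 and Posix rules respectively on any host. This is only a sketch; the helper name interpretations is invented for the example, and the colon split stands in for a PATH-style list.

```python
import ntpath
import posixpath


def interpretations(text):
    """Return three readings of the same string (illustrative helper)."""
    return {
        # Win32 rules: a drive letter followed by a rooted path.
        "win32_drive_and_rest": ntpath.splitdrive(text),
        # Posix rules: no drives, so "c:" is just a relative directory name.
        "posix_is_absolute": posixpath.isabs(text),
        # Colon-delimited list rules: two separate elements.
        "colon_list": text.split(":"),
    }


for name, value in interpretations("c:/foo").items():
    print(name, "=", value)
```

Nothing in the string itself says which reading is right, which is the argument for a path type with its own interface rather than a bare string.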


From: jayk123 at hotmail.com
To: dabenavidesd at yahoo.es; m3devel at elegosoft.com
Subject: RE: [M3devel] FW: more path stuff, sorry
Date: Wed, 23 Apr 2008 04:44:23 +0000

Daniel, this should address SOME but not necessarily all of your concerns and questions:

The underlying libraries are mostly Cygwin.
Underlying THEM is another library, but the way it works, there is approximately zero "break through".
Except for threads.
Also, I intend, somehow, to expose FilePosix.T and FileWin32.T, besides the regular File.T.
This would be part of enabling the serial package to build, using strictly "break through" and the Win32 code.
However FilePosix.m3 and FileWin32.m3 both reveal File.T. I need to split the revelation out into a separate interface/module I think, shouldn't be hard.

From SOME point of view, forward slashes are perfectly acceptable in Win32 paths and have the same meaning as backward slashes. It is not 100%, but largely. This isn't Modula-3 code doing any "translation" but the Modula-3 code does treat \ and / as equivalent upon read.

The lowest level Win32 file functions in kernel32.dll -- CreateFile, DeleteFile, MoveFile, CreateDirectory, RemoveDirectory, CopyFile, GetCurrentDirectory, SetCurrentDirectory, GetFileAttributes[Ex], FindFirst[Next]File[Ex], GetFullPathName, GetShortPathName (what else am I forgetting? maybe CreateProcess?) -- usually consider \ and / the same. Unless the path starts with \\?. The / are all converted to \ before getting to the underlying system, NtCreateFile, where strictly \ is the separator and there is no working directory. If a path starts with \\?, it is passed on directly. You don't have to pass full paths. You can pass a handle to a parent directory.

Now, I recently read, regarding NTFS-3G for Linux, about allowing : and \ in file names.
The claim is that if you install SFU (Services For Unix, previously Interix, now SUA), you can access such names on NT.
I tried it out. In fact this claim seems wrong. I created those paths, and then enumerated and printed them from Win32. Their lower 8 bits were : and \. But their upper 8 bits were 0xF0 or 0xFF or somesuch.
SFU has some ability to access unicode file names, but not in a way that I could find that would reveal this trickery. In particular, there is something like wcs_opendir that opens a directory with a unicode path, but there is only the ascii/ansi/utf8/7bit/8bit/whatever readdir/readdir_r. So the 0xF000 cannot show through.
And proper unicode Win32 code still doesn't see ':' or '\' per se in file names (in paths, but not individual names).

I think I happened to accidentally copy this stuff to a Linux machine, and they didn't come across as ':' or '\'.
Maybe I used wine cmd copy though. I should try with NTFS-3G. I do suspect I'll get ':' and '\'.

Speaking of portability -- should Modula-3 on Posix systems allow file names with ':' and '\'? There are pluses and minuses. It's a rhetorical question -- in that, I don't really want to discuss it that much. I don't think there's a good answer, so I'd just as soon leave the code alone vs. trying in vain to come up with what is "correct".

Yes there are Cygwin specific functions for converting paths.
It would be reasonable for some NT386GNU-specific paths to use them.
You can write up the *.i3 files and call them where you deem appropriate. Where deemed appropriate is not clear.
It would not be great for the NT386 tools to use them -- to link to cygwin1.dll, for any reason.
Maybe via LoadLibrary/GetProcAddress -- an optional dependency. Maybe.

<I wrote up a bunch of stuff about licensing, and deleted it all.>
I believe "linkage" should mean "static linkage in the same file" and not "dynamic linkage in the same process" but it isn't up to me and it very well might be this way. I think the Linux kernel grants an exception, but I'm skeptical that non-GPL code doesn't dynamically link to in-proc GPL code that hasn't granted an exception. Anyway, Modula-3 except for NT386GNU is not in any uniquely worrying boat, for sure.
You see, like my little speech about emulation layers and "break through", Cygwin is an emulation layer.
But its users may or may not consider it to have much "break through".
Its users might be very accustomed to Posix paths and a full set of Posix-path-using tools, and have no need to give any of them a Win32 path, and if they do, might be prepared to stop and think about it and do their own conversion.

I think it's gray, because gcc as a compiler is arguably way more valuable than a Posix porting layer for other code.
Or really, they are both useful. Some people will want one, or the other, or both, or neither.

And again I am confusing Cygwin, NT386GNU, and the backend vs. the runtime. You can very well combine the integrated backend with a Posixish runtime -- remember all the systems now have the integrated backend, you should be able to output NT386 .obj files on any platform just by saying TARGET= "NT386" in cm3.cfg. And you can combine the gcc backend with a Win32 runtime -- that is what NT386MINGNU is.

Also, while kernel32.dll treats / like \, other Win32 functions/libraries do not, such as comdlg32.dll (File.Open, File.Save), and shlwapi.dll, I think. You'd have to read the docs and/or experiment.
Win32 has a lot of nice internal consistencies, but it isn't 100% consistent. Probably nothing is.

A mix of slashes works fine on NT386.

Does some of this make sense?

Again again again, "NT386GNU" is different things to different people, and some of them are in fact independent and you can pick and choose which parts you compose. There is the runtime the compiler uses. There is the backend. There is the library the output uses: the "C library" (fopen), the threading library (pthreads vs. Win32), the windowing library (X Windows vs. Win32), the naming conventions (libfoo.a vs. foo.lib). Nearly every one of these factors is independent of the others. The config files are written as such and you should be able to experiment and make your own other combinations.
Three such combinations are provided, and there's sort of a tendency for the host and the target to be the same, but they don't have to be.

There are other factors, like which C compiler you use, which linker. The compiler, backend, and linker conspire to determine which debugger you can use. If you use the integrated backend and MS linker, you can use MS debuggers. If you use the gcc backend and GNU linker, you can use gdb. Oh, well you can always use either debugger, but the point here is whether you have symbols or not.

Currently there is some problem mixing the gcc backend with the MS linker and/or the integrated backend with the GNU linker. I think gcc backend with MS linker, and only sometimes the other. There is a symbol always injected into the .o files that the GNU linker knows about but that the MS linker gives as unresolved. The MS linker has a different name for it, but I don't believe it is always injected by the integrated backend. This is the symbol that lets you point to the start of your image, it is __ImageBase in MS. It isn't used much. But for some reason gcc always generates references to it, and gives it a different name. Therefore some of the supposedly independent factors are not as independent as they should be.

For network paths, Cygwin allows //machine/server. Current Cygwin issues warnings when you use Win32 paths, and the warning says you can quash it by setting CYGWIN=nodospathwarning or somesuch.

Does this make some sense?

At some point maybe I'll put together distributions for other combinations.

Oh, one more thing, the gcc backend can use __int64.
I was feeling guilty about taking so long adding that to the integrated backend, thus I fast-forwarded and got NT386GNU to work. I should still go back and add it though.

Also, currently the compiler assumes some linkage of these factors. It assumes OS_TYPE=POSIX means targeting Cygwin. Eh that's pretty much correct. At some point maybe UWin and SFU will be options, but not right now.
The compiler's main concern is the jumpbuf size, which is much larger for Cygwin. You can to some extent maybe link NT386 and NT386GNU together, but this jumpbuf thing could really hurt in a hard to diagnose way. (Maybe it's safe due to setjmp vs. _setjmp?)

Sorry this is too long. As Olaf said, what I say might be very true and an accurate model, but it confuses everyone.
People want very very few variables.

Sorry I have to run, no proofreading this "masterpiece". <Some self deprecating humor inserted here.>

 - Jay
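The slash handling described earlier in this message (kernel32 treating / like \ unless the path starts with \\?) can be sketched in a few lines of Python. This is a deliberately simplified model under that one stated rule, not the actual Win32 conversion, which also resolves working directories and more; the function name to_nt_separators is hypothetical.

```python
def to_nt_separators(path):
    """Model of the separator rule only: / becomes \\ unless the path
    starts with \\\\?, in which case it is passed through untouched."""
    if path.startswith("\\\\?"):
        return path
    return path.replace("/", "\\")


print(to_nt_separators("c:/foo\\bar/baz"))    # mixed slashes are normalised
print(to_nt_separators("\\\\?\\c:\\a/b"))     # left exactly as given
```

Under this model a mix of slashes is harmless for ordinary paths, while a \\? path keeps any / as a literal character.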


Date: Wed, 23 Apr 2008 06:00:43 +0200
From: dabenavidesd at yahoo.es
To: m3devel at elegosoft.com
Subject: Re: [M3devel] FW: more path stuff, sorry

Hi all:

Does pm3 treat NT386GNU paths as POSIX class? I think if one takes the definition in the interface Pathname, the NT386GNU target should not use POSIX paths directly, but Win32 ones, because of:

"A Pathname.T (or just a pathname) is a text conforming to the syntax of the underlying operating system"

in http://www.opencm3.net/doc/help/gen_html/libm3/src/os/Common/Pathname.i3.html

But if we think of "the underlying operating system" as the cygwin api, it should be the natural POSIX paths.

I agree Pathname is a common interface, but what happens with that sense of portability we want? Maybe the answer is in the cygwin api; the non-standard functions for converting one style to the other: http://cygwin.com/cygwin-api/cygwin-functions.html .
See http://www.cygwin.com/cygwin-ug-net/using.html#using-pathnames for the main notes on compatibility.

It seems clear that cygwin can use both kinds of path styles, so it's not a problem to use just one of POSIX or Win32 paths. So by defining a Pathname of POSIX style for NT386GNU, are we not giving up the more capable functions of cygwin?
I mean, if Pathname accepts both forward and back slashes in different Pathnames, a new kind of Pathname syntax should be defined. Oh, I remember seeing a mix of both kinds of slashes in the pm3 quake/config files for NT386GNU :)

Is this really confusing or unthinkable?
I think this is clear for some of you (NT386GNU should just accept POSIX paths, not even a Win32 one, and obviously not a mix of the two syntaxes in the same path), but it is not for me.
In my opinion both styles should be acceptable without defining a new kind of path syntax, converting only with the cygwin api functions. For instance, when using netbios resources but also trying to mount disk partitions in the cygwin mount table file, we need both kinds of syntax.
Is that even logical?

Thanks in advance.

Tony Hosking <hosking at cs.purdue.edu> wrote:



Jay, there is an underlying principle here that you seem to be missing so I will make it explicit.  Cross-product systems tend to acquire unmanageable complexity, especially when it comes to testing.  By making each target one personality we are able to test machine-dependent code in isolation from machine-independent code.  Often, when something breaks on one target I will test on another just to isolate the problem.  This is a very powerful approach and I am very leery of destroying any ability to do this -- it allows us to maintain the high-level portability of most Modula-3 code while isolating the small fraction of machine-/target-dependent stuff.
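To put a rough number on the cross-product concern, the Python sketch below enumerates combinations of the factors Jay lists elsewhere in this thread (backend, linker, threading library, windowing library, naming convention). Treating each factor as a free two-way choice is an assumption made purely for illustration; the thread itself notes that some factors are not fully independent and that only three combinations are actually provided.

```python
from itertools import product

# Factors and options named in this thread; modelling each as an
# independent two-way choice is an illustrative assumption.
FACTORS = {
    "backend": ["integrated", "gcc"],
    "linker": ["MS", "GNU"],
    "threads": ["Win32", "pthreads"],
    "windowing": ["Win32", "X Windows"],
    "naming": ["foo.lib", "libfoo.a"],
}


def all_configurations(factors):
    """Return every combination as a dict of factor name to chosen option."""
    names = list(factors)
    return [dict(zip(names, choice)) for choice in product(*factors.values())]


configurations = all_configurations(FACTORS)
print(len(configurations), "combinations to build and test")
```

Each added two-way factor doubles the count, which is the testing burden that one personality per target is meant to avoid.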



On Apr 22, 2008, at 12:28 AM, Jay wrote:

[truncated]...


From: jayk123 at hotmail.com
To: rcoleburn at scires.com; m3devel at elegosoft.com
Subject: RE: [M3devel] more path stuff, sorry
Date: Tue, 22 Apr 2008 04:19:52 +0000

There's no dependency, no linkage. Just a few simple string operations.
I'll probably remove it tonight.

Modula-3 has a split personality no matter what, in that it calls into very varying underlying layers, often trafficking in their specific data formats.
It's just that it can strive to aid portability between them or not, by accepting either input and massaging it to work, vs. passing it along "unchanged" (well, that's not what happens actually).

If there were no ambiguous cases and the Posix systems on Windows were consistent in their conventions, I'd be more for it. But the ambiguity and varying Posix conventions weaken the case tremendously.

 - Jay


Date: Mon, 21 Apr 2008 23:14:00 -0400
From: rcoleburn at scires.com
To: m3devel at elegosoft.com
Subject: Re: [M3devel] more path stuff, sorry

I concur wholeheartedly.  I absolutely DO NOT want native NT386 to have any knowledge of or dependency on Cygwin.
Regards,
Randy

>>> Tony Hosking <hosking at cs.purdue.edu> 4/21/2008 9:44 PM >>>


Why would native NT386 know anything at all about Cygwin?  I say just avoid split personalities like the plague.  Similarly, I'd be happy for NT386GNU (i.e., Cygwin?) to simply behave like a POSIX build (modulo native threads perhaps).

On Apr 21, 2008, at 7:15 PM, Jay wrote:

Maybe this is dubious.
The question is, like, should native NT386 cm3 accept /cygdrive/c/foo and translate to c:\foo?
Or trickier, /usr/bin/foo and translate to c:\cygwin\usr\bin\foo?
And vice versa, should NT386GNU accept c:\foo?
...
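For the easy half of that question, a string-only translation of /cygdrive/c/foo to c:\foo might look like the Python sketch below. The function name cygdrive_to_win32 is hypothetical and this is not the code that was in cm3. It deliberately gives up on /usr/bin/foo, because that case needs to know where the Cygwin root is installed, which no amount of string manipulation can supply.

```python
import re


def cygdrive_to_win32(path):
    """Translate /cygdrive/<letter>/... to <letter>:\\...; return None otherwise.

    Illustrative only: paths such as /usr/bin/foo depend on the Cygwin
    root directory and are intentionally not handled here.
    """
    match = re.fullmatch(r"/cygdrive/([A-Za-z])(/.*)?", path)
    if match is None:
        return None
    drive = match.group(1)
    rest = match.group(2) or "/"
    return drive + ":" + rest.replace("/", "\\")


print(cygdrive_to_win32("/cygdrive/c/foo"))
print(cygdrive_to_win32("/usr/bin/foo"))
```

Even this small case is ambiguous in the sense discussed in this thread, since /cygdrive/c/foo is also a valid Win32 path with a different meaning.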
