[M3devel] more path stuff, sorry

Jay jayk123 at hotmail.com
Tue Apr 22 06:16:44 CEST 2008


Randy we are kind of saying the same thing.
Imagine, if you will, code out there that assumes all "full paths" start with a forward slash.
There is surely a lot of this code.
 
Imagine if you will code out there that assumes that all "full paths" start with either two slashes or a letter, colon, slash.
There is a lot of this code too.
(I think btw that Windows CE paths all start with one leading slash, and I suspect it must be a backward one, but I don't have a CE Phone yet, maybe soon, ARM_CE? :) ARM_WINCE?)
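
To make the two assumptions concrete, here is a minimal sketch in Python. The function names are hypothetical and nothing here is taken from cm3; each function simply encodes one of the two "full path" assumptions described above, so you can see where they disagree.

```python
import re

# Assumption 2 helper: letter, colon, then a forward or backward slash.
_DRIVE_PREFIX = re.compile(r"^[A-Za-z]:[\\/]")


def looks_full_posix(path: str) -> bool:
    """Assumption 1: every full path starts with a forward slash."""
    return path.startswith("/")


def looks_full_win32(path: str) -> bool:
    """Assumption 2: full paths start with two slashes or letter, colon, slash."""
    if len(path) >= 2 and path[0] in "\\/" and path[1] in "\\/":
        return True
    return bool(_DRIVE_PREFIX.match(path))


if __name__ == "__main__":
    samples = ["/usr/bin/foo", "c:\\foo", "c:/foo",
               "\\\\server\\share\\x", "\\foo", "foo/bar"]
    for p in samples:
        print(f"{p!r:26} posix={looks_full_posix(p)} win32={looks_full_win32(p)}")
```

Code written to the first assumption rejects c:\foo, and code written to the second rejects /usr/bin/foo, which is the mismatch that shows up when such code feeds paths to the "other" cm3.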
 
Now imagine that this code is being used in some build automation and feeding the paths it forms to cm3.
There is actually probably not much of this, but it is "reasonable".
 
The idea then would be for such code to be "portable" to Posix and/or Win32, without having to adopt "some abstraction".
 
You know...there's some tension in programming, between inventing abstractions and layers to bridge different underlying implementations, ahead of time or in all paths, vs. inserting new layers in some paths to make them "look like" preexisting paths. "Emulation layers" if you will. The advantage of "emulation" is that at least some variants run "native" and skip the "emulation". That is, like, survey the landscape of existing implementations, pick one that is reasonable and popular and easily emulated, write all your code to it, and add an emulation layer for the other cases. The trick of course is that it isn't always possible. Sometimes the job of the "abstraction layer" includes narrowing the underlying interfaces/feature set to the intersection, often referred to as "least common denominator". And then some "abstraction layers", to "fix" this problem, let you "break through" to the underlying system. Allowing such "break through" is obviously good and bad in multiple ways. The ability to break through takes away the ability of the layer to be stateful. Stateful layers are also to be avoided in the first place if performance is a concern.
 
Now, in reality, paths are not a great example of this, because paths don't have all that many "features", so an abstraction boundary isn't likely to be lossy.
 
Besides features, there are issues of "capacity". For example, imagine I have a system that allows files larger than 4 gig and a system that does not. Imagine the portability layer lets through file sizes larger than 4 gig. While this enables the better system to have a larger capacity, it also enables a situation where users on one system can write files that users on another cannot read. Sometimes people suggest warnings for these kinds of things, like when you save a Word document in an older format and some formatting may or may not be lost.
 
Another problem with abstraction boundaries is that you inevitably create yet another way of doing things, in the service of papering over that there is already more than one way. The cure is similar to the disease, sort of.
You know, there is some set of programmers who can read XWindows code, some set that can read MSWindows, and the set that can read Trestle is bound to be smaller. Of course, this varies. Like, probably more people are familiar now with MFC, Qt, GTK than XOpenDisplay or CreateWindow. Sometimes the underlying systems are too hard to use, too unfeatureful, and the layers built on them are not just for portability but also ease of use, and sometimes add a bunch of value.
 
In the end, I think I'll remove the code. But I'm not crazy. Portability can be served by catering to existing practises rather than inventing new ones.
 
But granted it doesn't always work well. The Posix systems for Windows all seem to stink, and Wine seems to really stink.
 
Btw, an interesting point I think is how cm3cg manages to not care -- its input and output are always in the current working directory. It doesn't create any threads, no GUI. While the "build system", sh, nmake, sed, awk, might have heavier requirements on their runtime (I don't know, haven't looked, not sure if Cygwin was for them or the larger goal), m3cg is light. Gcc is somewhere in between since it does look up paths "around the system" and run sub processes. But there still shouldn't be much OS-dependent code, vs. all the work of parsing C, codegen, optimization.
 
I must point out that the landscape is strewn with abstraction boundaries that work a lot and fail a lot.
File systems are a huge example here. Some file systems allow files over 2 or 4 gig, some do not. Some perform at the much greater speed of nearby mechanically spinning disks, some at the usually slow rate of a network, some write at the very very slow speed of flash, some (flash) cannot stand too many writes. Some have readonly/hidden/system/etc. bits, some do not. Sometimes people stash the extra data in one form or another, sometimes it tends to get lost (Mac resource fork), sometimes not. I vaguely recall that Mac OS X implements hardlinks on HFS+ in a pretty convoluted way.
 
File systems, thinking about paths, are also a big problem in terms of preservation of file names across file systems.
At one extreme is MS-DOS 8.3 and some 14 character Unix systems.
At another extreme is 255 character Unicode names.
Files created on one system cannot necessarily be moved to another.
Yet the abstraction boundary to the file system, fopen, or whatever Modula-3 has, doesn't somehow deal with this. You know, you could do something wacky like put everything in a .zip file. But then you lose interoperability when the data would have been ok. Case sensitivity...how many folks think "Unix" is case sensitive and "Win32" is not? And yet, that is not how it works. It depends on the file system. How much code in the world is not prepared for symlinks or hard links? A lot.
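
As a rough illustration of the file name problem, here is a minimal Python sketch that checks a name against the limits mentioned above and finds names that would clash on a case-insensitive file system. All function names are hypothetical, the 8.3 rule is simplified (real MS-DOS also restricts which characters are legal), and str.casefold() only approximates what any particular file system actually does.

```python
import re

# Simplified 8.3 rule: 1-8 characters, then optionally a dot and 1-3 characters.
_DOS_8_3 = re.compile(r"^[^.\\/:]{1,8}(\.[^.\\/:]{1,3})?$")


def fits_8_3(name: str) -> bool:
    """True if the name has the MS-DOS 8.3 shape (character set not checked)."""
    return bool(_DOS_8_3.match(name))


def fits_14(name: str) -> bool:
    """True if the name fits the 14 character limit of some old Unix systems."""
    return len(name) <= 14


def fits_255(name: str) -> bool:
    """True if the name is at most 255 characters.

    Assumption: the limit counts characters; some file systems count bytes
    or UTF-16 units instead.
    """
    return len(name) <= 255


def case_collisions(names):
    """Group names that differ only by case (approximated with casefold)."""
    groups = {}
    for name in names:
        groups.setdefault(name.casefold(), []).append(name)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    for n in ["README.TXT", "m3makefile", "Makefile.common"]:
        print(n, fits_8_3(n), fits_14(n), fits_255(n))
    print(case_collisions(["Makefile", "makefile", "README"]))
```

A check like this can only warn; it cannot tell you what the destination file system really is, which is the point about it depending on the file system rather than on "Win32" or "Unix".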
 
We muddle along.
 
 - Jay


Date: Mon, 21 Apr 2008 23:09:39 -0400
From: rcoleburn at scires.com
To: m3devel at elegosoft.com
Subject: Re: [M3devel] more path stuff, sorry

I think I've made my opinion on the path issue known, but just so there is no doubt, I do not want the underlying code trying to translate paths from one format to another.  I think this is a recipe for problems since different OSes represent paths, switches, etc. differently.  Programmers should use the standard interfaces to construct paths appropriate for the current host operating system.  Part of the beauty of Modula-3 is write once, run everywhere.  I have code that uses pathnames that runs on multiple platforms without the need to make any source code modifications.
Regards,
Randy

>>> Jay <jayk123 at hotmail.com> 4/21/2008 7:15 PM >>>
Maybe this is dubious.
The question is, like, should native NT386 cm3 accept /cygdrive/c/foo and translate to c:\foo?
Or trickier, /usr/bin/foo and translate to c:\cygwin\usr\bin\foo?
And vice versa, should NT386GNU accept c:\foo?

The translation is not simple in general.
/ maps to the Cygwin install root.
There can be symlinks.
But many common cases can be handled with little, simple code.
It's already in.
Ultimately the way to do this correctly is to link to cygwin1.dll, which is only done sometimes, and which is probably license-undesirable when not done.

As well, a path like /foo is actually ambiguous.
It is a valid native Win32 path, equivalent to \foo.
Or it could be a Cygwin path equivalent to c:\cygwin\foo.

While it is nice to keep cm3 simple, it is also nice to have a more uniform interface across hosts, I think. Maybe.

To some extent a user can do this himself.
That is, NTFS junctions and/or Cygwin symlinks can cause there to be identical paths with identical meanings:
e.g. for me, symlink /msdev => c:\msdev, /cm3 => c:\cm3, /dev => c:\dev2

In some places Cygwin open et al. accept Win32 paths, but lately taking advantage of that issues a warning.
In some places, I think, it isn't sufficient to have Cygwin handle it.

The harder question then is, if I feed Cygwin paths of a particular form, should it try to report back paths in the same form?
I put some code in M3Path.m3 "like this", but it may not be advisable.

I also had an NTFS junction on my system so Win32 /cygdrive/c => c:\, but this causes a circularity in the file system, which I'd rather not have.

As well, there are at least three or four Posix runtimes for Windows, and they each map differently.
UWin I think uses /c <=> c:\.
Cygwin /cygdrive/c <=> c:\
Interix/ServicesForUnix/SUA I think something via /dev.
MinGWin also has a runtime that I think uses the UWin mapping, but this runtime is only meant for sh/gcc to use, not user apps.
Having special code that only knows the Cygwin convention is questionable.

I was often setting up some environment variables one way or the other, and then running one cm3 or the other, without thinking about which form the variables were in.
Indeed, it might be nice to set CM3_ROOT and CM3_INSTALL "once and for all", while still switching between different forms of cm3.exe?
Though granted, CM3_INSTALL cm3.exe can usually figure out from its own path, and CM3_ROOT could usually be figured out by looking at the CVS directories in the current working directory, not always, but often.

Basically, instead of setting these variables one way or the other, I'd like to set them always one way, or even not at all.
Hm, in fact, I think cm3 could figure out all overrides itself?
You know, if there is a shipped package store, it can use that to determine the source <=> pkg mapping.
And it can figure out CM3_ROOT by looking at the current directory and on up until the CVS/Root changes?
As well, you know, the source <=> pkg mapping could be a simple generated checked in file.
You know, like the PKGS file, but stripped down to only what is needed?

Maybe just remove the code I put in and forget the whole thing.
Maybe most people only ever stay in one of the worlds and "translation" isn't important?
Just me stuck with providing support for both and therefore living with both more?

 - Jay
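
For reference, the "common cases" of the mapping discussed in the quoted message can be sketched in a few lines of Python. This is not the code from M3Path.m3; the function names are hypothetical, and it knows only the Cygwin /cygdrive convention. It deliberately returns None for paths like /usr/bin/foo (which needs the Cygwin install root) and /foo (which is ambiguous), and it ignores symlinks and junctions entirely.

```python
import re

# /cygdrive/<letter> optionally followed by /rest
_CYGDRIVE = re.compile(r"^/cygdrive/([A-Za-z])(/.*)?$")
# <letter>: followed by a forward or backward slash, then the rest
_WIN32_DRIVE = re.compile(r"^([A-Za-z]):[\\/](.*)$")


def cygwin_to_win32(path: str):
    """Map /cygdrive/c/foo to c:\\foo; return None for anything else."""
    m = _CYGDRIVE.match(path)
    if m is None:
        return None  # e.g. /usr/bin/foo or /foo: not handled by this sketch
    drive = m.group(1)
    rest = m.group(2) or "/"
    return drive + ":" + rest.replace("/", "\\")


def win32_to_cygwin(path: str):
    """Map c:\\foo (or c:/foo) to /cygdrive/c/foo; return None for anything else."""
    m = _WIN32_DRIVE.match(path)
    if m is None:
        return None  # e.g. \foo, relative paths, or drive-relative c:foo
    drive = m.group(1).lower()
    rest = m.group(2).replace("\\", "/")
    return "/cygdrive/" + drive + ("/" + rest if rest else "")


if __name__ == "__main__":
    for p in ["/cygdrive/c/foo", "/usr/bin/foo", "/foo"]:
        print(p, "->", cygwin_to_win32(p))
    for p in ["c:\\foo", "C:/dev2/x", "\\foo"]:
        print(p, "->", win32_to_cygwin(p))
```

Returning None rather than guessing keeps the ambiguous cases visible to the caller, which seems the safer choice given that UWin, Interix and MinGWin each map differently.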