<html>
<head>
<style>
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
FONT-SIZE: 10pt;
FONT-FAMILY:Tahoma
}
</style>
</head>
<body class='hmmessage'>some vagaries of Win32 path semantics (and some Mac) <BR>
<BR>You can learn about this stuff by looking at the NT namespace with winobj.<BR>And/or watching calls to NtCreateFile in a debugger.<BR>And/or with filemon.<BR>And/or reading the documentation for driver writers.<BR>And/or various documentation about the NT kernel interface.<BR>And/or experimenting with cmd (assuming cmd isn't doing the weird stuff).<BR>
<BR>This is about kernel32.dll for now.<BR>
<BR>When using 8 bit characters, the length limit is 260 characters (MAX_PATH).<BR>As far as I can tell from the documentation, that count includes the terminating zero.<BR>
<BR>When using 16 bit characters, the "default" limit is also 260 characters.<BR>
<BR> MS-DOS limit is 64 or maybe 128 characters, so this is progress. (!) <BR>
Unix limit I think is usually around 1024. That's still pretty lame imho,<BR> just a little less lame. The actual Windows limit is 32K which seems pretty ok to me,<BR> though 16bit limits are surprising.<BR>
<BR>The limit on INDIVIDUAL PATH ELEMENTS, as dictated by FindFirstFile/FindNextFile<BR>is also 260 characters. But again, for individual path elements.<BR>I haven't tried exceeding that. The creation paths don't clearly have<BR>this limit. It is worth experimenting with.<BR>
<BR>Ignoring Windows 9x, everything is built on top of the NT kernel.<BR>Most interesting here is NtCreateFile.<BR>
<BR>At the kernel level, "relative opens" are allowed.<BR>All the various "name" or "path" based functions don't just take a string,<BR>they take an OBJECT_ATTRIBUTES.<BR>This is mainly flags, an optional parent handle, and a unicode string.<BR>The length of the unicode string is stored in an unsigned short<BR>representing a number of bytes. So the limit is around 32,767 characters<BR>(65,535 bytes at two bytes per character).<BR>One of the flags controls case sensitivity, for example, at least somewhat.<BR>I don't know what happens if you try to be case sensitive on FAT, for example.<BR>
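<BR>The length arithmetic, as a tiny Python sketch. The constant names are made up for illustration, and it assumes every character is one 16 bit UTF-16 unit.<BR>
<pre>
```python
USHORT_MAX = 0xFFFF      # largest byte count an unsigned 16 bit field can hold
BYTES_PER_CHAR = 2       # assumption: one UTF-16 code unit per character

# Whole characters that fit in the byte count.
max_chars = USHORT_MAX // BYTES_PER_CHAR
print(max_chars)  # prints 32767
```
</pre>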
<BR>NTFS allows volumes to be mounted in empty directories.<BR>I assume such a volume could be FAT, so even if c:\ is NTFS and capable<BR>of case sensitivity, c:\foo might not be.<BR>
<BR>If you are doing a non-relative open at the NT level, there is no working directory<BR>or relative paths or such. There is basically just full paths.<BR>They don't look quite exactly like anything else.<BR>They look like<BR> \??\c:\windows\system32\kernel32.dll <BR>or<BR> \??\unc\machine\share\foo\bar.txt <BR>
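<BR>A minimal sketch in Python of how the two shapes above relate to ordinary full paths. to_nt_path is a made-up name, and it assumes the input is already a full drive-letter or \\machine\share path; it is not what kernel32.dll actually runs.<BR>
<pre>
```python
def to_nt_path(full_path):
    # Assumption: full_path is a plain full path, not relative, no \\?\ prefix.
    if full_path.startswith("\\\\"):
        # \\machine\share\... becomes \??\unc\machine\share\...
        return "\\??\\unc\\" + full_path[2:]
    # c:\... becomes \??\c:\...
    return "\\??\\" + full_path


print(to_nt_path("c:\\windows\\system32\\kernel32.dll"))
print(to_nt_path("\\\\machine\\share\\foo\\bar.txt"))
```
</pre>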
<BR>\?? before around NT4 was named \DosDevices.<BR>I suspect \?? was an optimization -- making the common case a shorter string.<BR>Seems lame but oh well.<BR>You can see in driver stuff about setting up symbolic links related to \DosDevices.<BR>
<BR>Yes, NT has symbolic links, in the kernel namespace.<BR>
<BR>At the kernel32.dll level, the documentation clearly exposes something very related.<BR>To open paths longer than 260 characters, they say to use the prefixes:<BR> \\?<BR> or <A href="file://\\?\unc">\\?\unc</A><BR>
<BR>Last I checked, the documentation was a little unclear.<BR>What they mean is, to form paths like:<BR>
<BR> <A href="file://\\?\c:\windows\system32\kernel32.dll">\\?\c:\windows\system32\kernel32.dll</A><BR> <A href="file://\\?\unc\machine\share\foo\bar.txt">\\?\unc\machine\share\foo\bar.txt</A> <BR>
<BR>They don't say so, but quick attempts otherwise clearly show that<BR>the paths must be full paths. Not relative to any "current working directory".<BR>
<BR>An implementation trick should be evident.<BR>Just change the second character from \ to ? and you get an NT path.<BR>
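<BR>That trick, sketched in Python. long_path_to_nt is a made-up name and I have not checked that kernel32.dll really does it this way; it only shows the one-character swap.<BR>
<pre>
```python
def long_path_to_nt(path):
    # Only meant for paths that already carry the \\?\ prefix.
    if not path.startswith("\\\\?\\"):
        raise ValueError("expected a path starting with \\\\?\\")
    # Replace the second character, turning \\?\ into \??\
    return path[0] + "?" + path[2:]


print(long_path_to_nt("\\\\?\\c:\\windows\\system32\\kernel32.dll"))
print(long_path_to_nt("\\\\?\\unc\\machine\\share\\foo\\bar.txt"))
```
</pre>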
<BR>The documentation says \\? "turns off path parsing".<BR>
<BR>Usually all file paths undergo some amount of canonicalization.<BR>Forward slashes are changed to backward slashes.<BR>Runs of backward slashes are changed to a single slash.<BR>Spaces might be removed in some places?<BR>Trailing dots also?<BR>That is what \\? "turns off".<BR>
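<BR>A rough model of that clean-up in Python. It assumes a full drive-letter path, and it assumes only the last path element loses its trailing spaces and dots, which is what the mkdir experiments below suggest but is a guess on my part. rough_canonicalize is a made-up name, not the real kernel32.dll algorithm.<BR>
<pre>
```python
def rough_canonicalize(path):
    # Paths starting with \\?\ are passed through untouched.
    if path.startswith("\\\\?\\"):
        return path
    # Forward slashes become backward slashes.
    path = path.replace("/", "\\")
    # Assumption: the path starts with a drive letter and a colon.
    drive, rest = path[:2], path[2:]
    parts = []
    for part in rest.split("\\"):
        if part in ("", "."):
            continue          # empty parts come from runs of slashes
        if part == "..":
            if parts:
                parts.pop()   # step back up one element
            continue
        parts.append(part)
    if parts:
        # Guess: trailing spaces and dots vanish from the last element only.
        stripped = parts[-1].rstrip(" .")
        if stripped:
            parts[-1] = stripped
    return drive + "\\" + "\\".join(parts)


print(rough_canonicalize("c:\\foo/"))
print(rough_canonicalize("c:\\\\foo\\..\\bar. "))
print(rough_canonicalize("\\\\?\\c:\\foo."))
```
</pre>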
<BR>You can see this if you make some CreateFile calls and watch the resulting NtCreateFile.<BR>I should put together a demo. Using some hack to intercept the NtCreateFile call.<BR>
<BR>Nearly everything is demoable from the command line.<BR>Try this:<BR>
<BR> C:\> mkdir "foo" <BR> C:\> mkdir "foo " <BR> A subdirectory or file foo already exists. <BR>
<BR> Huh? <BR>
<BR> C:\>mkdir foo.<BR> A subdirectory or file foo. already exists. <BR>
<BR> Huh? <BR>
<BR> C:\>mkdir " foo"<BR>
<BR>ok.<BR>
<BR>C:\>mkdir "<A href="file://\\?\c:\foo">\\?\c:\foo</A>"<BR>C:\>mkdir "<A href="file://\\?\c:\foo">\\?\c:\foo</A> "<BR>
=> works <BR>
rmdir "<A href="file://\\?\c:\foo">\\?\c:\foo</A> "<BR>
<BR>C:\>mkdir "<A href="file://\\?\c:\foo">\\?\c:\foo</A>."<BR>
<BR>C:\>dir fo*<BR>
<BR> Volume in drive C has no label.<BR> Volume Serial Number is A803-BC73<BR>
<BR> Directory of C:\<BR>
<BR>02/21/2008 11:07 PM &lt;DIR&gt; foo<BR>02/21/2008 11:07 PM &lt;DIR&gt; foo.<BR> 0 File(s) 0 bytes<BR> 2 Dir(s) 26,666,397,696 bytes free<BR>
<BR>C:\>rmdir foo.<BR>The system cannot find the file specified.<BR>
<BR>Huh?<BR>
<BR>C:\>dir fo*<BR> Volume in drive C has no label.<BR> Volume Serial Number is A803-BC73<BR>
Directory of C:\<BR>
02/21/2008 11:07 PM &lt;DIR&gt; foo<BR>02/21/2008 11:07 PM &lt;DIR&gt; foo.<BR> 0 File(s) 0 bytes<BR> 2 Dir(s) 26,666,397,696 bytes free<BR>
C:\>mkdir bar. <BR> C:\>rmdir bar. <BR>
C:\>mkdir bar. <BR> C:\>rmdir bar <BR>
huh?<BR>
mkdir "foo \bar" <BR> dir foo&lt;tab&gt;<BR> expands to "foo " because tab found it<BR> but then enter<BR>and<BR>
File Not Found <BR>
Huh?<BR>
<BR>
C:\>mkdir "<A href="file://\\?\c:\foo/">\\?\c:\foo/</A>" <BR> The filename, directory name, or volume label syntax is incorrect. <BR> => forward slash not liked<BR>
<BR> C:\>mkdir "c:\foo/" <BR> => no error <BR>
<BR> C:\>mkdir "<A href="file://\\?\c:\foo\..\bar">\\?\c:\foo\..\bar</A>"<BR> The filename, directory name, or volume label syntax is incorrect. <BR> => .. apparently not liked<BR> I tried . and that did work. <BR>
<BR>
C:\>mkdir c:\foo\..\bar <BR> => ok <BR>
<BR> So now I ask -- what is the portable interface and implementation? <BR>
<BR> Most code uses CreateFile and doesn't use \\?. <BR> So most code is limited to MAX_PATH and has problems with spaces and dots in some places. <BR>
<BR> These features are all laudable -- allow paths with more than 260 characters and with spaces<BR> and dots in more places, if the programmer or user really wants, but this \\? vs. non-\\? difference in behavior is strange.<BR>
<BR>
Trailing spaces in paths tend to be "invisible" in any user interface.<BR>
(I wonder about tabs too, vertical space, beep, etc.)<BR>
<BR> Note that not everything goes through Win32 usermode CreateFile/kernel32.dll. <BR> It is not the one and only path to the file system, but it is overwhelmingly common.<BR> Imagine going over the network from a non-Windows client (e.g. Samba) <BR> And maybe Services for Unix.<BR>
<BR> shlwapi.dll and even shell32.dll also have a bunch of path/file utility functions.<BR> I've hardly ever used them. I think I tried forward slashes with shlwapi.dll once and no go.<BR>
<BR>In an ntsd/cdb/windbg, try a breakpoint like:<BR>
bp ntdll!NtCreateFile "!obja poi(@esp+c);g" <BR>
<BR>That will trace the paths to NtCreateFile. It is a low tech filemon.<BR>esp is the stack pointer and the object attributes is the third parameter (4*3=c),<BR>and !obja prints object attributes.<BR>
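<BR>The offset arithmetic, as a Python sketch. It assumes 32 bit x86 with the stdcall convention, where on entry to the function the return address sits at esp+0 and each parameter takes 4 bytes. param_offset is a made-up helper.<BR>
<pre>
```python
def param_offset(n):
    # n is 1-based; the return address occupies esp+0 on entry,
    # so the first parameter is at esp+4, the second at esp+8, and so on.
    return 4 * n


# The object attributes are the third parameter to NtCreateFile.
print(hex(param_offset(3)))  # prints 0xc
```
</pre>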
<BR>I got bored writing this and maybe didn't finish covering everything, sorry.<BR>The thing to do is experiment and see what all changes between CreateFile and NtCreateFile.<BR>e.g. paths relative to the current working directory.<BR>
<BR>Oh, also, there is a relative working directory per volume on Windows.<BR>There are special environment variables used to store them.<BR>Something like the variable =c: has c:'s working directory.<BR>
<BR>
C:\>echo %=c:% <BR> C:\ <BR>
C:\>cd foo <BR> <BR>
C:\foo>echo %=c:% <BR> C:\foo <BR>
<BR> I only have one volume, so let's make another:<BR>
<BR> C:\foo>subst d: c:\<BR>
d:<BR> cd bar <BR>
D:\bar>echo %=d:% <BR> D:\bar <BR> <BR> <BR> c:<BR> brings me back to c:\foo <BR> cd d:\windows <BR> changes the working directory of D:, but doesn't bring me there<BR> d:<BR> now I am on the d: drive<BR>
<BR> You can use /d to cd to change drive and directory at the same time.<BR>
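<BR>A toy model of the per-drive working directories in Python, just to make the session above concrete. ToyShell and its methods are made up; it only handles full paths with a drive letter and is not how cmd is actually implemented.<BR>
<pre>
```python
class ToyShell:
    # One working directory per drive, stored under names like "=C:".
    def __init__(self):
        self.env = {"=C:": "C:\\"}
        self.drive = "C:"

    def cd(self, path, change_drive=False):
        # Assumption: path is a full path such as d:\windows.
        drive = path[:2].upper()
        self.env["=" + drive] = drive + path[2:]
        if change_drive:
            self.drive = drive   # models cd /d

    def switch(self, drive):
        # Models typing "d:" at the prompt.
        self.drive = drive.upper()
        self.env.setdefault("=" + self.drive, self.drive + "\\")

    def cwd(self):
        return self.env["=" + self.drive]


sh = ToyShell()
sh.cd("c:\\foo")
sh.switch("d:")
sh.cd("d:\\bar")
sh.switch("c:")            # back on c:, still in C:\foo
sh.cd("d:\\windows")       # changes d:'s directory, does not move us there
print(sh.cwd())
sh.switch("d:")
print(sh.cwd())
```
</pre>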
<BR>Cygwin also uses NtCreateFile sometimes. I haven't yet looked at why.<BR>
<BR> NTFS has hardlinks for files, not directories (avoid cycles and I guess adequate for strict Posix?). <BR> NTFS on Vista has symlinks I guess for files and directories, not sure. <BR> Windows 2000 added a CreateHardLink function. <BR> XP has "fsutil hardlink create newlink existingfile" <BR> Vista has that and "mklink". <BR>
<BR>Also, on my Mac I can create filenames with forward slashes in them.<BR>I have some old files from <A href="http://www.apple.com">www.apple.com</A> named like "C/C++ compiler reference".<BR>At the command line I think the forward slashes show as colons.<BR>Historically on the Mac the colon is path separator, but the "syntax"<BR>is different than Posix and Windows. Pathname.i3 documents it.<BR>
<BR>If you read the Mac OS X overviews, you can see Mac OS X is a tremendous mish-mash<BR>of similar redundant interfaces and implementations.<BR>There are many sets of functions for doing the same thing.<BR>There are older functions for example that take 8 bit characters, with limits of<BR>255, though usually I believe you do directory relative opens.<BR>
There are newer APIs that use Unicode and are more opaque.<BR>
<BR>You can have two volumes with the same name<BR>so:<BR> :hard disk:foo:bar <BR>
<BR>
(at least on Mac OS Classic, not sure about X)<BR>can actually name any number of files on Mac, depending on how many disks are named "hard disk".<BR>
<BR>I think what you have to do is enumerate volumes, get their "reference numbers" and do relative<BR>opens from there.<BR>
<BR>So much for string equality meaning two paths reference the same file.<BR>Now, granted, if you do two open calls with the same path, I assume it always gets the same one.<BR>However, what if in the meantime, more volumes come online?<BR>Maybe earlier mounted is earlier opened and consistent?<BR>
<BR>
Big mess.<BR>
<BR>(GS/OS on the long dead Apple IIGS also has this forward slash or colon behavior,<BR>and I think instead of one current working directory, you could assign a bunch of<BR>them to numbers. I think file names couldn't start with numbers, or maybe that was ambiguous.)<BR>
<BR> - Jay<BR><BR></body>
</html>