If I have a string that resolves to a file path in Windows, is there an accepted way to get a canonical form of the file name?

For example, I'd like to know whether

C:\stuff\things\etc\misc\whatever.txt

and

C:\stuff\things\etc\misc\other\..\whatever.txt

actually point to the same file or not, and store the canonical form of the path in my application.

Note that simple string comparisons won't work, nor will any RegEx magic. Remember that we have things like NTFS reparse points to deal with since Windows 2000 and the new Libraries structure in Windows 7.

+3  A: 

Using FileInfo (example in C#):

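using System;
using System.IO;

// FullName resolves "." and ".." segments lexically; it does not follow reparse points.
FileInfo info1 = new FileInfo(@"C:\stuff\things\etc\misc\whatever.txt");
FileInfo info2 = new FileInfo(@"C:\stuff\things\etc\misc\other\..\whatever.txt");
// Compare the normalized strings, ignoring case as Windows file systems usually do.
if (string.Equals(info1.FullName, info2.FullName, StringComparison.OrdinalIgnoreCase)) {
    Console.WriteLine("yep, they're equal");
}
Console.WriteLine(info1.FullName);
Console.WriteLine(info2.FullName);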

Output is:

yep, they're equal
C:\stuff\things\etc\misc\whatever.txt
C:\stuff\things\etc\misc\whatever.txt

jheddings
I'm not sure that Equals check will be useful -- as far as I can tell FileInfo doesn't override Equals so this will just give you reference equality, not file path equivalence. Thus, in your example, info1.Equals(info2) returns false.
itowlson
@itowlson: yeah, I noticed that too after I read the docs... I changed my answer and actually tested it this time :)
jheddings
It's not a complete solution, as @Roger Lipscombe pointed out in his answer. We might have to invoke the 80/20 rule on this one.
dthrasher
+1  A: 

jheddings has a nice answer, but since you didn't indicate which language you are using, I thought I'd give a Python way to do it that also works from the command line, using os.path.abspath:

> python -c "import os.path; print(os.path.abspath(r'C:\stuff\things\etc\misc\other\..\whatever.txt'))"
C:\stuff\things\etc\misc\whatever.txt
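
One caveat, as a rough sketch rather than a full solution: os.path.abspath only rewrites the string. It collapses "." and ".." segments without touching the disk, so it knows nothing about junctions, SUBST drives or other aliases. The snippet below shows that purely lexical comparison using ntpath (the Windows flavour of os.path, importable on any platform). The helper name lexical_key is made up for illustration.

```python
import ntpath  # Windows flavour of os.path; works on any platform


def lexical_key(path):
    """Collapse '.' and '..' segments and fold case, as pure string operations.

    Nothing here touches the file system, so reparse points, SUBST drives
    and UNC aliases are NOT resolved.
    """
    return ntpath.normcase(ntpath.normpath(path))


a = r"C:\stuff\things\etc\misc\whatever.txt"
b = r"C:\stuff\things\etc\misc\other\..\whatever.txt"

# True for the two paths from the question
print(lexical_key(a) == lexical_key(b))
```

If "other" were a junction pointing somewhere else, the two paths could name different files even though their lexical keys match, so treat this as the cheap 80% case only.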
John Paulett
+3  A: 

Short answer: not really.

There is no simple way to get the canonical name of a file on Windows. Local files can be reachable via reparse points or via SUBST. Do you want to deal with NTFS junctions? Windows shortcuts? What about \\?\-escaped filenames?

Remote files can be available via mapped drive letter or via UNC. Is that the UNC to the origin server? Are you using DFS? Is the server using reparse points, etc.? Is the server available by more than one name? What about the IP address? Does it have more than one IP address?

So, if you're looking for something like the inode number on Windows, it ain't there. See, for example, this page.

Roger Lipscombe
+1 for the link to the nice blog post 'PathCanonicalize Versus What It Says On The Tin'
sean e
Wow. What a mess. Why is nothing easy in Windowsland?
dthrasher
+1  A: 

Roger is correct, there is no simple way. If the volume supports a unique file index, you can open the file and call GetFileInformationByHandle, but this will not work on all volumes.

The Windows API call GetFullPathName may be the best simple approach.
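
For what it's worth, here is a minimal sketch of the file-index idea in Python rather than raw Win32. It assumes Python 3.5 or later, where os.stat on Windows reports the volume serial number as st_dev and the file index as st_ino. The index is only meaningful together with the volume, so the pair is compared as a unit. The helper names file_identity and same_file are made up for illustration, and the files must exist and be accessible.

```python
import os


def file_identity(path):
    """Return (volume id, file index) for an existing file.

    On Windows these come from the volume serial number and file index;
    on volumes that do not provide a usable index the result is not reliable.
    """
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def same_file(path_a, path_b):
    """True if both paths resolve to the same file on the same volume."""
    return file_identity(path_a) == file_identity(path_b)


if __name__ == "__main__":
    import sys
    # Usage: python same_file.py <path1> <path2>
    if len(sys.argv) == 3:
        print(same_file(sys.argv[1], sys.argv[2]))
```

The standard library's os.path.samefile does essentially the same comparison, so in practice that is probably the call to reach for. Note that this answers "are these the same file?" but does not give you a canonical path string to store.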

Stephen Nutt
+1... also the file index is only unique w.r.t. the volume, so you need the volume serial number as well, assuming that it's available remotely.
Roger Lipscombe
Good point of clarification, thanks Roger.
Stephen Nutt