If I have two different file paths, how can I determine whether they point to the same file?

This could be the case if, for instance, the user has a network drive attached that points to some network resource. For example, drive S: mapped to \\servercrm\SomeFolder.

Then these paths actually point to the same file:

S:\somefile.dat

And

\\servercrm\SomeFolder\somefile.dat

How can I detect this? I need to code it so that it works in every scenario where different paths can point to the same file.

+1  A: 

At the very least you could take and compare the MD5 hashes of the combined file contents, file name, and metadata such as CreationTime, LastAccessTime, and LastWriteTime.
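A minimal sketch of this idea, in Python for brevity rather than C#. The helper names (`metadata_key`, `content_md5`, `probably_same_file`) are hypothetical. It deviates slightly from the description above: the name and timestamps are compared directly before anything is hashed, so the expensive read only happens when the cheap check passes, and both files are inspected before either is read because reading can update the last access time. Note that `st_ctime_ns` is the creation time on Windows but the inode change time on Unix.

```python
import hashlib
import os


def metadata_key(path):
    """Cheap fingerprint: file name plus the three timestamps."""
    info = os.stat(path)
    # st_ctime_ns: creation time on Windows, inode change time on Unix
    return (os.path.basename(path), info.st_ctime_ns,
            info.st_atime_ns, info.st_mtime_ns)


def content_md5(path, chunk_size=1 << 20):
    """MD5 of the file contents, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def probably_same_file(path_a, path_b):
    """True if name, timestamps, and content hash all match."""
    # Compare metadata first: it is cheap, and it must be sampled for
    # both paths before reading, since a read may change the access time.
    if metadata_key(path_a) != metadata_key(path_b):
        return False
    return content_md5(path_a) == content_md5(path_b)
```

A match here only means the two paths very likely lead to identical content with identical metadata. It cannot distinguish one file from a copy that preserved the name and timestamps.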

Robert Venables
I thought of this, but generating the MD5 hash will take too long for my application, since I need to read the file over the network.
driis
That would only tell you that they most likely have the same content, not that they are the *same* file. It could be a copy of the file, as one scenario that would fail with this.
Erich Mirabal
Erich: you are mostly correct. However, how many files do you have lying around on your hard drive with the same name, MD5 hash, AND CreationTime, LastAccessTime, and LastWriteTime?
Robert Venables
+2  A: 

I don't know if there is an easy way to do this directly in C# but you could do an unmanaged call to GetFileInformationByHandle (pinvoke page here) which will return a BY_HANDLE_FILE_INFORMATION structure. This contains three fields which can be combined to uniquely ID a file:

dwVolumeSerialNumber: The serial number of the volume that contains a file.

...

nFileIndexHigh: The high-order part of a unique identifier that is associated with a file.

nFileIndexLow: The low-order part of a unique identifier that is associated with a file.

The identifier (low and high parts) and the volume serial number uniquely identify a file on a single computer. To determine whether two open handles represent the same file, combine the identifier and the volume serial number for each file and compare them.

Note though that this only works if both handles are opened from the same machine.
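To illustrate the comparison without the P/Invoke plumbing, here is a minimal sketch in Python rather than C#. It relies on `os.stat`, whose `st_ino` field is documented as the file index on Windows and the inode number on Unix. Treating `st_dev` as the counterpart of dwVolumeSerialNumber is an assumption about how the runtime fills that field on Windows. The helper names `file_identity` and `is_same_file` are hypothetical.

```python
import os


def file_identity(path):
    """Return (device/volume id, file index) for the file path resolves to."""
    info = os.stat(path)
    # st_dev plays the role of the volume serial number,
    # st_ino the role of the combined high/low file index.
    return (info.st_dev, info.st_ino)


def is_same_file(path_a, path_b):
    """True if both paths resolve to the same file on the same volume."""
    return file_identity(path_a) == file_identity(path_b)
```

Unlike a content hash, this reads no file data, and a byte-for-byte copy gets a different identity. The standard library's `os.path.samefile` performs the same device and index comparison, so in practice it can be called directly. The caveat about mapped drives and UNC paths reporting different volume identifiers still applies.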


Edited to add:

As per this question, this may not work for your situation, since the dwVolumeSerialNumber may differ if the share definitions are different. I'd try it out first though, since I always thought that the volume serial number was drive-specific, not path-specific. I've never needed to actually prove this, so I could be (and probably am) wrong.

Martin Harris
Thanks, this seems like it could do what I need. I will look into that.
driis
A: 

If you're only worried about local files, then you can use the combination of GetFileInformationByHandle and the BY_HANDLE_FILE_INFORMATION structure. Lucian did an excellent blog post on this subject here. The code is in VB.Net, but it should be easy to convert to C#.

JaredPar