views:

1027

answers:

4

Is there are way to uniquely identify a file (and possibly directories) for the lifetime of the file regardless of moves, renames and content modifications? (Windows 2000 and later). Making a copy of a file should give the copy it's own unique identifier.

My application associates various meta-data with individual files. If files are modified, renamed or moved it would be useful to be able to automatically detect and update file associations.

FileSystemWatcher can provide events that inform of these sorts of changes, however it uses a memory buffer that can be easily filled (and events lost) if many file system events occur quickly.

A hash is no use because the content of the file can change, and so the hash will change.

I had thought of using the file creation date, however there are a few situations where this will not be unique (ie. when multiple files are copied).

I've also heard of a file SID (security ID?) in NTFS, but I'm not sure if this would do what I'm looking for.

Any ideas?

+1  A: 

Please take a look here: Unique file ids for Windows. This is also useful: Unique ID for files on NTFS?

Rubens Farias
+4  A: 

If you call GetFileInformationByHandle, you'll get a file ID in BY_HANDLE_FILE_INFORMATION.nFileIndexHigh/Low. This index is unique within a volume, and stays the same even if you move the file (within the volume) or rename it.

If you can assume that NTFS is used, you may also want to consider using Alternate Data Streams to store the metadata.

Mattias S
Thanks for the link. I actually found another API call that returns the same ID but requires a bit more work. I might post the code for I came up with. An Alternate Data Stream could be useful too as the only place Ill be encountering FAT is on USB keys / external drives. However I've heard some anti-virus / security software may have an issue with adding hidden data to files.
Ash
The documentation for GetFileInformationByHandle says: "nFileIndexLow: Low-order part of a unique identifier that is associated with a file. This value is useful ONLY WHILE THE FILE IS OPEN by at least one process. If no processes have it open, the index may change the next time the file is opened."
Integer Poet
+3  A: 

Here's sample code that returns a unique File Index.

ApproachA() is what I came up with after a bit of research. ApproachB() is thanks to information in the links provided by Mattias and Rubens. Given a specific file, both approaches return the same file index (during my basic testing).

Some caveats from MSDN:

Support for file IDs is file system-specific. File IDs are not guaranteed to be unique over time, because file systems are free to reuse them. In some cases, the file ID for a file can change over time.

In the FAT file system, the file ID is generated from the first cluster of the containing directory and the byte offset within the directory of the entry for the file. Some defragmentation products change this byte offset. (Windows in-box defragmentation does not.) Thus, a FAT file ID can change over time. Renaming a file in the FAT file system can also change the file ID, but only if the new file name is longer than the old one.

In the NTFS file system, a file keeps the same file ID until it is deleted. You can replace one file with another file without changing the file ID by using the ReplaceFile function. However, the file ID of the replacement file, not the replaced file, is retained as the file ID of the resulting file.

The first bolded comment above worries me. It's not clear if this statement applies to FAT only, it seems to contradict the second bolded text. I guess further testing is the only way to be sure.

[Update: in my testing the file index/id changes when a file is moved from one internal NTFS hard drive to another internal NTFS hard drive.]

    public class WinAPI
    {
        [DllImport("ntdll.dll", SetLastError = true)]
        public static extern IntPtr NtQueryInformationFile(IntPtr fileHandle, ref IO_STATUS_BLOCK IoStatusBlock, IntPtr pInfoBlock, uint length, FILE_INFORMATION_CLASS fileInformation);

        public struct IO_STATUS_BLOCK
        {
            uint status;
            ulong information;
        }
        public struct _FILE_INTERNAL_INFORMATION {
          public ulong  IndexNumber;
        } 

        // Abbreviated, there are more values than shown
        public enum FILE_INFORMATION_CLASS
        {
            FileDirectoryInformation = 1,     // 1
            FileFullDirectoryInformation,     // 2
            FileBothDirectoryInformation,     // 3
            FileBasicInformation,         // 4
            FileStandardInformation,      // 5
            FileInternalInformation      // 6
        }

        [DllImport("kernel32.dll", SetLastError = true)]
        public static extern bool GetFileInformationByHandle(IntPtr hFile,out BY_HANDLE_FILE_INFORMATION lpFileInformation);

        public struct BY_HANDLE_FILE_INFORMATION
        {
            public uint FileAttributes;
            public FILETIME CreationTime;
            public FILETIME LastAccessTime;
            public FILETIME LastWriteTime;
            public uint VolumeSerialNumber;
            public uint FileSizeHigh;
            public uint FileSizeLow;
            public uint NumberOfLinks;
            public uint FileIndexHigh;
            public uint FileIndexLow;
        }
  }

  public class Test
  {
       public ulong ApproachA()
       {
                WinAPI.IO_STATUS_BLOCK iostatus=new WinAPI.IO_STATUS_BLOCK();

                WinAPI._FILE_INTERNAL_INFORMATION objectIDInfo = new WinAPI._FILE_INTERNAL_INFORMATION();

                int structSize = Marshal.SizeOf(objectIDInfo);

                FileInfo fi=new FileInfo(@"C:\Temp\testfile.txt");
                FileStream fs=fi.Open(FileMode.Open,FileAccess.Read,FileShare.ReadWrite);

                IntPtr res=WinAPI.NtQueryInformationFile(fs.Handle, ref iostatus, memPtr, (uint)structSize, WinAPI.FILE_INFORMATION_CLASS.FileInternalInformation);

                objectIDInfo = (WinAPI._FILE_INTERNAL_INFORMATION)Marshal.PtrToStructure(memPtr, typeof(WinAPI._FILE_INTERNAL_INFORMATION));

                fs.Close();

                Marshal.FreeHGlobal(memPtr);   

                return objectIDInfo.IndexNumber;

       }

       public ulong ApproachB()
       {
               WinAPI.BY_HANDLE_FILE_INFORMATION objectFileInfo=new WinAPI.BY_HANDLE_FILE_INFORMATION();

                FileInfo fi=new FileInfo(@"C:\Temp\testfile.txt");
                FileStream fs=fi.Open(FileMode.Open,FileAccess.Read,FileShare.ReadWrite);

                WinAPI.GetFileInformationByHandle(fs.Handle, out objectFileInfo);

                fs.Close();

                ulong fileIndex = ((ulong)objectFileInfo.FileIndexHigh << 32) + (ulong)objectFileInfo.FileIndexLow;

                return fileIndex;   
       }
  }
Ash
A: 

what is memPtr in this code?

Ashwin Upadhyay