I'm writing an application that catalogs files and annotates them with extra metadata through separate "side-car" files. If changes to the files are made through my program, it is able to keep the files and their corresponding metadata files in sync. However, I'm trying to figure out a way to deal with someone modifying the files manually while my program is not running.

When my program starts up, it scans the file system and compares the files it finds to its previous record of which files were there. It's fairly straightforward to update after a file has been deleted or added. However, if a file was moved or renamed, my program sees that as the old file being deleted and a new file being added. Yet I don't want to lose the association between the file and its metadata.

I was thinking I could store a hash from each file so I could check to see if newly found files were really previously known files that had been moved or renamed. However, if the file is both moved/renamed and modified then the hash would not match either.
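To make the hash idea concrete, here is a minimal sketch of what I have in mind. The function names (`scan`, `detect_renames`) and the choice of SHA-256 are just assumptions for illustration, not anything my program does yet. It only catches files whose content is byte-for-byte unchanged, which is exactly the limitation described above.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def scan(root):
    """Map each file's path (relative to root, POSIX style) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def detect_renames(known, root):
    """Pair files that vanished with new files that have the same digest.

    known: the {path: digest} record saved by the previous scan.
    Returns {old_path: new_path} for unmodified files that moved.
    """
    current = scan(root)
    missing = {p: d for p, d in known.items() if p not in current}
    added = {p: d for p, d in current.items() if p not in known}

    # Index vanished files by digest so each new file is a dictionary lookup.
    by_digest = {}
    for path, digest in missing.items():
        by_digest.setdefault(digest, []).append(path)

    renames = {}
    for new_path, digest in added.items():
        candidates = by_digest.get(digest)
        if candidates:
            renames[candidates.pop()] = new_path
    return renames
```

Anything left over in the missing or added sets after this pass would still have to be treated as a plain delete or add.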

So is there some other unique identifier of a file that I can track which stays with it even after it is renamed, moved, or modified?

+1  A: 

There is no unique identifier for the file. The best you can do is use a heuristic based on comparing contents. If the difference between a removed file and an added file is small, then perhaps this was a modify + move operation. Or maybe not.

git has a pretty good file renaming/moving detector. Perhaps you can borrow some ideas from it.
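A rough sketch of the difference-comparison heuristic, for text files only. It assumes the catalog kept a copy of each file's previous content, since a deleted file can no longer be read. The names `similarity` and `match_moved` and the 0.6 threshold are made up for illustration; this is not how git implements its detection.

```python
from difflib import SequenceMatcher


def similarity(old_text, new_text):
    """Return a 0.0-1.0 line-based similarity score between two texts."""
    return SequenceMatcher(
        None, old_text.splitlines(), new_text.splitlines()
    ).ratio()


def match_moved(removed, added, threshold=0.6):
    """Guess which added files are moved and modified versions of removed ones.

    removed, added: {path: text content}. The removed contents must come
    from a stored copy (an assumption of this sketch).
    Returns {old_path: new_path}, pairing the most similar files first.
    """
    scored = []
    for old_path, old_text in removed.items():
        for new_path, new_text in added.items():
            score = similarity(old_text, new_text)
            if score >= threshold:
                scored.append((score, old_path, new_path))

    # Greedy assignment: take the best-scoring pairs first, using each file once.
    scored.sort(reverse=True)
    matches, used_old, used_new = {}, set(), set()
    for score, old_path, new_path in scored:
        if old_path not in used_old and new_path not in used_new:
            matches[old_path] = new_path
            used_old.add(old_path)
            used_new.add(new_path)
    return matches
```

Comparing every removed file against every added file gets slow with many or large files, and any threshold will produce some wrong guesses, so matches are best treated as suggestions rather than facts.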

Yann Ramin
Eh, it's not that critical. I'll just treat a rename/move as an add/delete and tell users that if they don't want to lose the metadata, they need to use my tool to manipulate files.
Eric Anastas