I realise that FileSystemWatcher does not provide a Move event; instead it generates separate Delete and Create events for the same file. (The FileSystemWatcher is watching both the source and destination folders.)

However, how do we differentiate between a true file move and some random creation of a file that happens to have the same name as a file that was recently deleted?

Some sort of property of the FileSystemEventArgs class such as "AssociatedDeleteFile" that is assigned the deleted file path if it is the result of a move, or NULL otherwise, would be great. But of course this doesn't exist.

I also understand that the FileSystemWatcher operates at the basic file system level, so the concept of a "Move" may only be meaningful to higher-level applications. But if this is the case, what sort of algorithm would people recommend to handle this situation in my application?

Update based on feedback:

The FileSystemWatcher class seems to see moving a file as simply two distinct events: a Delete of the original file, followed by a Create at the new location.

Unfortunately there is no "link" provided between these events, so it is not obvious how to differentiate between a file move and a normal Delete or Create. At the OS level, a move is treated specially: you can move, say, a 1GB file almost instantaneously.

A couple of answers suggested using a hash on files to identify them reliably between events, and I will probably take this approach. But if anyone knows how to detect a move more simply, please leave an answer.

+2  A: 

I'll hazard a guess 'move' indeed does not exist, so you're really just going to have to look for a 'delete' and then mark that file as one that could be 'possibly moved', and then if you see a 'create' for it shortly after, I suppose you can assume you're correct.

Do you have a case of random file creations affecting your detection of moves?
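
A rough sketch of the "possibly moved" bookkeeping described above, written in Python only to keep it short; the same logic could sit behind the watcher's Deleted and Created handlers. The class name `PendingDeletes`, its methods, and the 5-second window are all assumptions made for illustration, not part of any FileSystemWatcher API.

```python
import os
import time


class PendingDeletes:
    """Remembers recent deletes so a later create can be matched as a possible move."""

    def __init__(self, window_seconds=5.0, clock=time.monotonic):
        self.window = window_seconds  # assumed grace period; tune by testing
        self.clock = clock            # injectable so the logic can be tested
        self._deleted = {}            # file name -> (old path, time of delete)

    def on_deleted(self, path):
        # Mark the file as "possibly moved" instead of dropping it at once.
        self._deleted[os.path.basename(path)] = (path, self.clock())

    def on_created(self, path):
        """Return the old path if this create looks like a move, else None."""
        entry = self._deleted.pop(os.path.basename(path), None)
        if entry is None:
            return None
        old_path, deleted_at = entry
        if self.clock() - deleted_at > self.window:
            return None  # too late: treat it as an unrelated create
        return old_path
```

Matching on name alone is exactly the weakness raised in the question, so this is probably only a first filter before a stronger check.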

Noon Silk
@silky, that's what I'm thinking too. The application is indexing files for searching. These files can be manually moved by the user, or new files can simply be added (Created) to the folder tree. There is also nothing stopping the user creating a file with exactly the same name, but in different watched folders.
Ash
Okay, then in that case I would also consider a hashing scheme. This way, if you notice a new file with the same name, you can trivially compare sizes, and if the sizes are the same, compute hashes and compare. This will let you know to a reasonable degree (reasonable enough, given the indexing, because if it's got the same contents then, well, it's basically okay to consider it 'moved' :P)
Noon Silk
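
A minimal sketch of that size-then-hash check, in Python for brevity. It assumes the size and digest of each file were stored when it was indexed, since the deleted original can no longer be read; `file_digest` and `looks_like_same_file` are hypothetical helper names, and SHA-256 is just one possible choice of hash.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Hash the file in chunks so large files are never loaded whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def looks_like_same_file(indexed_size, indexed_digest, new_path):
    """Cheap size comparison first; hash only when the sizes match."""
    if os.path.getsize(new_path) != indexed_size:
        return False
    return file_digest(new_path) == indexed_digest
```

Because the size check rejects most unrelated files, the hash is computed only for the rare candidates that pass it.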
Just a thought: If you're going for a hashing scheme, you could capture "copied" files as well like this.
Aviad Ben Dov
Thanks, that's exactly what I plan to do. However, it would have been handy to have this reported by the FileSystemWatcher, oh well. Some of the files may be large-ish (>50MB), so it would have saved doing hash calculations. I may need to look at some sort of CRC, as I believe these can be quicker than hashing.
Ash
Keep in mind that you won't need to be hashing all the time; so it should be a fairly insignificant cost.
Noon Silk
+2  A: 

As far as I understand it, the Renamed event is for files being moved...?

My mistake - the docs specifically say that only files inside a moved folder are considered "renamed" in a cut-and-paste operation:

The operating system and FileSystemWatcher object interpret a cut-and-paste action or a move action as a rename action for a folder and its contents. If you cut and paste a folder with files into a folder being watched, the FileSystemWatcher object reports only the folder as new, but not its contents because they are essentially only renamed.

It also says about moving files:

Common file system operations might raise more than one event. For example, when a file is moved from one directory to another, several OnChanged and some OnCreated and OnDeleted events might be raised. Moving a file is a complex operation that consists of multiple simple operations, therefore raising multiple events.

Aviad Ben Dov
this is only true for folders (according to the docs)
Nader Shirazie
The docs specifically say "file or directory"... Maybe I'm missing something?
Aviad Ben Dov
@Aviad, a file move in my application generates a Deleted and a Created event, with the full path of the deleted file passed to the Delete event and the full path of the newly created file passed to the Create event. I don't receive a rename event in this scenario.
Ash
At all? I would expect you to get an `OnRenamed` event from the original path, with its `Args` pointing to the new path..
Aviad Ben Dov
"The operating system and FileSystemWatcher object interpret a cut-and-paste action or a move action as a rename action for a folder and its contents". This is under the "Copying and moving Folders" section. Don't see where renamed refers to files. Anyone tested this?
Nader Shirazie
@nader: just added it to the answer too - looked it up and found it.. :(
Aviad Ben Dov
Added another bit of quote - supposed to send a `Changed` type event as well.. (Wish I could test it but not near a dev station..)
Aviad Ben Dov
@Aviad, which events are actually raised also depends on the NotifyFilters used. The Changed event is often triggered where it may not be expected; however, it does not tell me anything more than a Delete or Create anyway (it uses the same FileSystemEventArgs class).
Ash
+1  A: 

Might want to try the OnChanged and/or OnRenamed events mentioned in the documentation.

andymeadows
@andymeadows, moving a file seems to generate a Delete event for the original file and a Create for the same file in the new location. I don't see any Changed and Renamed events.
Ash
What are you trying to do that would warrant input or relocation of a file for processing in two separate places? I would typically move files from input -> processing -> processed/error and am hesitant to provide more advice w/o understanding why your design necessitates this.
andymeadows
It's a search indexing application that, for example, might be watching the user's "My Documents" folder tree for all DOC files. The user can manually move files within this tree, or move files into and out of this tree. They can also simply delete files, of course. Differentiating between these scenarios would be useful to avoid unnecessary re-indexing of files.
Ash
Could use a combination of file size and creation date to generate a hash for the documents, since the name can change. This wouldn't be truly unique, and you would have to expand it based on the predicted usage. I'm sure there's other metadata you could combine to make a unique key. I just paged down and this is in line with what silky said as well. You'll have to watch both events and find some unique values for your hash.
andymeadows
+2  A: 

According to the docs:

Common file system operations might raise more than one event. For example, when a file is moved from one directory to another, several OnChanged and some OnCreated and OnDeleted events might be raised. Moving a file is a complex operation that consists of multiple simple operations, therefore raising multiple events.

So if you're trying to be very careful about detecting moves, and having the same name is not good enough, you will have to use some sort of heuristic. For example, create a "fingerprint" using file name, size, last modified time, etc for files in the source folder. When you see any event that may signal a move, check the "fingerprint" against the new file.
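
As a rough illustration of the fingerprint idea (a Python sketch, not a tested recipe for FileSystemWatcher), the index below records name, size and last-modified time for each known file and looks for a vanished file with the same fingerprint when a new one appears. `Fingerprint`, `FingerprintIndex` and `find_move_source` are hypothetical names, and the sketch assumes the move preserves the last-modified time.

```python
import os
from collections import namedtuple

Fingerprint = namedtuple("Fingerprint", "name size mtime_ns")


def fingerprint(path):
    """Cheap identity for a file: name, size and last-modified time."""
    st = os.stat(path)
    return Fingerprint(os.path.basename(path), st.st_size, st.st_mtime_ns)


class FingerprintIndex:
    def __init__(self):
        self._by_path = {}  # indexed path -> Fingerprint taken at index time

    def add(self, path):
        self._by_path[path] = fingerprint(path)

    def find_move_source(self, new_path):
        """Return an indexed path that has vanished and matches new_path, else None."""
        fp = fingerprint(new_path)
        for old_path, old_fp in self._by_path.items():
            if old_fp == fp and old_path != new_path and not os.path.exists(old_path):
                return old_path
        return None
```

Since the name is part of the fingerprint, a file that is renamed while being moved would not match; dropping the name and adding a content hash is one way to loosen that.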

Nader Shirazie
@nader, yes, I can store a hash for each file, so I can use that as the fingerprint. So in my work queue I can check for a Delete event and wait for a subsequent Create event on the same file (guaranteed), but then how long should I wait for this follow-up Create event before treating it as a simple Delete? It's more difficult than expected.
Ash
Can you get the size of the file and the drive in the delete event? If so, you can use this information to decide. A move within the same drive is typically as good as immediate, so maybe wait 5 seconds. A move to a different drive is as slow as copying, so decide based on size. Not easy, clearly, but arguably fun :)
Noon Silk
@ash, you'll probably need to test a few different scenarios to come up with a good answer to that question. As silky says, a move within a drive is very quick (it only changes some metadata about the file/folder), while a move between drives requires a copy to take place (which may take time). One question is whether the delete occurs immediately (as soon as you start the move) or only after the copy has taken place. In the latter case the delete and create probably aren't far apart; in the former, you'll have to wait quite a while.
Nader Shirazie
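
One way to turn the two comments above into a number, as a sketch only: keep the fixed 5-second wait for same-drive moves and add an estimated copy time when a watched folder on another drive could be the destination. The function name, the `cross_drive_possible` flag and the 20 MB/s throughput figure are assumptions for illustration and would need the testing mentioned above; the file size has to come from the index, since the file is already gone when the delete arrives.

```python
def move_wait_seconds(size_bytes, cross_drive_possible,
                      base_wait=5.0,
                      assumed_bytes_per_second=20 * 1024 * 1024):
    """How long to hold a Delete before giving up on a matching Create."""
    if not cross_drive_possible:
        # Same-drive moves only touch metadata, so a short fixed wait suffices.
        return base_wait
    # Cross-drive moves copy the data, so allow time proportional to size.
    return base_wait + size_bytes / assumed_bytes_per_second
```

If testing shows the delete is only raised after the copy has finished, the size term can probably be dropped.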
@nader, @silky, thanks for your feedback. Yes I'll just have to do some testing on this I think. You've also helped me to confirm my approach to this.
Ash
@ash, np, glad to help
Nader Shirazie