I am monitoring a folder for new files and need to process them. The problem is that opening a file occasionally fails because the system has not finished copying it.

What is the correct way to test if the file is finished copying?

Clarification: I don't have write permissions to the folder/files and cannot control the copying process (the user does the copying).

A: 

One approach I always take is to create an empty file named "token.txt" at the very end of my copy/transfer. Because this file is created only once the transfer has finished, you can watch for its creation and start working with your files as soon as it appears. Don't forget to delete the token file whenever you start processing your files.
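
On the reading side, the check could look roughly like this. It is only a sketch: `TokenWatcher`, `IsTransferComplete` and `ConsumeToken` are made-up names, and it assumes the copying side really does write "token.txt" last and that the reader is allowed to delete it.

```csharp
using System.IO;

public static class TokenWatcher
{
    // File name taken from the answer above; the sender must create it last.
    public const string TokenName = "token.txt";

    // True once the token file exists in the watched folder.
    public static bool IsTransferComplete(string folder)
    {
        return File.Exists(Path.Combine(folder, TokenName));
    }

    // Deletes the token so the next transfer starts from a clean state.
    // Needs delete permission on the folder.
    public static void ConsumeToken(string folder)
    {
        File.Delete(Path.Combine(folder, TokenName));
    }
}
```

You would call `IsTransferComplete` from your watcher or timer, then `ConsumeToken` just before processing the batch.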

VP
But if the user account has no right to delete files on the server, this approach is of no use.
rahul
I don't think extropy is waiting for a copy process that he himself has control over. So there would be no token file, right?
peSHIr
I think you cannot say whether or not he has access to, or control over, the process without more details. This is like a brainstorm where everybody gives input.
VP
A: 

You should also cover cases such as: the file is in use by another program, the file was deleted (the copy didn't succeed), etc.

Use extended exception handling to cover all the important cases that might occur.

Chris
+2  A: 

Not sure about "the correct way", but you could use the monitoring tool (FileSystemWatcher, I guess) to fill an internal queue that you use for delayed processing. Or better yet: only put the files whose open failed into a queue, so you can retry them later.
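
A minimal sketch of such a retry queue, in case it helps. `RetryQueue` and the `tryProcess` delegate are hypothetical names, not part of any library; the delegate is assumed to return false when the file could not be opened yet.

```csharp
using System;
using System.Collections.Generic;

public class RetryQueue
{
    private readonly Queue<string> pending = new Queue<string>();
    private readonly Func<string, bool> tryProcess;

    // tryProcess returns true when the file was handled,
    // false when it should be retried later (e.g. still being copied).
    public RetryQueue(Func<string, bool> tryProcess)
    {
        this.tryProcess = tryProcess;
    }

    // Call this from the watcher's Created handler, or after a failed open.
    public void Enqueue(string path)
    {
        lock (pending) { pending.Enqueue(path); }
    }

    public int Count
    {
        get { lock (pending) { return pending.Count; } }
    }

    // Makes one pass over the queue; files that fail go back to the end.
    // Intended to be called periodically, e.g. from a timer.
    public void ProcessPending()
    {
        int n = Count;
        for (int i = 0; i < n; i++)
        {
            string path;
            lock (pending)
            {
                if (pending.Count == 0) return;
                path = pending.Dequeue();
            }
            if (!tryProcess(path))
                Enqueue(path);
        }
    }
}
```

Each pass touches every queued file at most once, so a file that stays locked cannot make the pass spin forever.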

peSHIr
+1  A: 

If you are using FileSystemWatcher I don't think there's a robust solution to this problem. One approach would be try/catch/retry later.

Darin Dimitrov
A: 

It depends. If you have no control over the copy process, a retry loop is probably the best you can do.
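
Such a retry loop might look something like the following. Treat it as a sketch: `FileRetry.OpenWhenReady` is a made-up helper, and the attempt count and cool-down are values you would have to tune yourself.

```csharp
using System;
using System.IO;
using System.Threading;

public static class FileRetry
{
    // Tries to open the file for reading, sleeping for coolDown between attempts.
    // Returns the open stream, or null if every attempt failed.
    public static FileStream OpenWhenReady(string path, int maxAttempts, TimeSpan coolDown)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                // FileShare.None asks for exclusive access, so the open fails
                // while another process still holds the file.
                return new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None);
            }
            catch (IOException)
            {
                // Also covers FileNotFoundException, which derives from IOException.
                if (attempt < maxAttempts)
                    Thread.Sleep(coolDown);
            }
        }
        return null;
    }
}
```

The caller owns the returned stream and should dispose it, for example `using (var s = FileRetry.OpenWhenReady(path, 10, TimeSpan.FromSeconds(2))) { ... }` with a null check inside.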

If you do have control:

  • If the folder is local, you could require that the people writing into it lock the file for exclusive access and only release the lock when they are done (which I think is the default for File.Copy). On the .NET side you could then have a simple retry loop with a cool-down period.
    • Alternatively, you can write the file to a temp folder and move it to the target directory only after it has been fully written. This reduces the window in which bad stuff can happen (but does not eliminate it).
  • If the folder is an SMB share, there is a chance that LockFile does not even work (some Linux implementations). In that case the common approach is to have a sort of lock file that is deleted once the person creating the file is done. The problem with the lock file approach is that if you forget to delete it, you can be in trouble.
  • In light of these complications, I would recommend receiving the data via a WCF service or a web service instead, because that gives you much better control.
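
For the temp-folder variant, the writing side could be sketched like this. `StagedCopy.CopyViaStaging` is a hypothetical helper; it assumes both directories already exist and, ideally, sit on the same volume so the move is just a rename.

```csharp
using System.IO;

public static class StagedCopy
{
    // Copies source into stagingDir first, then moves the finished copy
    // into targetDir (the folder being watched). Returns the final path.
    public static string CopyViaStaging(string source, string stagingDir, string targetDir)
    {
        string name = Path.GetFileName(source);
        string staged = Path.Combine(stagingDir, name);
        string final = Path.Combine(targetDir, name);

        // The slow part happens outside the watched folder.
        File.Copy(source, staged, true);

        // Throws IOException if a file with that name already exists in targetDir.
        File.Move(staged, final);
        return final;
    }
}
```

As the bullet says, this only narrows the window; a reader can still see the file during a cross-volume move, so a retry on the reading side remains sensible.
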
Sam Saffron
+6  A: 

I think the only sure way to do this is by trying to open the file exclusively and catching a specific exception. I usually hate using exceptions for normal application logic, but I'm afraid for this scenario there's no other way (at least I haven't found one yet):

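// requires: using System.IO;
public bool FileIsDone(string path)
{
  try
  {
    // FileShare.None makes the open fail while any other process
    // (such as the one still copying) holds the file open.
    using (File.Open(path, FileMode.Open, FileAccess.Read, FileShare.None))
    {
    }
  }
  catch (IOException)
  {
    // A sharing violation surfaces as IOException,
    // not as UnauthorizedAccessException.
    return false;
  }

  return true;
}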
Philippe Leybaert
A: 

In fact, to avoid race conditions, the only safe solution is to retry.

If you do something like:

while (file is locked)
    no-op()
process file()

You risk another process jumping in between the while guard and the process file statement. No matter how your "wait for file availability" is implemented, unless you can ensure that post-unlock you're the first process to access it, you might not be that first user.

This is more likely than it might seem at first glance, particularly if multiple people are watching the file, and particularly if they're using something like the file-system watcher. Of course, it's still not particularly likely even then...
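
One way to close that gap is to make the open itself the check and to process through the same handle. A rough sketch, where `ExclusiveProcessor.TryProcess` and the `process` callback are illustrative names only:

```csharp
using System;
using System.IO;

public static class ExclusiveProcessor
{
    // Opens the file exclusively and processes it through that same handle,
    // so there is no gap between "is it free?" and "use it".
    // Returns false if the file could not be opened; the caller retries later.
    public static bool TryProcess(string path, Action<Stream> process)
    {
        FileStream stream;
        try
        {
            stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None);
        }
        catch (IOException)
        {
            return false;
        }

        // Only the open is guarded; errors thrown by process are not swallowed.
        using (stream)
        {
            process(stream);
        }
        return true;
    }
}
```

Here the retry decision is left to the caller, so a false result just means "try this file again later".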

Eamon Nerbonne
A: 

Are the files big?

Maybe you could try calculating the MD5 checksum of the file?

If the sender puts the MD5 hash in the filename, you could retrieve it and recalculate the checksum of the file yourself. When the two MD5 values match, you can assume that the file is finished.

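// requires: using System.IO; using System.Security.Cryptography; using System.Text;
byte[] md5Hash;
using (MD5 md5 = MD5.Create())
using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
{
    // Hash the whole file content.
    md5Hash = md5.ComputeHash(fs);
}

// Convert the hash bytes to a lowercase hex string.
StringBuilder hex = new StringBuilder();
foreach (byte b in md5Hash)
    hex.Append(b.ToString("x2"));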
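
Putting the two halves together might look like the sketch below. The naming convention "<name>.<32 hex digits>.<ext>" is purely an assumption for illustration, as are the `Md5NameCheck` class and its method names; the sender would have to agree to whatever convention you pick.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

public static class Md5NameCheck
{
    // Assumed convention: "<name>.<32 hex digits>.<ext>".
    // Returns the hash part in lowercase, or null if the name does not fit.
    public static string ExpectedHash(string path)
    {
        string withoutExt = Path.GetFileNameWithoutExtension(path);
        string hash = Path.GetExtension(withoutExt).TrimStart('.');
        return hash.Length == 32 ? hash.ToLowerInvariant() : null;
    }

    // Computes the MD5 of the current file content as lowercase hex.
    public static string ActualHash(string path)
    {
        using (MD5 md5 = MD5.Create())
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            StringBuilder hex = new StringBuilder();
            foreach (byte b in md5.ComputeHash(fs))
                hex.Append(b.ToString("x2"));
            return hex.ToString();
        }
    }

    // True only when the name carries a hash and the content matches it.
    public static bool IsComplete(string path)
    {
        string expected = ExpectedHash(path);
        if (expected == null) return false;
        try
        {
            return expected == ActualHash(path);
        }
        catch (IOException)
        {
            return false; // still locked by the copier, or gone
        }
    }
}
```

For big files this rereads the whole file on every check, so you would probably want to combine it with a delay between attempts.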
Makach