In a POSIX environment, I want to remove a file from disk, but calculate its checksum before removing it, to make sure it was not changed. Is locking enough? Should I open it, unlink, calculate checksum, and then close it (so the OS can remove its inode)? Is there any way to ensure no other process has an open file descriptor on the file?

To give a bit of context, the code performs synchronization of files across hosts, and there's an opportunity for data loss if a remote host removes a file but the file is being changed locally.

+3  A: 

Your proposal of open, unlink, checksum, close won't work as is, because you'll be stuck if the checksum doesn't match (there is no POSIX-portable way of creating a link to a file given only a file descriptor). A better variant is rename, checksum, unlink, close, which lets you undo the rename or redo the copy if the checksum doesn't match. You'll still need to decide what to do if a third program has recreated the file in the meantime.
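As a rough illustration of the rename, checksum, unlink, close sequence, here is a minimal Python sketch. The function name `remove_if_unchanged`, the `.pending-delete` suffix, the use of SHA-256, and the three return strings are all assumptions made for the example, not part of any existing tool. Restoring goes through `link` rather than `rename` so that a file recreated under the original name is not overwritten.

```python
import hashlib
import os


def sha256_of_fd(fd, chunk_size=1 << 16):
    """Hash everything readable from fd, starting at its current offset."""
    digest = hashlib.sha256()
    while True:
        chunk = os.read(fd, chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


def remove_if_unchanged(path, expected_hex):
    """Remove path only if its SHA-256 matches expected_hex.

    Returns "removed", "restored", or "conflict" (hypothetical labels).
    """
    # Hypothetical staging name; it must be on the same filesystem as path.
    staging = path + ".pending-delete"
    os.rename(path, staging)           # take the file out of the way first
    fd = os.open(staging, os.O_RDONLY)
    try:
        if sha256_of_fd(fd) == expected_hex:
            os.unlink(staging)         # checksum matches: really delete
            return "removed"
        try:
            # Undo the rename. link() fails with EEXIST if another program
            # recreated path in the meantime, so nothing gets clobbered.
            os.link(staging, path)
        except FileExistsError:
            return "conflict"          # staging file is left for the caller
        os.unlink(staging)
        return "restored"
    finally:
        os.close(fd)
```

Note that this does not protect against a process that already holds a descriptor on the file and writes to it after the checksum has been computed; it only narrows the window.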

POSIX offers only cooperative locks. If you have control over the programs that may modify the file, make sure they use locks; if that's not an option, you're stuck without locks.
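If you do control the writers, a cooperative lock could look something like the sketch below, which uses Python's `fcntl.lockf` wrapper around POSIX record locks. The helper name `with_exclusive_lock` and the callback-style interface are assumptions for illustration. It only helps if every program that touches the file takes the same lock; a process that ignores the lock is not stopped.

```python
import fcntl
import os


def with_exclusive_lock(path, action):
    """Run action(fd) while holding an exclusive advisory lock on path."""
    # An exclusive lockf() lock requires a descriptor opened for writing.
    fd = os.open(path, os.O_RDWR)
    try:
        # Blocks until no other cooperating process holds a conflicting lock.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        return action(fd)
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)
```

A caller might pass a function that checksums the descriptor, so the checksum is taken while cooperating writers are blocked.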

There is no portable way to see which processes (if any) have a file open. On most Unix systems, lsof will show you, but this is not universal, not robust (a program could open the file just after lsof has finished looking), and incomplete (if the files are exported over NFS, there may be no way to know about active clients).

You may benefit from looking at what other synchronization programs are doing, such as rsync and unison.

Gilles