How do I implement an atomic file write? The file is written by a Java service and read by Python scripts.
For the record, reads far outnumber writes, but the writes happen in batches and tend to take a long time. The file is a few megabytes in size.

Right now my approach is:

  • Write file contents to a temp file in the same directory
  • Delete old file
  • Rename temp file to old filename.

Is this the right approach? How can I avoid the window in which the old file has been deleted but the temp file has not yet been renamed?

Do these programming languages (Python and Java) offer constructs to lock the file and avoid this situation?

+3  A: 

AFAIK no.

The reason is that for such an atomic operation to be possible, there has to be OS support in the form of a transactional file system, and none of the mainstream operating systems offers one.

EDIT - I'm wrong for POSIX-compliant systems at least. The POSIX rename syscall performs an atomic replace if a file with the target name already exists ... as pointed out by @janneb. That should be sufficient to do the OP's operation atomically.

However, the fact remains that the Java File.renameTo() method is explicitly not guaranteed to be atomic, so it does not provide a cross-platform solution to the OP's problem.

Stephen C
+1: Ahhh, those pesky OS differences preventing a "cross platform" solution.
S.Lott
A: 

Try the Java FileLock API.

Hope this helps.

St.Shadow
Yea ... but this won't allow you to do an atomic file replace.
Stephen C
You are not fully right. With this API you can create a ".lock" file and use it as a semaphore. Use case: if the file is locked, the Python script waits until it is unlocked, then locks it, reads, and unlocks it after reading. When the service needs to write data, it checks whether the file is locked, waits until it is unlocked, locks it, writes the data, and unlocks it.
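For the Python side of such a protocol, here is a minimal sketch, assuming a POSIX system. The helper names `locked` and `read_data` and the idea of a separate lock path are made up for illustration. It uses `fcntl.lockf` on the assumption that the JVM implements FileLock with fcntl-style locks on the platform in question; that should be verified, because the two kinds of advisory lock do not necessarily see each other. It also deviates slightly from the description above: readers take a shared lock, so they only wait for the writer and not for each other.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def locked(lock_path, exclusive):
    """Hold an advisory lock on lock_path for the duration of the block."""
    # "a+" creates the lock file if it is missing and opens it for both
    # reading and writing, which shared and exclusive locks require.
    with open(lock_path, "a+") as lock_file:
        mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
        fcntl.lockf(lock_file, mode)  # blocks until the lock is granted
        try:
            yield
        finally:
            fcntl.lockf(lock_file, fcntl.LOCK_UN)


def read_data(data_path, lock_path):
    """Read the whole data file while holding a shared lock."""
    with locked(lock_path, exclusive=False):
        with open(data_path, "rb") as f:
            return f.read()
```

These locks are advisory: they only help if the Java writer takes an exclusive lock on the same lock file before writing.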
St.Shadow
A: 

You could try using an extra file to act as a lock, but I'm not sure that will work out OK. (It would force you to write lock-checking and retry logic on both sides, Java and Python.)

Another solution might be to not create files at all. Maybe you could make your Java process listen on a port and serve the data from there rather than from a file?
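To give an idea of what that lock-checking and retry logic could look like on the Python side, here is a rough sketch. The function names, the timeout, and the poll interval are all invented for illustration. It relies on exclusive file creation (`O_CREAT | O_EXCL`), so the Java service would need an equivalent exclusive-create step for this to work.

```python
import os
import time


def acquire_lock(lock_path, timeout=10.0, poll_interval=0.1):
    """Try to create lock_path exclusively; return True on success."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so
            # creating the file doubles as taking the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return True
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False  # gave up: someone else still holds the lock
            time.sleep(poll_interval)


def release_lock(lock_path):
    """Release the lock by removing the lock file."""
    os.remove(lock_path)
```

One known weakness of this scheme: if a process dies while holding the lock, the lock file stays behind and everyone else waits until it is removed by hand. It also makes readers exclude each other, which may hurt when reads far outnumber writes.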

Simon Groenewolt
+1  A: 

At least on POSIX platforms, leave out step 2 (delete the old file). In POSIX, rename within a filesystem is guaranteed to be atomic, and renaming on top of an existing file replaces it atomically.
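The write-then-rename pattern without the delete step could be sketched like this. It is shown in Python only to illustrate the idea (the writer in the question is Java), and the function name `atomic_write` is made up. `os.replace` renames over an existing target; the atomicity itself comes from the OS, as described above.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write bytes to path so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target's directory so the rename
    # stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        # No delete step: the rename replaces the old file in one operation.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Note that `mkstemp` creates the temp file with owner-only permissions, so a `chmod` before the rename may be needed if the readers run as a different user.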

janneb
A: 

Have the Python scripts request permission from the service. While the service is writing, it would place a lock on the file. If the lock exists, the service would reject the Python request.

Ron
+2  A: 

It's a classic producer/consumer problem. You should be able to solve this by using file renaming, which is atomic on POSIX systems.
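From the consumer's point of view, a small experiment along these lines shows why the rename approach needs no locking on POSIX. The file names are hypothetical, and both roles run in one script purely for demonstration. A reader that already has the file open keeps reading the old version, while any later open gets the new one.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "data.txt")  # hypothetical file name

with open(target, "w") as f:
    f.write("old contents")

# Consumer opens the file before the producer publishes a new version.
reader = open(target)

# Producer: write the new version to a temp file, then rename it over
# the target.
tmp = target + ".tmp"
with open(tmp, "w") as f:
    f.write("new contents")
os.replace(tmp, target)

# The already-open handle still refers to the old file's data...
old_view = reader.read()
reader.close()

# ...while a fresh open sees the new version.
with open(target) as f:
    new_view = f.read()
```

On a POSIX system `old_view` should hold the old text and `new_view` the new text; at no point does a reader see a half-written file. On Windows, replacing a file that another handle has open may fail instead.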

gruszczy