views:

1143

answers:

6

Here's the scenario: I have a multithreaded Java web application which runs inside a servlet container. The application is deployed multiple times inside the servlet container. There are multiple servlet containers running on different servers.

Perhaps this graph makes it clear:

server1
+- servlet container
   +- application1
   |  +- thread1
   |  +- thread2
   +- application2
      +- thread1
      +- thread2
server2
+- servlet container
   +- application1
   |  +- thread1
   |  +- thread2
   +- application2
      +- thread1
      +- thread2

There is a file inside a network shared directory which all those threads can access. And they do access the file frequently. Most of the time the file is only read by those threads. But sometimes it is written.

I need a fail-safe solution to synchronize all those threads so that data consistency is guaranteed.


Solutions which do not work (properly):

  1. Using java.nio.channels.FileLock
    I am able to synchronize threads from different servers using the FileLock class. But this does not work for threads inside the same process (servlet container), since file locks are held on behalf of the entire JVM rather than per thread.

  2. Using a separate file for synchronization
    I could create a separate file which indicates that a process is reading from or writing to the file. This solution works for all threads but has several drawbacks:

    • Performance. Creating, deleting and checking files are rather slow operations. A lightweight implementation with a single synchronization file will prevent parallel reading of the file.
    • The synchronization file will remain after a JVM crash making a manual clean up necessary.
    • We have already had strange problems deleting files on network file systems.
  3. Using messaging
    We could implement a messaging system which the threads would use to coordinate the file access. But this seems too complex for this problem. And again: performance will be poor.
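
The limitation described in point 1 can be reproduced inside a single JVM. The following is only a minimal sketch, not code from the application: the class name SameJvmFileLockDemo and the temporary file are made up for illustration, and it assumes Java 7 or later with default JVM settings.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of the original application.
class SameJvmFileLockDemo {

    /**
     * Locks the file through one channel, then tries to lock it again through
     * a second channel in the same JVM. Returns true if the second attempt is
     * rejected with OverlappingFileLockException.
     */
    static boolean secondLockIsRejected(Path file) throws IOException {
        try (FileChannel first = FileChannel.open(file, StandardOpenOption.WRITE);
             FileChannel second = FileChannel.open(file, StandardOpenOption.WRITE);
             FileLock held = first.lock()) {
            try {
                FileLock again = second.tryLock();
                if (again != null) {
                    again.release();
                }
                return false; // the JVM did not reject the overlapping request
            } catch (OverlappingFileLockException e) {
                return true;  // the lock is already held by this JVM
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary file as a stand-in for the file on the network share.
        Path file = Files.createTempFile("shared", ".dat");
        try {
            System.out.println("second lock rejected: " + secondLockIsRejected(file));
        } finally {
            Files.delete(file);
        }
    }
}
```

So a second thread in the same container does not block on the lock the way another process would. It gets an exception instead, which is why FileLock alone is not enough here.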

Any thoughts?

+2  A: 

You've enumerated the possible solutions except the obvious one: remove the dependency on that file.

Is there another way for the threads to obtain that data instead of reading it from a file? How about setting up some kind of process that is responsible for coordinating access to that information, instead of having all the threads read the file?

Kevin
Good point, Kevin. The processes are a kind of database server and the file is the database source. When the database source is updated, the processes have to reload the file. The processes do not know about each other, and I'd like to keep it that way for the sake of simplicity.
Eduard Wirch
+1  A: 

A. Sounds like it's time for a database :-). Rather than having a shared file, what about storing the data in a database?

B. Alternatively - layering:

  1. Lock threads within a process with a standard synchronized lock.
  2. Lock inter-process/machine with a file-based lock type thing - e.g. create a directory to hold the lock.

Nest 2 inside 1.

Still has the clean-up problem.

C. Alternatively, some kind of write-to-new-file/rename strategy, so that readers don't need to lock at all, maybe?
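
A rough sketch of option B, to show what "nest 2 inside 1" means. Everything here is an assumption for illustration: the class name LayeredLock, the lock directory path passed in by the caller, and the 50 ms polling interval. It relies on Files.createDirectory failing if the directory already exists. How atomic that is on a particular network file system should be verified separately.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Callable;

// Hypothetical helper; names and timings are illustrative only.
class LayeredLock {
    // Layer 1: in-process lock. Static, so it is shared by all threads that
    // use this class through the same class loader.
    private static final Object JVM_LOCK = new Object();

    // Layer 2: a directory whose existence means "locked" for other processes.
    private final Path lockDir;

    LayeredLock(Path lockDir) {
        this.lockDir = lockDir;
    }

    /** Runs the action while holding both locks, the directory lock nested inside. */
    <T> T withLock(Callable<T> action) throws Exception {
        synchronized (JVM_LOCK) {
            acquireDirectoryLock();
            try {
                return action.call();
            } finally {
                // If the JVM crashes before this line, the directory stays
                // behind: this is the clean-up problem mentioned above.
                Files.deleteIfExists(lockDir);
            }
        }
    }

    private void acquireDirectoryLock() throws IOException, InterruptedException {
        while (true) {
            try {
                Files.createDirectory(lockDir); // fails if the directory exists
                return;
            } catch (FileAlreadyExistsException e) {
                Thread.sleep(50); // somebody else holds the lock; poll again
            }
        }
    }
}
```

Usage would be something like new LayeredLock(lockDirPath).withLock(() -> readOrWriteTheFile()), where readOrWriteTheFile is whatever your code does with the shared file. Note that this serializes readers as well, so it gives up parallel reads.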

Douglas Leeder
A: 

Use java.nio.channels.FileLock together with a ReadWriteLock.

If I were you, I'd hide the File, FileChannel and all FileOutputStream instances from the business code and replace them with my own simple adapter class, like a DAO.

e.g.

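import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

abstract class MyWriter {
    private static final ReentrantReadWriteLock VM_LOCK = new ReentrantReadWriteLock(); // shared by all instances of this class
    private FileChannel file; // assumed to be opened for writing by the subclass

    public void writeSomething(byte[] b) throws IOException {
        VM_LOCK.writeLock().lock();              // get VM scope write lock
        try (FileLock fileLock = file.lock()) {  // get exclusive file lock; released when the block exits
            file.write(ByteBuffer.wrap(b));      // do write
        } finally {
            VM_LOCK.writeLock().unlock();        // release ReadWriteLock write lock
        }
    }
}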
Dennis Cheung
Doesn't work for the synchronization of threads inside one process.
Eduard Wirch
+1  A: 

Would you be able to use a Semaphore to restrict access to one thread at a time within the application?

To quote the API "Semaphores are often used to restrict the number of threads than can access some (physical or logical) resources"
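
For the in-process part, a minimal sketch with java.util.concurrent.Semaphore might look like the following. The class GuardedResource and its method names are invented for this example, and it only coordinates threads that share the same Semaphore instance, i.e. threads inside one JVM.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Hypothetical wrapper; limits how many threads run an action at once.
class GuardedResource {
    private final Semaphore permits;

    GuardedResource(int maxConcurrentThreads) {
        // true = fair ordering, so waiting threads are served first-come first-served
        this.permits = new Semaphore(maxConcurrentThreads, true);
    }

    /** Runs the action once a permit is available and always returns the permit. */
    <T> T access(Callable<T> action) throws Exception {
        permits.acquire();
        try {
            return action.call();
        } finally {
            permits.release();
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

With new GuardedResource(1) only one thread at a time gets through access(...). A larger number would allow that many concurrent readers, but a plain semaphore does not distinguish readers from writers.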

Whilst that API might remain container specific, the concept of a distributed Semaphore should be achievable, possibly with JGroups.

A cursory search on Google for 'distributed java semaphore' turned up Jukebox, which looks like it could address the above.

j pimmel
+2  A: 

The simplest solution is to create another process (a web service or whatever is simplest for you). Only this process reads and writes the file, and it listens for read/write requests from the other services.

While it might seem that this is slower than using the network share directly, that's not necessarily the case: using a network share means using a client/server pair which is built into your OS (and which does exactly that: send read/write requests to the server which offers the share).

Since your service is optimized for the task (instead of being a general "serve file" service), it might even be faster.
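
To give an idea of how small such a service can be, here is a sketch that uses the HTTP server bundled with the JDK (com.sun.net.httpserver). The class name FileOwnerService, the /data path and the GET/PUT protocol are assumptions made up for this example; a real service would need authentication, error handling and probably a smarter locking policy than "one request at a time".

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical service: the only process that ever touches the file.
class FileOwnerService {
    private final Path file;
    private final HttpServer server;

    FileOwnerService(Path file, int port) throws IOException {
        this.file = file;
        this.server = HttpServer.create(new InetSocketAddress(port), 0);
        this.server.createContext("/data", this::handle);
    }

    // synchronized: requests are handled one at a time, so a read never
    // overlaps with a write. Simplest possible policy for a sketch.
    private synchronized void handle(HttpExchange exchange) throws IOException {
        try {
            String method = exchange.getRequestMethod();
            if ("GET".equals(method)) {
                byte[] content = Files.readAllBytes(file);
                if (content.length == 0) {
                    exchange.sendResponseHeaders(200, -1); // -1 = no response body
                } else {
                    exchange.sendResponseHeaders(200, content.length);
                    exchange.getResponseBody().write(content);
                }
            } else if ("PUT".equals(method)) {
                try (InputStream in = exchange.getRequestBody()) {
                    Files.copy(in, file, StandardCopyOption.REPLACE_EXISTING);
                }
                exchange.sendResponseHeaders(204, -1);
            } else {
                exchange.sendResponseHeaders(405, -1);
            }
        } finally {
            exchange.close(); // also closes the request and response streams
        }
    }

    /** Starts the server and returns the port it is actually bound to. */
    int start() {
        server.start();
        return server.getAddress().getPort();
    }

    void stop() {
        server.stop(0);
    }
}
```

The other applications would then do an HTTP GET on /data to read and a PUT to replace the content, and none of them needs any lock of its own.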

Aaron Digulla
+1  A: 

If you only need to write the file rarely, how about writing the file under a temporary name and then using rename to make it "visible" to the readers?

This only works reliably with Unix file systems, though. On Windows, you will need to handle the case that some process has the file open (for reading). In this case, the rename will fail. Just try again until the rename succeeds.

I suggest testing this thoroughly because you might run into congestion: there may be so many read requests that the writer task can't replace the file for a long time.

If that is the case, make the readers check for the temporary file and wait a few moments with the next read until the file vanishes.
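
The write-then-rename idea could look roughly like this with java.nio.file (Java 7 or later). This is a sketch under assumptions: the class name AtomicFileReplacer, the ".tmp" suffix, the retry count and the 100 ms pause are all made up, and whether ATOMIC_MOVE replaces an existing target is implementation specific, so it needs to be checked on the actual network file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for the "write under a temporary name, then rename" approach.
class AtomicFileReplacer {

    /**
     * Writes newContent to a temporary sibling of target and then renames it
     * over target. Retries the rename a limited number of times, e.g. for the
     * Windows case where a reader still has the target open.
     */
    static void replace(Path target, byte[] newContent, int maxAttempts)
            throws IOException, InterruptedException {
        // Same directory as the target, so the rename stays on one file system.
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.write(tmp, newContent);

        int attempts = 0;
        while (true) {
            try {
                // On Unix-like systems this maps to rename(), which replaces
                // the target in one step; readers see the old or the new file.
                Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
                return;
            } catch (IOException e) {
                attempts++;
                if (attempts >= maxAttempts) {
                    throw e; // give up; the temporary file is left in place
                }
                Thread.sleep(100); // wait a moment, then try the rename again
            }
        }
    }
}
```

A writer would call AtomicFileReplacer.replace(sharedFile, bytes, 50). Readers just open the target as before. If congestion turns out to be a problem, they can check whether the ".tmp" sibling exists and hold off for a moment.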

Aaron Digulla
This is a good point. We already use temporary files for locking (kind of file-semaphores). But your approach has the benefit that reading will not be slowed down. And the problem of remaining temporary files is reduced to the small amount of time between deleting and renaming.
Eduard Wirch