Here is a simplified version of my requirement.

I have a Java class, say Processor, that contains a method, say bigProcess(). All it does is connect to a file server and download a specified file; once that is done, it saves the file in the DB and then updates some DB fields in different tables.

Each subtask (download file, save in DB, update fields in t1, etc.) uses a different method.

The Processor class is invoked every 2 hours and has to process around 30 to 40 requests per invocation. To improve performance, I am planning to spawn a new thread for each request (30 to 40 threads here), with each thread calling the bigProcess() method.

Now my question is: do I need to synchronize any of the code blocks in the bigProcess() method? I am mainly worried about the update methods. Some of them lock a row (e.g. select f1, f2, f3 from t1 for update), set the values of fields f1, f2 and f3, and then issue a commit.

NOTE : The method bigProcess() does not use any instance variables of class Processor.

+3  A: 

Make BigProcess a Callable. When you submit it to an ExecutorService you get back a Future (a plain Executor only offers execute(), which returns nothing). Any thread that calls future.get() will block until the Callable completes; if the Callable has already completed, get() returns the result immediately.
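To make the Callable/Future idea concrete, here is a minimal sketch rather than a definitive implementation. The names BigProcessDemo, BigProcessTask and processAll are hypothetical, the task body is a placeholder for the real download/save/update steps, and the pool size of 4 is an arbitrary choice for the demo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BigProcessDemo {

    // Hypothetical stand-in for bigProcess(): one task object per request.
    static final class BigProcessTask implements Callable<String> {
        private final int requestId;

        BigProcessTask(int requestId) {
            this.requestId = requestId;
        }

        @Override
        public String call() {
            // The real download / save / update steps would go here.
            return "request-" + requestId + " done";
        }
    }

    // Submits one task per request and collects the results in submission order.
    static List<String> processAll(int requestCount) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // size is an assumption
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int id = 1; id <= requestCount; id++) {
                futures.add(pool.submit(new BigProcessTask(id)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                try {
                    results.add(future.get()); // blocks until this task has finished
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException("task failed", e.getCause());
                }
            }
            return results;
        } finally {
            pool.shutdown(); // no new tasks; worker threads exit once idle
        }
    }

    public static void main(String[] args) {
        System.out.println(processAll(3));
    }
}
```

Because the futures are read in the order they were submitted, the result list follows request order even though the tasks may finish in any order.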

An alternative way to do this (that I quite like) is to create a thread pool and submit all the work to it. Once all the work is submitted, shut the pool down and await termination. It looks something like this:

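import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

ExecutorService threadPool = Executors.newFixedThreadPool(40);
// submit work (one task per request)
threadPool.shutdown(); // stop accepting new tasks; already-submitted tasks still run
try {
  // wait, effectively without a time limit, for all submitted tasks to finish
  threadPool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
} catch (InterruptedException e) {
  Thread.currentThread().interrupt(); // restore the interrupt flag for the caller
}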

If you have dependent work (like task B cannot be done until task A completes) then create task B with a Future from task A and so on.
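One way the dependent-work idea might look is sketched below, under stated assumptions. DependentTasksDemo, SaveTask and the file name report.csv are all hypothetical; task A stands in for the download step and task B for the save step.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class DependentTasksDemo {

    // Task B: constructed with task A's Future and waits on it before doing its own work.
    static final class SaveTask implements Callable<String> {
        private final Future<String> download;

        SaveTask(Future<String> download) {
            this.download = download;
        }

        @Override
        public String call() throws Exception {
            String fileName = download.get(); // blocks until task A has completed
            return "saved " + fileName;
        }
    }

    static String run() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Task A: placeholder for the download step.
            Future<String> download = pool.submit(new Callable<String>() {
                @Override
                public String call() {
                    return "report.csv";
                }
            });
            Future<String> save = pool.submit(new SaveTask(download));
            return save.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "saved report.csv"
    }
}
```

One caveat with this pattern: task A should be submitted before task B, and the pool needs enough threads that waiting B tasks cannot occupy every worker while their A tasks are still queued, otherwise the pool can deadlock.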

I like this approach because everything is transient. For a single load from the database, all the processes will be created, run and thrown away. When you start creating a persistent thread pool you introduce another potential problem, and it's harder to work out what's going on.

cletus
Thanks for the comments. Your solution is suitable for JDK 1.5 and above, but I am using JDK 1.4.2. Can you please provide an alternative?
Eager Learner
I haven't tried this one out myself, but http://commons.apache.org/sandbox/threadpool/ might help you with a solution similar to the one above.
Buhb
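For JDK 1.4.2, where java.util.concurrent is not available, one possible fallback is plain Thread objects plus join(). The following is only a sketch: PlainThreadsDemo and processAll are hypothetical names, bigProcess here is a placeholder for the real method, and the code deliberately avoids generics, annotations and the enhanced for loop so that it stays within 1.4 syntax.

```java
final class PlainThreadsDemo {

    // Hypothetical stand-in for Processor.bigProcess(); uses only its argument and locals.
    static String bigProcess(int requestId) {
        return "request-" + requestId + " done";
    }

    // Starts one thread per request and waits for all of them with join().
    static String[] processAll(int requestCount) {
        final String[] results = new String[requestCount];
        Thread[] workers = new Thread[requestCount];
        for (int i = 0; i < requestCount; i++) {
            final int index = i;
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    // Each thread writes only its own slot, so no locking is needed here.
                    results[index] = bigProcess(index + 1);
                }
            });
            workers[i].start();
        }
        for (int i = 0; i < requestCount; i++) {
            try {
                workers[i].join(); // the worker's writes are visible once join() returns
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for worker " + i);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        String[] results = processAll(3);
        for (int i = 0; i < results.length; i++) {
            System.out.println(results[i]);
        }
    }
}
```

This starts one thread per request, which matches the 30 to 40 threads described in the question; it does not cap concurrency the way a fixed-size pool would.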
+1  A: 

Whether you need to synchronize your methods depends on what these methods actually do. Generally you need to synchronize if there are resources that are used from multiple threads, such as a single file or a single table in a database (that you are actually writing to and reading from). If none of the processes you’re running interfere with one another, there’s no need for synchronization.

Bombe
Yes, the threads do use common tables, say t1 and t2, but no two threads operate on the same rows of any table. Do we need to synchronize in that case?
Eager Learner
Again, that depends. If one of the threads during processing can leave the database in an inconsistent state that makes another thread not work correctly, you need to synchronize. If the order in which the threads update the database does not matter then synchronization might not be necessary.
Bombe
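To illustrate the distinction on the Java side, here is a small sketch in which local variables need no locking while a counter shared between threads does. SharedStateDemo and its methods are hypothetical and do not touch a database. Row locks taken with select ... for update are enforced by the database itself, assuming each thread works on its own connection and transaction, so they are a separate matter from Java's synchronized.

```java
final class SharedStateDemo {

    private int processed = 0; // shared between threads, so every access is guarded

    synchronized void recordProcessed() {
        processed++;
    }

    synchronized int getProcessed() {
        return processed;
    }

    // Runs threadCount workers, each recording perThread items on one shared object.
    static int runWorkers(int threadCount, final int perThread) {
        final SharedStateDemo shared = new SharedStateDemo();
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < perThread; j++) {
                        int local = j * 2;        // local variable: private to this thread
                        if (local >= 0) {
                            shared.recordProcessed(); // shared state: needs synchronization
                        }
                    }
                }
            });
            workers[i].start();
        }
        for (int i = 0; i < threadCount; i++) {
            try {
                workers[i].join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for worker " + i);
            }
        }
        return shared.getProcessed();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(8, 1000)); // prints 8000
    }
}
```

Without the synchronized keyword on recordProcessed(), concurrent increments could be lost and the total could come out below 8000. A bigProcess() that touches only its parameters and local variables has no such shared field to protect.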