views:

28

answers:

1

I have old code in Java which deadlocks... I never used netbeans as a development tool... however, I need to fix the code.

I ran the application in debug mode, clicked on check for deadlock and netBeans brought a screen. Two out of four threads were in red... see the screen dump below.

I'm new to multithreading, and on the top of that code is not mine...

What's most likely causing the problem?

alt text

+3  A: 

As far as I can tell the problem is very likely related to the way in which (or more specifically the order in which) the multiple threads acquire and release locks. In the above example the two threads need access to two locks (or monitors):

  • nano.toolbox.strategies.ESMarketMaker
  • nano.toolbox.strategies.ExecutionManager

From the stack trace on the two threads currently in a deadlock, we can see that thread 'ExecutionManager' has aquired the ExecutionManager monitor but is awaiting acquisition (while still holding the 'ExecutionManager' monitor) of the 'ESMarketMaker' monitor.

The 'StrategyManager' thread on the other hand, has acquired the 'ESMarketMaker' monitor but is awaiting acqusition (while still holding the 'ESMarketMaker' monitor) of the 'ExecutionManager' monitor.

This is a class example of deadlocks and the many ways in which order of acquisition of locks can cause deadlocks.

There are many ways to address these kind of problems:

  • If possible, all threads needing some set of locks to operate, must acquire the shared locks in the same order (the inversed order is the problem in the above deadlock). But this is not always possible, as multiple threads may have only semi-overlapping lock usage in different conditions, why it may be hard or impossible to design a protocol of acquisition that will ensure uniform ordering.
  • You may also use tryLock() instead, which is a non-blocking acquisition, it returns a flag to indicate success or failure and gives you the option to do something else before re-trying. One thing I would recommend in this case, is that if acquisition fails, it is to drop all currently owned locks and try from scratch again (thus giving way for any who is blocked on any or all locks the current thread holds, to complete their work, maybe freeing the locks this thread needs when it retries).

One thing to note though, is that sometimes when deciding on the protocol to use, you need more explicit control over your locks, rather than normal synchronization in Java. In these cases, the usage of explicit ReentrantLock instances can be a benefit, as these allows you to do stuff like inspecting whether a lock is unlocked or currently locked, and do non-blocking try-locks as described above.

I hope this helps, I'm sorry I can't be more specific, but I would need to see the source code for that. :-)

(Oh an p.s., a third thing one might opt for, if deadlock is something that must be avoided by all cost, is to look into modeling tools, to model a state machine over the states of the program and locks, which can be used together with analysis tools which can check for possible deadlocks in such a model and give you examples if any such is found).

micdah
+1... but I disagree with the portion on releasing locks and reacquiring them. If two agents do this same thing, then they will likely encounter a live lock. Essentially, both agents constantly acquire the first lock, fail to acquire the second lock, release the first lock and repeat the process continuously.
Tim Bender
That is of course true, such a condition is theoretically possible, hence the solution should be used with caution as it is not bullet proof. That said though, such a scheduling (in real life), I would suspect, is highly unlikely not to say non-existent as it requires a consistent scheduling of the "wrong" interleaving indefinitely. One might call the "solution" optimistic in that it relies on a pseudo random behavior of the scheduler to ensure (given indefinite time) that a "right" interleaving is tried at some point.
micdah
Maybe a better solution, assuming one knows beforehand exactly the locks that is needed to perform an operation, is to try to acquire these in a pseudo random ordering on each retry until all are acquired, this randomness combined with non-blocking acquisitions would surely (given indefinite time) ensure that at some point the thread can acquire all required locks.
micdah