I'm writing a unit test with Boost's unit test framework (Boost.Test), and I would like to include basic tests for deadlocks in the code I'm testing. My first idea was to set a deadline timer in one thread while running the test in another thread that is expected to finish well before the deadline. When the timer goes off, the test would assert that the worker thread is no longer running (or can no longer be interrupted). In what ways could I detect deadlocks more precisely?

+2  A: 

The first question is whether you are testing for actual deadlocks (i.e. to see if a deadlock HAS happened) or potential deadlocks (i.e. to see if a deadlock COULD happen).

If you only care about detecting actual deadlocks, then something like what you describe can work. I'm not sure it will be all that useful, though: no matter how many times the test passes, a deadlock can still occur on a later run if the inter-thread timing ends up exactly wrong. This is where multithreaded programming differs from single-threaded programming: successfully running a multithreaded program once (or even a million times) does not prove it is correct.
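For what it's worth, the timeout idea from the question could be sketched roughly like this in standard C++11. The helper name `finishes_within` is made up for illustration, and the sketch assumes it is acceptable to abandon (detach) a stuck thread, since standard C++ offers no safe way to kill it.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>
#include <utility>

// Hypothetical helper: runs `body` on its own thread and reports whether it
// finished within `timeout`. A false result only means "not finished in
// time"; it cannot distinguish a real deadlock from slow code.
template <typename Fn>
bool finishes_within(Fn body, std::chrono::milliseconds timeout) {
    assert(timeout.count() > 0 && "timeout must be positive");

    std::packaged_task<void()> task(std::move(body));
    std::future<void> done = task.get_future();
    std::thread worker(std::move(task));

    if (done.wait_for(timeout) == std::future_status::ready) {
        worker.join();
        done.get();  // rethrows any exception thrown by body
        return true;
    }

    // The worker may be stuck forever; detach it so this thread can go on
    // and report the failure instead of blocking in join().
    worker.detach();
    return false;
}
```

In a Boost.Test case this might be used as `BOOST_CHECK(finishes_within(code_under_test, std::chrono::milliseconds(2000)));`, where `code_under_test` stands for whatever callable you are testing. Pick the timeout generously, or a loaded build machine will produce false alarms.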

The only way to guarantee that your code won't deadlock is to verify that whenever threads hold more than one lock at a time, they all acquire the locks in the same order. The easiest way to ensure that is to make sure no thread ever holds more than one lock at a time, but that's not always possible. Failing that, another approach is to read through the code until you've proven to your own satisfaction that a single locking order is followed in all cases. That's not always practical either, especially if the code is complicated. (By the way, it's always good to make multithreaded code as brutally simple as you possibly can, for precisely this reason.)
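One way to make a single locking order hard to get wrong is to funnel every two-lock acquisition through one helper. The following is only a sketch: `OrderedLockPair`, `Account`, `transfer` and `run_opposing_transfers` are hypothetical names, and ordering by mutex address is just one possible convention.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical helper: always locks two distinct mutexes in address order,
// no matter in which order the caller names them.
class OrderedLockPair {
public:
    OrderedLockPair(std::mutex& a, std::mutex& b)
        : first_(before(a, b) ? a : b), second_(before(a, b) ? b : a) {}

private:
    static bool before(std::mutex& a, std::mutex& b) {
        assert(&a != &b && "locking the same mutex twice would self-deadlock");
        return std::less<std::mutex*>()(&a, &b);
    }
    // Members are constructed (locked) in declaration order and destroyed
    // (unlocked) in reverse order.
    std::lock_guard<std::mutex> first_;
    std::lock_guard<std::mutex> second_;
};

struct Account {
    std::mutex m;
    int balance = 100;
};

void transfer(Account& from, Account& to, int amount) {
    OrderedLockPair guard(from.m, to.m);
    from.balance -= amount;
    to.balance += amount;
}

// Two threads name the accounts in opposite order; the helper still takes
// the locks in one global order. Returns the combined balance afterwards.
int run_opposing_transfers(int rounds) {
    Account a, b;
    std::thread t1([&] { for (int i = 0; i < rounds; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < rounds; ++i) transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.balance + b.balance;
}
```

If your compiler supports it, the standard library already covers this case: `std::lock` (C++11) and `std::scoped_lock` (C++17) acquire several mutexes using a deadlock-avoidance algorithm, so a hand-written ordering helper is only needed when those are unavailable.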

If eyeballing the code isn't sufficient, the last thing you can do is instrument your lock acquisitions. The easiest way to do this (if your code can run under Linux) is to run it under Helgrind, Valgrind's thread-error detector, which reports inconsistent lock ordering among other things. If you can't do that, an alternative is to wrap your lock/unlock calls in a function that logs which thread locked or unlocked which mutex, and later parse the log to detect lock-ordering inconsistencies "post mortem".
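The "post mortem" check might look something like the sketch below. The `LockEvent` record and `find_order_inversions` function are invented for illustration, and the sketch assumes your logging wrapper has already produced an in-memory list of events in the order they happened. It only reports inversions between pairs of mutexes; longer cycles (A before B, B before C, C before A) would need a proper graph cycle search.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical log record written by the lock/unlock wrapper.
struct LockEvent {
    int thread_id;
    std::string mutex_name;
    bool is_lock;  // true = lock, false = unlock
};

using MutexPair = std::pair<std::string, std::string>;

// Replays the log and returns each pair (A, B), with A < B, for which the
// log contains both "B acquired while holding A" and "A acquired while
// holding B".
std::vector<MutexPair> find_order_inversions(const std::vector<LockEvent>& log) {
    std::map<int, std::vector<std::string>> held;  // mutexes held per thread
    std::set<MutexPair> seen;                      // (held, then acquired)

    for (const LockEvent& e : log) {
        std::vector<std::string>& mine = held[e.thread_id];
        if (e.is_lock) {
            for (const std::string& h : mine) seen.emplace(h, e.mutex_name);
            mine.push_back(e.mutex_name);
        } else {
            // Drop the most recent acquisition of this mutex.
            auto it = std::find(mine.rbegin(), mine.rend(), e.mutex_name);
            assert(it != mine.rend() && "unlock without a matching lock");
            mine.erase(std::next(it).base());
        }
    }

    std::vector<MutexPair> inversions;
    for (const MutexPair& edge : seen) {
        if (edge.first < edge.second &&
            seen.count(std::make_pair(edge.second, edge.first)) > 0) {
            inversions.push_back(edge);
        }
    }
    return inversions;
}
```

The nice property of this approach is that it flags a potential deadlock even when the two threads never actually collided during the run: it is enough that each ordering was observed once.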

Jeremy Friesner