Hi, I was wondering if anyone knows of a nice survey of debugging techniques for multithreaded applications. Ideally, I'm looking for a case-based analysis: deadlocks, starvation, corrupted shared state, ...
Either .NET-specific or generic is fine.
Not exactly what you are asking for, but you may find CHESS interesting.
You could also take a look at Intel's Thread Checker or Thread Profiler and Sun's Studio Thread Analyzer, though they are not free. Also check out this article from Intel.
I've used Helgrind, a tool in the Valgrind suite. Helgrind is a thread error detector, and I've used it once or twice to find race conditions in my own code. The manual describes what it can detect:
http://valgrind.org/docs/manual/hg-manual.html
Note that it is a Linux tool for native programs (C/C++); it does not cover Java or .NET.
I'm not aware of an article or book that addresses what you're looking for, so here's my "lessons learned" from 12 years of multithreaded debugging on Windows (both unmanaged and managed).
As I stated in my comment, most of my "multithreaded debugging" is actually done via a manual code review, looking for these issues.
Deadlocks and Corrupted Shared State
Document lock hierarchies (both the order and what shared state they protect), and ensure they're consistent. This solves most deadlock problems and corrupted shared state problems.
(Note: the link above for "lock hierarchies" refers to a Dr. Dobb's article by Herb Sutter; he has written a whole series of Effective Concurrency articles that I highly recommend.)
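As a rough illustration of a documented lock order, here is a minimal sketch. The Account class, its Id-based ordering rule, and the assumption that Ids are unique are all hypothetical and are not taken from the article mentioned above.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical example type. Documented lock hierarchy:
//   1. Each Account's _gate protects that Account's Balance.
//   2. When two gates are needed, take the one with the smaller Id first.
public sealed class Account
{
    private readonly object _gate = new object(); // protects Balance

    public int Id { get; }
    public decimal Balance { get; private set; }

    public Account(int id, decimal balance)
    {
        Id = id;
        Balance = balance;
    }

    public static void Transfer(Account from, Account to, decimal amount)
    {
        if (ReferenceEquals(from, to)) return; // nothing to do, and avoids a pointless double lock

        // Assumes Ids are unique, so the order is the same regardless of transfer direction.
        Account first = from.Id < to.Id ? from : to;
        Account second = from.Id < to.Id ? to : from;

        lock (first._gate)
        lock (second._gate)
        {
            from.Balance -= amount;
            to.Balance += amount;
        }
    }
}

public static class LockOrderDemo
{
    // Transfers in both directions concurrently; returns the combined balance afterwards.
    public static decimal Run()
    {
        var a = new Account(1, 1000m);
        var b = new Account(2, 1000m);

        Task t1 = Task.Run(() => { for (int i = 0; i < 10000; i++) Account.Transfer(a, b, 1m); });
        Task t2 = Task.Run(() => { for (int i = 0; i < 10000; i++) Account.Transfer(b, a, 1m); });
        Task.WaitAll(t1, t2);

        return a.Balance + b.Balance;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

If Transfer instead locked "from" and then "to", the two tasks could each hold one gate and wait forever for the other. With the fixed order, that cycle cannot form.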
More on Deadlocks
Use RAII for all synchronization. This ensures that locks are released in the face of exceptions. Prefer the "lock" statement to try/finally.
(Note that RAII in .NET depends on IDisposable, not Finalize, and assumes that the client code will correctly use a using block.)
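A minimal sketch of both forms follows. WriteScope is a hypothetical helper written for this example (it is not a framework type); it wraps ReaderWriterLockSlim, which the lock statement cannot be used with.

```csharp
using System;
using System.Threading;

// Hypothetical RAII-style helper: takes the write lock on construction,
// releases it on Dispose, so a using block guarantees the release.
public sealed class WriteScope : IDisposable
{
    private readonly ReaderWriterLockSlim _rw;

    public WriteScope(ReaderWriterLockSlim rw)
    {
        _rw = rw;
        _rw.EnterWriteLock();
    }

    public void Dispose()
    {
        _rw.ExitWriteLock();
    }
}

public static class RaiiDemo
{
    private static readonly ReaderWriterLockSlim Rw = new ReaderWriterLockSlim();
    private static readonly object Gate = new object();

    // Returns true if the write lock was released even though the body threw.
    public static bool UsingReleasesOnException()
    {
        try
        {
            using (new WriteScope(Rw))
            {
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed only so the lock state can be inspected below.
        }
        return !Rw.IsWriteLockHeld;
    }

    // Same check for the built-in lock statement.
    public static bool LockReleasesOnException()
    {
        try
        {
            lock (Gate)
            {
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
        }
        return !Monitor.IsEntered(Gate);
    }

    public static void Main()
    {
        Console.WriteLine(UsingReleasesOnException());
        Console.WriteLine(LockReleasesOnException());
    }
}
```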
Starvation
Remove any modifications of thread priorities. Correct prioritization is actually a bit counter-intuitive: it is best to give the thread with the most work to do a lower priority, and give higher priorities to threads that are I/O bound (including the UI thread). Since Windows does this automatically (see Windows Internals), there's really no reason for the code to get involved at all.
In General
Remove all lock-free code that was written in-house. It almost certainly contains subtle bugs. Replace it with .NET 4 lock-free collections and synchronization objects, or change the code to be lock-based.
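For example, a hand-rolled lock-free queue can usually be swapped for ConcurrentQueue from System.Collections.Concurrent. The sketch below is only an illustration; the producer counts and method names are made up for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ConcurrentQueueDemo
{
    // Several producers enqueue concurrently with no explicit locking in user code.
    // Returns the number of items that ended up in the queue.
    public static int Produce(int producers, int itemsPerProducer)
    {
        var queue = new ConcurrentQueue<int>();

        Parallel.For(0, producers, p =>
        {
            for (int i = 0; i < itemsPerProducer; i++)
            {
                queue.Enqueue(p * itemsPerProducer + i);
            }
        });

        return queue.Count;
    }

    public static void Main()
    {
        Console.WriteLine(Produce(4, 1000));
    }
}
```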
Use higher-level concepts for synchronization. The Task Parallel Library and unified cancellation in .NET 4 remove pretty much any need for direct usage of ManualResetEvent, Monitor, Semaphore, etc.
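As a sketch of what that looks like in practice, here is a worker that is stopped through a CancellationToken rather than a hand-managed ManualResetEvent. The Worker method and the timings are illustrative assumptions; Task.Run requires .NET 4.5 or later (on .NET 4, Task.Factory.StartNew would be the equivalent).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationDemo
{
    // Runs on: a ThreadPool thread. Loops until cancellation is requested,
    // then returns normally with the number of iterations completed.
    public static int Worker(CancellationToken token)
    {
        int iterations = 0;
        while (!token.IsCancellationRequested)
        {
            iterations++;
            Thread.Sleep(1); // stand-in for a unit of real work
        }
        return iterations;
    }

    // Starts the worker, requests cancellation, and reports whether it stopped cleanly.
    public static bool RunAndCancel()
    {
        using (var cts = new CancellationTokenSource())
        {
            CancellationToken token = cts.Token;
            Task<int> task = Task.Run(() => Worker(token));

            Thread.Sleep(50); // let the worker run briefly
            cts.Cancel();     // replaces "set the stop event"
            task.Wait();      // replaces "wait on the finished event / Join"

            return task.Status == TaskStatus.RanToCompletion;
        }
    }

    public static void Main()
    {
        Console.WriteLine(RunAndCancel());
    }
}
```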
Use higher-level concepts for parallelization. The TPL and PLINQ in .NET 4 have built-in self-balancing algorithms, complete with intelligent partitioning and work-stealing queues, to provide good parallelization automatically. In the rare cases where the automatic parallelization is sub-optimal, both TPL and PLINQ expose a large number of tweakable knobs (custom partitioning schemes, long-running operation flags, etc).
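A small sketch of both APIs, computing the same sum of squares. The workload and the degree-of-parallelism value are arbitrary choices for illustration; WithDegreeOfParallelism is shown only as one example of the available knobs.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelSumDemo
{
    // PLINQ: partitioning and merging are handled by the library.
    public static long SumOfSquaresPlinq(int n)
    {
        return Enumerable.Range(1, n)
            .AsParallel()
            .WithDegreeOfParallelism(2) // optional knob, arbitrary value
            .Select(i => (long)i * i)
            .Sum();
    }

    // TPL: Parallel.For with a per-thread partial sum, combined once per worker.
    public static long SumOfSquaresParallelFor(int n)
    {
        long total = 0;
        Parallel.For<long>(1, n + 1,
            () => 0L,                                     // initial partial sum for each worker
            (i, state, local) => local + (long)i * i,     // accumulate without sharing
            local => Interlocked.Add(ref total, local));  // merge each partial sum atomically
        return total;
    }

    // Plain sequential version for comparison.
    public static long SumOfSquaresSequential(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++) total += (long)i * i;
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquaresPlinq(1000));
        Console.WriteLine(SumOfSquaresParallelFor(1000));
        Console.WriteLine(SumOfSquaresSequential(1000));
    }
}
```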
There is one more technique I've found useful for any class that has its methods called by different threads: document which methods run on which threads. Usually, this is added as a comment to the top of the method. Ensure each method only runs in a known thread context (e.g., "on a UI thread" or "on a ThreadPool thread" or "on the dedicated background thread"). None of the methods should say "on any thread" unless you're writing a synchronization class (and if you're writing a synchronization class, ask yourself if you really should be doing that).
Lastly, name your threads. This makes them easy to distinguish in the VS debugger. .NET supports this via the Thread.Name property.
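A minimal sketch of a named thread, with the thread-context comment style described above. The thread name and method names are made up for the example.

```csharp
using System;
using System.Threading;

public static class NamedThreadDemo
{
    // Runs on: the caller's thread. Starts a dedicated named thread,
    // waits for it, and returns the name that thread observed for itself.
    public static string RunNamedWorker(string name)
    {
        string observed = null;

        var worker = new Thread(() =>
        {
            // Runs on: the dedicated background thread created below.
            observed = Thread.CurrentThread.Name;
        })
        {
            Name = name,        // shows up in the debugger's Threads window
            IsBackground = true
        };

        worker.Start();
        worker.Join(); // after Join, the write to 'observed' is visible here

        return observed;
    }

    public static void Main()
    {
        Console.WriteLine(RunNamedWorker("Import Worker"));
    }
}
```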