One of the hardest adjustments I have had to make came with my first intense experience programming with pthreads in C. I was used to knowing exactly which line of code would run next, and most of my debugging techniques centered on that expectation.

What are some good techniques for debugging with pthreads in C? You can suggest personal methodologies that need no extra tools, tools you use, or anything else that helps you debug.

P.S. I do my C programming with gcc on Linux, but don't let that restrict your answer.

+2  A: 

Debugging a multithreaded application is difficult. A good debugger, such as GDB (with the optional DDD front end) on *nix or the one that comes with Visual Studio on Windows, will help tremendously.

luke
DDD is just a graphical frontend for gdb (and other debuggers); it's not actually a debugger itself
Adam Rosenfield
Right you are. Corrected.
luke
A: 

I tend to use lots of breakpoints. If you don't actually care about the thread function itself but do care about its side effects, a good time to check them is right before it exits, loops back to its waiting state, or does whatever else it does next.

kbyrd
+3  A: 

One of the things that will surprise you about debugging threaded programs is that the bug often changes, or even goes away, when you add printf calls or run the program in the debugger (colloquially known as a Heisenbug).

In a threaded program, a Heisenbug usually means you have a race condition. A good programmer will look for shared variables or resources that are order-dependent. A crappy programmer will try to blindly fix it with sleep() statements.

T.E.D.
+8  A: 

Valgrind is an excellent tool for finding race conditions and misuses of the pthreads API. It keeps a model of the program's memory accesses (and perhaps of accesses to shared resources) and will detect missing locks even when the bug is currently benign (which of course means that it will, completely unexpectedly, become less benign at some later point).

To use it, invoke valgrind --tool=helgrind (see the Helgrind manual). There is also valgrind --tool=drd (see the DRD manual). Helgrind and DRD use different models, so they detect overlapping but possibly different sets of bugs. False positives may also occur.

Anyway, Valgrind has saved me countless hours of debugging (though not all of them :).

Laurynas Biveinis
+1 for Valgrind
Andrew Coleson
Yes, and use a suppressions file (--suppressions=<file>) to ignore false positives. This is particularly necessary if you are using the C++ STL.
rleir
A: 

My approach to multi-threaded debugging is similar to single-threaded, but more time is usually spent in the thinking phase:

  1. Develop a theory as to what could be causing the problem.

  2. Determine what kind of results could be expected if the theory is true.

  3. If necessary, add code that can disprove or verify your results and theory.

  4. If your theory is true, fix the problem.

Often, the 'experiment' that proves the theory is the addition of a critical section or mutex around suspect code. I will then try to narrow down the problem by systematically shrinking the critical section. Critical sections are not always the best fix (though they can often be the quick fix). However, they're useful for pinpointing the 'smoking gun'.

Like I said, the same steps apply to single-threaded debugging, though there it is far too easy to just jump into a debugger and have at it. Multi-threaded debugging requires a much stronger understanding of the code, as I usually find that running multi-threaded code through a debugger doesn't yield anything useful.

Also, Helgrind is a great tool. Intel's Thread Checker performs a similar function on Windows, but costs a lot more than Helgrind.

Marc Bernier
A: 

In the 'thinking' phase, before you start coding, model the design as a state machine. It can make the design much clearer.

printf calls can help you understand the dynamics of your program, but they clutter up the source code, so wrap them in a DEBUG_OUT() macro whose definition checks a boolean flag. Better still, set and clear that flag from a handler for a signal that you send with 'kill -USR1'. Send the output to a log file, with a timestamp on each line.

Also consider using assert(), and then analyze your core dumps using GDB and DDD.

rleir