views:

3437

answers:

17

Hot-on-the-heels of of my previous unit testing related question, here's another toughie:

I have thus far avoided the nightmare that is testing multi-threaded code since it just seems like too much of a minefield. I'd like to ask how people have gone about testing code that relies on threads for successful execution, or just how people have gone about testing those kinds of issues that only show up when two threads interact in a given manner?

This seems like a really key problem for programmers today, it would be useful to pool our knowledge on this one imho.

A: 

I have had the unfortunate task of testing threaded code and it is definately the hardest tests I have ever written.

When writing my tests, I used a combination of delegates and events. Basically it is all about using PropertyNotifyChanged events with a WaitCallback or some kind of ConditionalWaiter that polls.

I am not sure if this was the best approach, but it has worked out for me.

Dale Ragan
+16  A: 

Tough one indeed! In my (C++) unit tests, I've broken this down into several categories along the lines of the concurrency pattern used:

  1. Unit tests for classes that operate in a single thread and aren't thread aware -- easy, test as usual.

  2. Unit tests for Monitor objects (those that execute synchronized methods in the callers' thread of control) that expose a synchronized public API -- instantiate multiple mock threads that exercise the API. Construct scenarios that exercise internal conditions of the passive object. Include one longer running test that basically beats the heck out of it from multiple threads for a long period of time. This is unscientific I know but it does build confidence.

  3. Unit tests for Active objects (those that encapsulate their own thread or threads of control) -- similar to #2 above with variations depending on the class design. Public API may be blocking or non-blocking, callers may obtain futures, data may arrive at queues or need to be dequeued. There are many combinations possible here; white box away. Still requires multiple mock threads to make calls to the object under test.

As an aside:

In internal developer training that I do, I teach the Pillars of Concurrency and these two patterns as the primary framework for thinking about and decomposing concurrency problems. There's obviously more advanced concepts out there but I've found that this set of basics helps keep engineers out of the soup. It also leads to code that is more unit testable, as described above.

David Joyner
+39  A: 

Look, there's no easy way to do this. I'm working on a project that is inherently multithreaded. Events come in from the operating system and I have to process them concurrently.

The simplest way to deal with testing complex, multithhreaded application code is this: If its too complex to test, you're doing it wrong. If you have a single instance that has multiple threads acting upon it, and you can't test situations where these threads step all over each other, then your design needs to be redone. Its both as simple and as complex as this.

There are many ways to program for multithreading that avoids threads running through instances at the same time. The simplest is to make all your objects immutable. Of course, that's not usually possible. So you have to identify those places in your design where threads interract with the same instance and reduce the number of those places. By doing this, you isolate a few classes where multithreading actually occurs, reducing the overall complexity of testing your system.

But you have to realize that even by doing this you still can't test every situation where two threads step on each other. To do that, you'd have to run two threads concurrently in the same test, then control exactly what lines they are executing at any given moment. The best you can do is simulate this situation. But this might require you to code specifically for testing, and that's at best a half step towards a true solution.

Probably the best way to test code for threading issues is through static analysis of the code. If your threaded code doesn't follow a finite set of thread safe patterns, then you might have a problem. I believe Code Analysis in VS does contain some knowledge of threading, but probably not much.

Look, as things stand currently (and probably will stand for a good time to come), the best way to test multithreaded apps is to reduce the complexity of threaded code as much as possible. Minimize areas where threads interact, test as best as possible, and use code analysis to identify danger areas.

Will
+3  A: 

Pete Goodliffe has a series on the unit testing of threaded code.

It's hard. I take the easier way out and try to keep the threading code abstracted from the actual test. Pete does mention that the way I do it is wrong but I've either got the separation right or I've just been lucky.

graham.reeds
I read the two articles published so far, and I didn't find them very helpful. He just talks about the difficulties without giving much concrete advice. Maybe future articles will improve.
Don Kirkby
Great link! Nice avatar too :P
Andrei Rinea
+2  A: 

I also had serious problems testing multi- threaded code. Then I found a really cool solution in "xUnit Test Patterns" by Gerard Meszaros. The pattern he describes is called Humble object.

Basically it describes how you can extract the logic into a separate, easy-to-test component that is decoupled from its environment. After you tested this logic, you can test the complicated behaviour (multi- threading, asynchronous execution, etc...)

ollifant
link: http://xunitpatterns.com/Humble%20Object.html
Frank Schwieterman
A: 

I just came by this post in my RSS aggregator: http://blog.carbonfive.com/2008/05/testing/multithreaded-testing

svrist
+3  A: 

I like to write two or more test methods to execute on parallel threads, and each of them make calls into the object under test. I've been using Sleep() calls to coordinate the order of the calls from the different threads, but that's not really reliable. It's also a lot slower because you have to sleep long enough that the timing usually works.

I found the Multithreaded TC Java library from the same group that wrote FindBugs. It lets you specify the order of events without using Sleep(), and it's reliable. I haven't tried it yet.

The biggest limitation to this approach is that it only lets you test the scenarios you suspect will cause trouble. As others have said, you really need to isolate your multithreaded code into a small number of simple classes to have any hope of thoroughly testing them.

Once you've carefully tested the scenarios you expect to cause trouble, an unscientific test that throws a bunch of simultaneous requests at the class for a while is a good way to look for unexpected trouble.

Update: I've played a bit with the Multithreaded TC Java library, and it works well. I've also ported some of its features to a .NET version I call TickingTest.

Don Kirkby
+2  A: 

For Java, check out chapter 12 of JCIP. There are some concrete examples of writing deterministic, multi-threaded unit tests to at least test the correctness and invariants of concurrent code.

"Proving" thread-safety with unit tests is much dicier. My belief is that this is better served by automated integration testing on a variety of platforms/configurations.

Scott Bale
+5  A: 

I've done a lot of this, and yes it sucks.

Some tips:

  • GroboUtils for running multiple test threads
  • alphaWorks ConTest to instrument classes to cause interleavings to vary between iterations
  • Create a throwable field and check it in tearDown (see Listing 1). If you catch a bad exception in another thread, just assign it to throwable.
  • I created the utils class in Listing 2 and have found it invaluable, especially waitForVerify and waitForCondition, which will greatly increase the performance of your tests.
  • Make good use of AtomicBoolean in your tests. It is thread safe, and you'll often need a final reference type to store values from callback classes and suchlike. See example in Listing 3.
  • Make sure to always give your test a timeout (e.g., @Test(timeout=60*1000)), as concurrency tests can sometimes hang forever when they're broken

Listing 1:

@After
public void tearDown() {
    if ( throwable != null )
        throw throwable;
}

Listing 2:

import static org.junit.Assert.fail;
import java.io.File;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Random;
import org.apache.commons.collections.Closure;
import org.apache.commons.collections.Predicate;
import org.apache.commons.lang.time.StopWatch;
import org.easymock.EasyMock;
import org.easymock.classextension.internal.ClassExtensionHelper;
import static org.easymock.classextension.EasyMock.*;

import ca.digitalrapids.io.DRFileUtils;

/**
 * Various utilities for testing
 */
public abstract class DRTestUtils
{
    static private Random random = new Random();

/** Calls {@link #waitForCondition(Integer, Integer, Predicate, String)} with
 * default max wait and check period values.
 */
static public void waitForCondition(Predicate predicate, String errorMessage) 
    throws Throwable
{
    waitForCondition(null, null, predicate, errorMessage);
}

/** Blocks until a condition is true, throwing an {@link AssertionError} if
 * it does not become true during a given max time.
 * @param maxWait_ms max time to wait for true condition. Optional; defaults
 * to 30 * 1000 ms (30 seconds).
 * @param checkPeriod_ms period at which to try the condition. Optional; defaults
 * to 100 ms.
 * @param predicate the condition
 * @param errorMessage message use in the {@link AssertionError}
 * @throws Throwable on {@link AssertionError} or any other exception/error
 */
static public void waitForCondition(Integer maxWait_ms, Integer checkPeriod_ms, 
 Predicate predicate, String errorMessage) throws Throwable 
{
 waitForCondition(maxWait_ms, checkPeriod_ms, predicate, new Closure() {
  public void execute(Object errorMessage)
  {
   fail((String)errorMessage);
  }
 }, errorMessage);
}

/** Blocks until a condition is true, running a closure if
 * it does not become true during a given max time.
 * @param maxWait_ms max time to wait for true condition. Optional; defaults
 * to 30 * 1000 ms (30 seconds).
 * @param checkPeriod_ms period at which to try the condition. Optional; defaults
 * to 100 ms.
 * @param predicate the condition
 * @param closure closure to run
 * @param argument argument for closure
 * @throws Throwable on {@link AssertionError} or any other exception/error
 */
static public void waitForCondition(Integer maxWait_ms, Integer checkPeriod_ms, 
 Predicate predicate, Closure closure, Object argument) throws Throwable 
{
 if ( maxWait_ms == null )
  maxWait_ms = 30 * 1000;
 if ( checkPeriod_ms == null )
  checkPeriod_ms = 100;
 StopWatch stopWatch = new StopWatch();
 stopWatch.start();
 while ( !predicate.evaluate(null) ) {
  Thread.sleep(checkPeriod_ms);
  if ( stopWatch.getTime() > maxWait_ms ) {
   closure.execute(argument);
  }
 }
}

/** Calls {@link #waitForVerify(Integer, Object)} with <code>null</code>
 * for {@code maxWait_ms}
 */
static public void waitForVerify(Object easyMockProxy)
    throws Throwable
{
    waitForVerify(null, easyMockProxy);
}

/** Repeatedly calls {@link EasyMock#verify(Object[])} until it succeeds, or a
 * max wait time has elapsed.
 * @param maxWait_ms Max wait time. <code>null</code> defaults to 30s.
 * @param easyMockProxy Proxy to call verify on
 * @throws Throwable
 */
static public void waitForVerify(Integer maxWait_ms, Object easyMockProxy)
 throws Throwable
{
 if ( maxWait_ms == null )
  maxWait_ms = 30 * 1000;
 StopWatch stopWatch = new StopWatch();
 stopWatch.start();
 for(;;) {
  try
  {
   verify(easyMockProxy);
   break;
  }
  catch (AssertionError e)
  {
   if ( stopWatch.getTime() > maxWait_ms )
    throw e;
   Thread.sleep(100);
  }
 }
}

/** Returns a path to a directory in the temp dir with the name of the given
 * class. This is useful for temporary test files.
 * @param aClass test class for which to create dir
 * @return the path
 */
static public String getTestDirPathForTestClass(Object object) 
{

    String filename = object instanceof Class ? 
        ((Class)object).getName() :
        object.getClass().getName();
    return DRFileUtils.getTempDir() + File.separator + 
        filename;
}

static public byte[] createRandomByteArray(int bytesLength)
{
    byte[] sourceBytes = new byte[bytesLength];
    random.nextBytes(sourceBytes);
    return sourceBytes;
}

/** Returns <code>true</code> if the given object is an EasyMock mock object 
 */
static public boolean isEasyMockMock(Object object) {
 try {
  InvocationHandler invocationHandler = Proxy
    .getInvocationHandler(object);
  return invocationHandler.getClass().getName().contains("easymock");
 } catch (IllegalArgumentException e) {
  return false;
 }
}
}

Listing 3:

@Test
public void testSomething() {
    final AtomicBoolean called = new AtomicBoolean(false);
    subject.setCallback(new SomeCallback() {
        public void callback(Object arg) {
            // check arg here
            called.set(true);
        }
    });
    subject.run();
    assertTrue(called.get());
}
Kevin Wong
A timeout is a good idea, but if a test times out, any later results in that run are suspect. The timed out test may still have some threads running that can mess you up.
Don Kirkby
+1  A: 

Another way to (kinda) test threaded code, and very complex systems in general is through Fuzz Testing. It's not great, and it won't find everything, but its likely to be useful and its simple to do.

Quote:

    Fuzz testing or fuzzing is a software testing technique that provides random 
    data("fuzz") to the inputs of a program. If the program fails (for example, by 
    crashing, or by failing built-in code assertions), the defects can be noted.
    The great advantage of fuzz testing is that the test design is extremely simple, and 
    free of preconceptions about system behavior. 
    ...
    Fuzz testing is often used in large software development projects that employ black
    box testing. These projects usually have a budget to develop test tools, and fuzz 
    testing is one of the techniques which offers a high benefit to cost ratio. 
    ...
    However, fuzz testing is not a substitute for exhaustive testing or formal methods: 
    it can only provide a random sample of the system's behavior, and in many cases 
    passing a fuzz test may only demonstrate that a piece of software handles exceptions 
    without crashing, rather than behaving correctly. Thus, fuzz testing can only be 
    regarded as a bug-finding tool rather than an assurance of quality.
Robert Gould
+7  A: 

It's been a while when this question was posted, but it's still not answered ...

kleolb02's answer is a good one. I'll try going into more details.

There is a way, which I practice for C# code. For unit tests you should be able to program reproducible tests, which is the biggest challenge in multithreaded code. So my answer aims toward forcing asynchronous code into a test harness, which works synchronously.

It's an idea from Gerard Meszardos's book "xUnit Test Patterns" and is called "Humble Object" (p. 695): You have to separate core logic code and anything which smells like asynchronous code from each other. This would result to a class for the core logic, which works synchronously.

This puts you into the position to test the core logic code in a synchronous way. You have absolute control over the timing of the calls you are doing on the core logic and thus can make reproducible tests. And this is your gain from separating core logic and asynchronous logic.

This core logic needs be wrapped around by another class, which is responsible for receiving calls to the core logic asynchronously and delegates these calls to the core logic. Production code will only access the core logic via that class. Because this class should only delegate calls, it's a very "dumb" class without much logic. So you can keep your unit tests for this asychronous working class at a minimum.

Anything above that (testing interaction between classes) are component tests. Also in this case, you should be able to have absolute control over timing, if you stick to the "Humble Object" pattern.

Theo Lenndorff
+2  A: 

I spent most of last week at a university library studying debugging of concurrent code. The central problem is concurrent code is non-deterministic. Typically, academic debugging has fallen into one of three camps here:

  1. Event-trace/replay. This requires an event monitor and then reviewing the events that were sent. In a UT framework, this would involve manually sending the events as part of a test, and then doing post-mortem reviews.
  2. Scriptable. This is where you interact with the running code with a set of triggers. "On x > foo, baz()". This could be interpreted into a UT framework where you have a run-time system triggering a given test on a certain condition.
  3. Interactive. This obviously won't work in an automatic testing situation. ;)

Now, as above commentators have noticed, you can design your concurrent system into a more deterministic state. However, if you don't do that properly, you're just back to designing a sequential system again.

My suggestion would be to focus on having a very strict design protocol about what gets threaded and what doesn't get threaded. If you constrain your interface so that there is minimal dependancies between elements, it is much easier.

Good luck, and keep working on the problem.

Paul Nathan
+5  A: 

Just have googled this useful article on unit testing of multi-threaded Java applications.

The JUnit 4 version of a library from that article is here: MultiThreadedTC-JUnit4.

Alex Siman
I've ported some of the features from MultiThreadedTC to a .NET library called TickingTest: http://code.google.com/p/donkirkby/source/browse/#svn/trunk/TickingTest
Don Kirkby
+3  A: 

For .NET you could check out Microsoft's research project Chess

Yrlec
"CHESS repeatedly runs a concurrent test ensuring that every run takes a different interleaving. If an interleaving results in an error, CHESS can reproduce the interleaving for improved debugging" - sounds like an interesting approach.
Thomas Ahle
Unfortunately, it's not available for the cheaper versions of Visual Studio.
Don Kirkby
A: 

In Java: The Package java.util.concurrent contains some bad known Classes, that may help to write deterministic JUnit-Tests.

Have a look at

Synox
+2  A: 

I handle unit tests of threaded components the same way I handle any unit test, that is with inversion of control and isolation frameworks. I develop in the .Net-arena and out of the box the threading (among other things) is very hard (I'd say nearly impossible) to fully isolate.

Therefore I've written wrappers that looks something like this (simplified):

public interface IThread
{
    void Start();
    ...
}

public class ThreadWrapper : IThread
{
    private readonly Thread _thread;

    public ThreadWrapper(ThreadStart threadStart)
    {
        _thread = new Thread(threadStart);
    }

    public Start()
    {
        _thread.Start();
    }
}

public interface IThreadingManager
{
    IThread CreateThread(ThreadStart threadStart);
}

public class ThreadingManager : IThreadingManager
{
    public IThread CreateThread(ThreadStart threadStart)
    {
         return new ThreadWrapper(threadStart)
    }
}

From there I can easily inject the IThreadingManager into my components and use my isolation framework of choice to make the thread behave as I expect during the test.

That has so far worked great for me, and I use the same approach for the thread pool, things in System.Environment, Sleep etc. etc.

scim
+4  A: 

There are a few tools around that are quite good. Here is a summary of some of the Java ones.

Some good static analysis tools include FindBugs (gives some useful hints), JLint, Java Pathfinder (JPF & JPF2), and Bogor.

MultithreadedTC is quite a good dynamic analysis tool (integrated into JUnit) where you have to set up your own test cases.

ConTest from IBM Research is interesting. It instruments your code by inserting all kinds of thread modifying behaviours (e.g. sleep & yield) to try to uncover bugs randomly.

SPIN is a really cool tool for modelling your Java (and other) components, but you need to have some useful framework. It is hard to use as is, but extremely powerful if you know how to use it. Quite a few tools use SPIN underneath the hood.

MultithreadedTC is probably the most mainstream, but some of the static analysis tools listed above are definitely worth looking at.

xagyg