I'm having a hard time understanding why there is only one test per function in most professional TDD code that I have seen. When I first approached TDD, I tended to group 4-5 related tests into one function, but I see that doesn't seem to be the standard. I know that it is more descriptive to have just one test per function because you can more easily narrow down what the problem is, but I find myself struggling to come up with function names to differentiate the different tests, since many are so similar.

So my question is: Is it truly a bad practice to put multiple tests in one function, and if so, why? Is there a consensus out there? Thanks.

Edit: Wow, tons of great answers. I'm convinced. You really do need to separate them all out. I went through some recent tests I had written and separated them all, and lo and behold, it was much easier to read and helped me understand MUCH better what I was testing. Also, giving the tests their own long, verbose names gave me ideas like "Oh wait, I didn't test this other thing", so all around I think it's the way to go.

Great Answers. Gonna be hard to pick a winner

+7  A: 

It looks like you're asking "why is there only one assertion per test in most professional TDD code I have seen". That's probably to increase test isolation, as well as test coverage in the presence of failures. That's certainly the reason why I made my TDD library (for PHP) that way. Say you have:

function testFoo()
{
    $this->assertEquals(1, foo(10));
    $this->assertEquals(2, foo(20));
    $this->assertEquals(3, foo(30));
}

If the first assert fails, you don't get to see what would happen with the other two. That doesn't exactly help pinpoint the problem: is this something specific to the inputs, or is it systemic?
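
For contrast, here is the same set of checks split into one assertion per test (a minimal sketch in the same style as above; the test names are only illustrative):

function testFooOf10Is1()
{
    $this->assertEquals(1, foo(10));
}

function testFooOf20Is2()
{
    $this->assertEquals(2, foo(20));
}

function testFooOf30Is3()
{
    $this->assertEquals(3, foo(30));
}

Now a broken foo(10) no longer hides whether foo(20) and foo(30) still pass.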

just somebody
Hmm interesting point. To change this I assume my test code is probably going to explode in size. I guess that is just the cost of doing business then.
John Baker
@John Baker: yes, the dark side of single-assertion unit tests is bloat. But it became quite clear to me soon after I got into the TDD business that writing unit tests in the implementation language is a mistake; this business *really* calls for a decoupled DSL, something in which you can write multi-assertion tests and keep the "I wanna see all failures" property of single-assertion tests.
just somebody
@just somebody: Wow, you just blew my mind. Are there any frameworks out there for a thing such as this?
John Baker
Yeah, interesting. I second the question of whether there's a specific example that you're aware of already.
Tchalvak
just somebody
ahem. to answer your questions: I'm not aware of any unit testing tool doing exactly what I described. however, there are tools that *generate tests*. one of the stories in Beautiful Code mentions using this approach. check your local bookstore or amazon.com.
just somebody
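
As a related aside: parameterized (data-driven) tests in existing frameworks get part of the way to the "multi-assertion tests that still report every failure" idea. This is not the DSL described above, just a rough sketch assuming PHPUnit's @dataProvider, with made-up names:

class FooTest extends PHPUnit\Framework\TestCase
{
    /**
     * Each row below runs and is reported as its own test, so one
     * failing case doesn't hide the results of the remaining cases.
     * @dataProvider fooCases
     */
    public function testFoo($input, $expected)
    {
        $this->assertEquals($expected, foo($input));
    }

    public static function fooCases()
    {
        return [
            'foo(10)' => [10, 1],
            'foo(20)' => [20, 2],
            'foo(30)' => [30, 3],
        ];
    }
}
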
@just somebody: another thing you can do in that scenario is refactor your tests to remove duplication, or create multiple fixtures that do appropriate setup for a subset of tests.
kyoryu
@kyoryu: yes, but that quickly leads to test code (almost) as complex as the CUT, which is *un*desirable. complex test code is the bane of TDD; it's really a fine line to tread: in order to test complex CUT with any degree of confidence, the tests must be simpler than the CUT, or you dig yourself a recursive black hole. and if you want to keep the test code *provably* (if not in the mathematical sense) correct, you must step out of the CUT implementation language. otherwise you need to write unit tests for your unit tests, and then... and then... you've crossed the event horizon.
just somebody
@just somebody: Yep, and that's often a sign of an overly-large class as well - one of the lessons you learn in TDD is that pain in testing usually is a signal of a design problem. And one of the values of unit tests is that they show you whether the test and the code agree - both the test and code being incorrect in the exact same way is unlikely.
kyoryu
@kyoryu: the single-assertion discipline puts quite a strain on the tests: either you keep the testing code "shallow" and it grows faster than the tested area of the CUT, or you engineer the tests so that their surface area grows at worst linearly with the CUT, but then their complexity grows at best faster than the CUT complexity. CUT class size doesn't have much say in this. single-assertion tests tend to "explode" even with really very small classes; all that is needed is that the class generates objects with rich states. single-assertion tests seem to fit better with side-effect-free code, to a point.
just somebody
@John Baker: I don't know if you were asking about PHP in general, but there is somewhat of a thing... CxxTest (http://cxxtest.com/guide.html) for C++ testing uses Python or Perl to generate the test framework for you, although the actual test code is written in C++ (to link against the objects you're testing, of course).
Caleb Huitt - cjhuitt
@just_somebody: Agreed on side-effect-free code. I'd actually argue that TDD and, specifically, "unit tests" work best on side-effect-free code, and in fact part of their value is that they push as much code as possible toward having no side effects. I see this as a good thing.
kyoryu
+3  A: 

When a test function performs only one test it is much easier to identify which case failed.

You also isolate the tests, so one test failing doesn't affect the execution of the other tests.

nos
+2  A: 

It seems like a single failure in a multi-test function would have to result in a failure for the whole function, right? Test frameworks generally report a plain pass/fail per test method, so with a multi-test method you'd have to figure out manually which of the contained tests failed; and if you're running a huge list of tests, the first failure that executes fails the whole function and the later tests never even get a chance to run.

Granularity in tests is good. If you're going to write 5 tests, having them each in their own function seems no more difficult than having them all in the same location, apart from the minor overhead of creating a new boilerplate function each time. With the right IDE, even that may be simpler than copying & pasting.

Tchalvak
<i>"you'd have to manually figure out which of the multiple tests would be failing"</i> This is normally very easy though: you can tell by the line number in the stack trace.
Jason Orendorff
*"creating a new boilerplate function each time [...] may be simpler than copying copying and pasting is not a good choice either way.
Jason Orendorff
Re: Jason, re: stack trace, depends on what volume of tests you're running in your test suite, I suppose. Re: boilerplate: I'm saying (mainly as an aside) that creating a new test function might be even less work if you've got an IDE that handles the repetition for you in some ways. Anyway, community wiki-ed, feel free to correct/add to as desired. *shrugs*
Tchalvak
+6  A: 

High granularity of tests is recommended, not just for ease of identification of problems, but because sequencing tests inside a function can accidentally hide problems. Suppose for example that calling method foo with argument bar is supposed to return 23 -- but due to a bug in the way the object initializes its state, it returns 42 instead if it's called as the very first method on the newly constructed object (after that, it does correctly switch to returning 23). If your test of foo doesn't come right after the object's creation, you're going to miss this problem; and if you bunch tests up 5 at a time, you only have a 20% chance of accidentally getting it right. With one test per function (and a setup/teardown arrangement that resets and rebuilds everything cleanly each time, of course), you'll nail the bug immediately. Now this is an artificially-simple problem just for reasons of exposition, but the general issue -- that tests should not influence each other, but often will unless they're each bracketed by set up and tear down functionality -- does loom large.
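
To make the "reset and rebuild everything cleanly each time" part concrete, here is a rough sketch of a per-test fixture (written in PHPUnit style to match the PHP example earlier on this page; the class and method names are invented for illustration):

class ThingTest extends PHPUnit\Framework\TestCase
{
    private $obj;

    // Runs before every single test method, so each test starts from a
    // freshly constructed object and cannot inherit state from a
    // previously run test.
    protected function setUp(): void
    {
        $this->obj = new Thing();
    }

    // foo('bar') is the very first call on a brand-new object here, so
    // the "returns 42 on the first call" initialization bug described
    // above is caught immediately.
    public function testFooOfBarIs23()
    {
        $this->assertEquals(23, $this->obj->foo('bar'));
    }

    // A separate test for the repeated-call behavior.
    public function testFooOfBarIs23OnTheSecondCallToo()
    {
        $this->obj->foo('bar');
        $this->assertEquals(23, $this->obj->foo('bar'));
    }
}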

Yes, naming things well (including tests) is not a trivial problem, but it must not be taken as an excuse to avoid proper granularity. A useful naming hint: each test checks for a given, specific behavior -- e.g., something like "Easter in 2008 falls on March 23" -- not for generic "functionality", such as "compute the Easter date correctly".

Alex Martelli
Hmmm, seems like based on that argument, you'd only have the reverse problem. If you have a bug where the first return is correct but further returns are buggy, then you'd never catch that with granular, one-off tests, neh?
Tchalvak
...except for the need to explicitly check that successive method calls influence each other the way they're supposed to (via the object state), including not influencing each other when they're not supposed to -- e.g. since you do probably need to check that two `foo` calls return the same value (if object state's supposed to be unaltered between them), then one of your tests will be two `foo` calls back to back -- as long as that test is in its own function, it will spot problems with either call (only if it follows other unrelated tests may problems be masked).
Alex Martelli
*nods* Fair enough.
Tchalvak
+1 Keeping tests simple is a golden advice. The last thing you want is a bug in the tests.
Mathias
Struggle with this at work all the time. Tests so complex that they need design documentation themselves. Drives me insane!
dkackman
+3  A: 

I think the right way is not to think in terms of the number of tests per function, but in terms of code coverage (a small sketch after the list illustrates decision vs. condition coverage):

  • Function coverage - Has each function (or subroutine) in the program been called?
  • Statement coverage - Has each node in the program been executed?
  • Branch coverage - Has every edge in the program been executed?
  • Decision coverage - Has each control structure (such as an IF statement) evaluated to both true and false?
  • Condition coverage - Has each boolean sub-expression evaluated to both true and false? This does not necessarily imply decision coverage.
  • Condition/decision coverage - Both decision and condition coverage should be satisfied.
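
A small made-up sketch of the decision/condition distinction (the function and inputs are only for illustration, and short-circuit evaluation is ignored for simplicity):

function canShip($inStock, $paid)
{
    if ($inStock && $paid) {   // one decision made of two conditions
        return true;
    }
    return false;
}

// Condition coverage: each sub-expression takes both values.
//   canShip(true, false);   // $inStock true,  $paid false
//   canShip(false, true);   // $inStock false, $paid true
// Both conditions have now been true and false, yet the decision
// ($inStock && $paid) was never true -- which is why condition
// coverage "does not necessarily imply decision coverage".
//
// Decision coverage additionally needs the whole If to go both ways:
//   canShip(true, true);    // decision evaluates to true
//   canShip(false, false);  // decision evaluates to false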

EDIT: I reread what I wrote and found it kind of "scary"... it reminds me of a good thought I heard some weeks ago about code coverage:

Code coverage is like stock market investment! You need to invest enough time to get good coverage, but not so much that you waste your time and blow up your project!

wj
Does this mean you would say it's okay to stick most of your tests in one function?
John Baker
If you want to stick to the TDD principles you need to take small steps, which implies finer granularity and more tests...
wj
I think it's perfectly fine to *prioritize* which tests you write first, though, focusing on the most important, core functionality before going for further coverage. Especially if you're working on a prototype where the innards may be rapidly changed for a while.
Tchalvak
+4  A: 

I'm having a hard time understanding why there is only one test per function in most professional TDD code that I have seen

I'm assuming that you mean 'assert' when you say 'test'. In general, a test should only test a single 'use case' of a function. By 'use case' I mean: a path that the code can flow through via control flow statements (don't forget about handled exceptions, etc.). Essentially you are testing all of the 'requirements' of that function. For example, say you have a function such as:

Public Function DoSomething(ByVal foo As Boolean) As Integer
   Dim result As Integer = 0

   If foo Then
        result = MakeRequestToWebServiceA()
   Else
        result = MakeRequestToWebServiceB()
   End If

   Return result
End Function

In this case, there are 2 'use cases' or control flows that the function can take. This function should have at minimum 2 tests for it: one that passes foo as true and goes down the If branch, and one that passes foo as false and goes down the Else branch. If you have more If statements or flows the code can go through, then it will require more tests. This is for several reasons - the most important one to me is that without it, the tests would be too complicated and hard to read. There are other reasons too; for example, in the case of the above function, the control flow is based on an input parameter, which means you must call the function twice to test all code paths. You should never call the function you are testing more than once in your test, IMO.

but I find myself struggling to come up with function names to differentiate the different tests since many are so similar

Maybe you are over-thinking it? Don't be scared of writing crazy, overly verbose names for your test functions. Whatever the test does, write it in English, use underscores, and come up with a set of naming standards so that someone else looking at the code (including yourself 6 months later) can easily figure out what it does. Remember, you never actually have to call this function yourself (at least in most testing frameworks), so who cares if the name is 100 characters long? Go crazy. In the above example, my 2 tests would be named:

 DoSomethingTest_TestWhenFooIsTrue_RequestIsMadeToWebServiceA()
 DoSomethingTest_TestWhenFooIsFalse_RequestIsMadeToWebServiceB()

Also - this is just a general guideline. There are definitely cases where you will have multiple asserts in the same unit test. This will happen when you are testing the same control flow, but multiple fields need to be checked in your assert statement(s). Take this for example - a test for a function which parses a CSV file into a business object with a Header, a Body, and a Footer field:

 Public Sub ParseFileTest_TestFileIsParsedCorrectly()
        Dim target As New FileParser()
        Dim actual As SomeBusinessObject = target.ParseFile(TestHelper.GetTestData("ParseFileTest.csv"))

        Assert.AreEqual("EXPECTED HEADER FROM TEST DATA FILE", actual.Header)
        Assert.AreEqual("EXPECTED FOOTER FROM TEST DATA FILE", actual.Footer)
        Assert.AreEqual("TEST DATA BODY", actual.Body)
 End Sub

Here, we are really testing the same use case, but we needed multiple asserts to check all our data and make sure our code actually worked.

-Drew

dferraro
+6  A: 

Yes, you should test one behavior per function in TDD. Here's why.

  1. If you're writing your tests before you code, multiple behaviors tested in one function means you're implementing multiple behaviors at the same time, which is a bad idea.
  2. One behavior tested per function means that if a test fails, you know exactly why it failed, and can zero in on the specific problem area. If you have multiple behaviors tested in a single function, a failure in a "later" test may be due to an unreported failure in an earlier test causing bad state.
  3. One behavior tested per function means that if that behavior ever needs to be redefined, you only have to worry about the tests specific to that behavior, and not worry about other, unrelated tests (well, at least not due to the test layout...)

And, a final question - why *not* have one test per function? What's the benefit of combining them? I don't think there's a tax on function declarations.

kyoryu
A: 

Consider this straw man (in C#)

void FooTest()
{
    C c = new C();
    c.Foo();
    Assert(c.X == 7);
    Assert(c.Y == -7);
}

While "one assertion per test function" is good TDD advice, it's incomplete. Applying it alone would give:

void FooTestX()
{
    C c = new C();
    c.Foo();
    Assert(c.X == 7);
}

void FooTestY()
{
    C c = new C();
    c.Foo();
    Assert(c.Y == -7);
}

It's missing two things: Once-and-only-once (aka DRY), and "one test class per scenario". The latter is the less-known one: instead of one test class / test fixture that holds all test methods, have nested classes for non-trivial scenarios. Like this:

class CTests
{
    class FooTests
    {
        C c;

        void Setup()
        {
            c = new C();
            c.Foo();
        }

        void XTest()
        {
            Assert(c.X == 7);
        }

        void YTest()
        {
            Assert(c.Y == -7);
        }
    }
}

Now you don't have duplication, and each test method asserts exactly one thing about the code under test.

If it wasn't so verbose, I would consider writing all my tests this way, such that test methods are always trivial single-line methods with only an assertion. However, it seems too clumsy when a test doesn't share "setup" code with another test.

(I have avoided details that are specific to unit test technology, e.g. NUnit or MSTest. You will have to adjust to fit whatever you are using, but the principles are sound.)

Jay Bazuzi
What to do in these types of cases, when each test has subtly different set-ups? c = new C(a); c.Foo(); Assert(c.X == 7); c = new C(b); c.Foo(); Assert(c.X == -7);
dvide
@dvide: If it's that simple, then keep them separate. If they both have `c.Foo(); c.Bar(); c.Baz();` then I would consider this a code smell on `C`. Perhaps a new method `Bill()` that does `Foo/Bar/Baz`. Remember that unit tests are a legitimate client of your classes. Making your classes work well in tests is likely to make them work well in production, too.
Jay Bazuzi