881 views · 11 answers
If you (or your organization) aspire to thoroughly unit test your code, how do you measure the success or quality of your efforts?

  • Do you use code coverage? If so, what percentage do you aim for?
  • Do you find that philosophies like TDD have a better impact than metrics?
+5  A: 

If it can break, it should be tested. If it can be tested, it should be automated.

ironfroggy
+1 for automation.
Lucas B
+5  A: 

Code coverage is to testing as testing is to programming. It can only tell you when there is a problem; it can't tell you when everything works. You should have 100% code coverage and beyond. Branches of code logic should be tested with several input values, fully exercising normal, edge, and corner cases.
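For illustration, a minimal JUnit 4 sketch of what "normal, edge, and corner cases" can look like for one small method (the clamp method and its values are made up):

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class ClampTest {

        // Hypothetical method under test: limits value to the range [min, max].
        static int clamp(int value, int min, int max) {
            return Math.max(min, Math.min(max, value));
        }

        @Test
        public void normalCase_valueInsideRange_isReturnedUnchanged() {
            assertEquals(5, clamp(5, 0, 10));
        }

        @Test
        public void edgeCases_valueOnTheBoundaries_isReturnedUnchanged() {
            assertEquals(0, clamp(0, 0, 10));
            assertEquals(10, clamp(10, 0, 10));
        }

        @Test
        public void cornerCases_valueOutsideTheRange_isClampedToTheNearestBoundary() {
            assertEquals(0, clamp(-1, 0, 10));
            assertEquals(10, clamp(11, 0, 10));
        }
    }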

Bill the Lizard
100% code coverage... and beyond? :-P
asterite
Oh, I stopped reading. Now I get it...
asterite
Right. Hitting 100% code coverage doesn't mean you're done testing. There could still be a lot wrong with your code.
Bill the Lizard
I realise now that 100% code coverage doesn't mean you're done testing; it just means that code coverage has ceased to be of any further use. It's like the compiler saying 'Yup, that compiles': you know that's not the end of the story.
quamrana
+5  A: 

I normally do TDD, so I write the tests first, which helps me see how I want to be able to use the objects.

Then, when I'm writing the classes, for the most part I can spot common pitfalls (i.e. assumptions that I'm making, e.g. a variable being of a particular type, or range of values) and when these come up I write a specific test for that specific case.

Aside from that, and getting as good code coverage as possible (sometimes it's not possible to get 100%), you're more or less done. Then, if any bugs do come up in the future, you just make sure you write a test case that exposes the bug first and will pass once it's fixed. Then fix as per normal.
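As a rough illustration of giving one of those assumptions its own test (the class, method and range below are invented):

    import static org.junit.Assert.assertEquals;
    import static org.junit.Assert.fail;
    import org.junit.Test;

    public class DiscountCalculatorTest {

        // Hypothetical class under test; in TDD it would be written after this test.
        static class DiscountCalculator {
            int applyPercentage(int price, int percentage) {
                if (percentage < 0 || percentage > 100) {
                    throw new IllegalArgumentException("percentage must be between 0 and 100");
                }
                return price - (price * percentage / 100);
            }
        }

        // The assumption (percentage is between 0 and 100) gets a specific test.
        @Test
        public void percentageOutsideZeroToHundred_isRejected() {
            try {
                new DiscountCalculator().applyPercentage(200, 150);
                fail("expected an IllegalArgumentException");
            } catch (IllegalArgumentException expected) {
                // the assumption is now documented and enforced by a test
            }
        }

        @Test
        public void tenPercentOffOneHundred_isNinety() {
            assertEquals(90, new DiscountCalculator().applyPercentage(100, 10));
        }
    }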

JamShady
When is it not possible to get 100% code coverage? Is it simply a time constraint?
Bill the Lizard
@Bill, sometimes it's very difficult to mock everything, and in those cases it may be impossible to test some bits of code anywhere but in a live production environment.
Wedge
+9  A: 

Code coverage is a useful metric but should be used carefully. Some people take code coverage, especially the percentage covered, a bit too seriously and see it as THE metric for good unit testing.

My experience tells me that, rather than trying to get 100% coverage (which is not that easy), people should focus on checking that the critical sections are covered. But even then you may get false positives.

t3mujin
+21  A: 

My tip is not a way to determine whether you have good unit tests per se, but it's a way to grow a good test suite over time.

Whenever you encounter a bug, either in your development or reported by someone else, fix it twice. You first create a unit test that reproduces the problem. When you've got a failing test, then you go and fix the problem.

If a problem was there in the first place, it's a hint about a subtlety in the code or the domain. Adding a test for it lets you make sure it's never going to be reintroduced in the future.

Another interesting aspect about this approach is that it'll help you understand the problem from a higher level before you actually go and look at the intricacies of the code.
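A minimal sketch of such a test, written to fail before the fix and named after the report (the bug number and classes are invented for illustration):

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class OrderTotalRegressionTest {

        // Hypothetical bug report: an empty order used to throw instead of totalling zero.
        // The test is written first, fails against the buggy code, and passes once fixed.
        @Test
        public void bug1234_totalOfEmptyOrder_isZero() {
            assertEquals(0, new Order().total());
        }

        // Minimal stand-in for the class under test so the sketch compiles.
        static class Order {
            private final java.util.List<Integer> lines = new java.util.ArrayList<Integer>();

            int total() {
                int sum = 0;
                for (int line : lines) {
                    sum += line;
                }
                return sum;
            }
        }
    }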

Also, +1 for the value and pitfalls of test coverage already mentioned by others.

webmat
Joel phrases this as 'fix it twice'; I always liked that philosophy. +1
vfilby
+5  A: 

To attain a full measure of confidence in your code you need different levels of testing: unit, integration and functional. I agree with the advice given above that states that testing should be automated (continuous integration) and that unit testing should cover all branches with a variety of edge case datasets. Code coverage tools (e.g. Cobertura, Clover, EMMA etc) can identify holes in your branches, but not in the quality of your test datasets. Static code analysis such as FindBugs, PMD, CPD can identify problem areas in your code before they become an issue and go a long way towards promoting better development practices.

Testing should attempt to replicate, as much as possible, the overall environment that the application will be running in. It should start from the simplest possible case (unit) and move to the most complex (functional). In the case of a web application, getting an automated process to run through all the use cases of your website with a variety of browsers is a must, so something like Selenium RC should be in your toolkit.

However, software exists to meet a business need so there is also testing against requirements. This tends to be more of a manual process based on functional (web) tests. Essentially, you'll need to build a traceability matrix against each requirement in the specification and the corresponding functional test. As functional tests are created they are matched up against one or more requirements (e.g. Login as Fred, update account details for password, logout again). This addresses the issue of whether or not the deliverable matches the needs of the business.
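By way of illustration, a hedged Selenium RC sketch of the "Login as Fred, update the password, logout" test (the URLs, locators, credentials and success message are placeholders):

    import static org.junit.Assert.assertTrue;

    import com.thoughtworks.selenium.DefaultSelenium;
    import com.thoughtworks.selenium.Selenium;
    import org.junit.After;
    import org.junit.Before;
    import org.junit.Test;

    public class AccountDetailsFunctionalTest {

        private Selenium selenium;

        @Before
        public void startBrowser() {
            // Placeholder Selenium server host/port, browser and base URL.
            selenium = new DefaultSelenium("localhost", 4444, "*firefox", "http://localhost:8080/");
            selenium.start();
        }

        // Traces to the hypothetical requirement "a user can change their own password".
        @Test
        public void loginAsFred_updatePassword_logoutAgain() {
            selenium.open("/login");
            selenium.type("id=username", "fred");
            selenium.type("id=password", "old-password");
            selenium.click("id=login-button");
            selenium.waitForPageToLoad("30000");

            selenium.open("/account");
            selenium.type("id=new-password", "new-password");
            selenium.click("id=save-button");
            selenium.waitForPageToLoad("30000");
            assertTrue(selenium.isTextPresent("Password updated"));

            selenium.click("link=Logout");
            selenium.waitForPageToLoad("30000");
        }

        @After
        public void stopBrowser() {
            selenium.stop();
        }
    }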

Overall, I would advocate a test driven development approach based on some flavour of automated unit testing (JUnit, NUnit etc). For integration testing I would recommend having a test database that is automatically populated at each build with a known dataset that illustrates common use cases but allows other tests to build on it. For functional testing you'll need some kind of user interface robot (Selenium RC for web, Abbot for Swing etc). Metrics about each can easily be gathered during the build process and displayed on the CI server (e.g. Hudson) for all developers to see.
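One hedged way of getting that known dataset in place is to rebuild it before each integration test; the in-memory HSQLDB URL, schema and rows below are invented for the sketch:

    import static org.junit.Assert.assertEquals;

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import org.junit.After;
    import org.junit.Before;
    import org.junit.Test;

    public class CustomerRepositoryIntegrationTest {

        private Connection connection;

        @Before
        public void loadKnownDataset() throws Exception {
            // Placeholder JDBC URL; the real build would point at the shared test database.
            connection = DriverManager.getConnection("jdbc:hsqldb:mem:testdb", "sa", "");
            Statement statement = connection.createStatement();
            statement.execute("DROP TABLE CUSTOMER IF EXISTS");
            statement.execute("CREATE TABLE CUSTOMER (ID INT PRIMARY KEY, NAME VARCHAR(50))");
            statement.execute("INSERT INTO CUSTOMER VALUES (1, 'Fred')");
            statement.execute("INSERT INTO CUSTOMER VALUES (2, 'Wilma')");
            statement.close();
        }

        @Test
        public void knownDataset_containsExactlyTwoCustomers() throws Exception {
            Statement statement = connection.createStatement();
            ResultSet result = statement.executeQuery("SELECT COUNT(*) FROM CUSTOMER");
            result.next();
            assertEquals(2, result.getInt(1));
        }

        @After
        public void closeConnection() throws Exception {
            connection.close();
        }
    }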

Gary Rowe
+6  A: 

I am very much pro-TDD, but I don't place much importance in coverage stats. To me, the success and usefulness of unit tests is felt over a period of development time by the development team, as the tests (a) uncover bugs up front, (b) enable refactoring and change without regression, (c) help flesh out modular, decoupled design, (d) and whatnot.

Or, as Martin Fowler put it, the anecdotal evidence in support of unit tests and TDD is overwhelming, but you cannot measure productivity. Read more on his bliki here: http://www.martinfowler.com/bliki/CannotMeasureProductivity.html

Scott Bale
+4  A: 

If your primary way of measuring test quality is some automated metric, you've already failed.

Metrics can be misleading, and they can be gamed. And if the metric is the primary (or, worse yet, the only) means of judging quality, it will be gamed (perhaps unintentionally).

Code coverage, for example, is deeply misleading because 100% code coverage is nowhere near complete test coverage. Also, a figure like "80% code coverage" is just as misleading without context. If that coverage is in the most complex bits of code and only misses code so simple that it's easy to verify by eye, that's significantly better than if the coverage is biased the other way.

Also, it's important to distinguish between the test-domain of a test (its feature-set, essentially) and its quality. Test quality is not determined by how much it tests, just as code quality isn't determined by a laundry list of features. Test quality is determined by how well a test does its job of testing. That's actually very difficult to sum up in an automated metric.

The next time you go to write a unit test, try this experiment. See how many different ways you can write it such that it has the same code coverage and tests the same code. See whether it's possible to write a very poor test that meets these criteria, and a very good test as well. I think you may be surprised at the results.
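As a small illustration of that experiment, both tests below execute exactly the same lines (identical coverage), but only one of them would ever fail if the code were wrong (the method is made up):

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class SameCoverageDifferentQualityTest {

        // Trivial method under test, shared by both tests below.
        static int add(int a, int b) {
            return a + b;
        }

        // Poor test: earns the coverage by calling add(), but asserts nothing,
        // so it passes even if add() returns garbage.
        @Test
        public void poorTest_onlyExercisesTheCode() {
            add(2, 2);
        }

        // Good test: exactly the same coverage, but it pins the behaviour down.
        @Test
        public void goodTest_checksTheResult() {
            assertEquals(4, add(2, 2));
        }
    }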

Ultimately there's no substitute for experience and judgment. A human eye, hopefully several eyes, needs to look at the test code and decide if it's good or not.

Wedge
+2  A: 

I think some best practices for unit tests are:

  • They must be self-contained, i.e. not require much configuration or external dependencies to run. Let tests build their own dependencies, such as the files and Web sites they need (see the sketch at the end of this answer).
  • Use unit tests to reproduce bugs before fixing them. This helps prevent the bugs from surfacing again in the future.
  • Use a code coverage tool to spot critical code that is not exercised by any unit tests.
  • Integrate unit tests with nightly builds and release builds.
  • Publish test result reports and code coverage reports to a Web site where everyone in the team can browse them. The publishing should ideally be automated and integrated into the build system.

Do not expect to reach 100% code coverage unless you develop mission-critical software. It can be very costly to reach this level and will not be worth the effort for most projects.
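As a hedged sketch of the first point, a test that builds the file it depends on rather than assuming it already exists (the property and file names are placeholders):

    import static org.junit.Assert.assertEquals;

    import java.io.File;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.util.Properties;
    import org.junit.After;
    import org.junit.Before;
    import org.junit.Test;

    public class ConfigLoaderTest {

        private File configFile;

        @Before
        public void buildOwnDependencies() throws Exception {
            // The test creates the file it needs instead of relying on one being present.
            configFile = File.createTempFile("config", ".properties");
            FileWriter writer = new FileWriter(configFile);
            writer.write("timeout=30\n");
            writer.close();
        }

        @Test
        public void loadsTimeoutFromTheFileItJustCreated() throws Exception {
            Properties properties = new Properties();
            FileReader reader = new FileReader(configFile);
            properties.load(reader);
            reader.close();
            assertEquals("30", properties.getProperty("timeout"));
        }

        @After
        public void cleanUp() {
            configFile.delete();
        }
    }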

Lars Fastrup
+3  A: 

Monitoring code coverage rates can be useful, but instead of focusing on an arbitrary target rate (80%? 90%? 100%?), I have found it useful to aim for a positive trend over time.

Steven Hale
I like this suggestion, thank you.
vfilby
+1  A: 

An additional technique I try to use is to partition your code into two parts. I've recently blogged about it here. The short description is to maintain your production code in two sets of libraries where one set (hopefully the larger set) has 100% line coverage (or better if you can measure it) and the other set (hopefully a tiny amount of code) has 0% coverage, yes zero percent coverage.

Your designs should allow this partitioning. This should make it easy to see the code that is not covered. Over time you may have ideas about how to move code from the smaller set to the larger set.
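A hedged sketch of what the partitioning can look like: the small, deliberately uncovered set is pure wiring, while everything with logic lives in the fully covered set (the class names are invented):

    // In the tiny 0%-coverage set: nothing but wiring, no logic worth testing.
    public class Main {
        public static void main(String[] args) {
            GreetingService service = new GreetingService();
            System.out.println(service.greet(args.length > 0 ? args[0] : "world"));
        }
    }

    // In the large set held to 100% line coverage: all of the actual logic.
    class GreetingService {
        String greet(String name) {
            if (name == null || name.trim().isEmpty()) {
                return "Hello, stranger!";
            }
            return "Hello, " + name + "!";
        }
    }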

quamrana
It sounds like what you are saying is, "Test important things first." I am not sure I agree with the 100%/0% division. I do agree with testing key components first then working outwards though.
vfilby
That's OK, you don't have to agree with the 100%/0% thing. However, you might like to try it and see what happens.
quamrana