views:

1536

answers:

8

Recently I had an interesting discussion with a colleague about unit tests. We were discussing when maintaining unit tests became less productive, when your contracts change.

Perhaps anyone can enlight me how to approach this problem. Let me elaborate:

So lets say there is a class which does some nifty calculations. The contract says that it should calculate a number, or it returns -1 when it fails for some reason.

I have contract tests who test that. And in all my other tests I stub this nifty calculator thingy.

So now I change the contract, whenever it cannot calculate it will throw a CannotCalculateException.

My contract tests will fail, and I will fix them accordingly. But, all my mocked/stubbed objects will still use the old contract rules. These tests will succeed, while they should not!

The question that rises, is that with this faith in unit testing, how much faith can be placed in such changes... The unit tests succeed, but bugs will occur when testing the application. The tests using this calculator will need to be fixed, which costs time and may even be stubbed/mocked a lot of times...

How do you think about this case? I never thought about it thourougly. In my opinion, these changes to unit tests would be acceptable. If I do not use unit tests, I would also see such bugs arise within test phase (by testers). Yet I am not confident enough to point out what will cost more time (or less).

Any thoughts?

+5  A: 

Unit tests surely can not catch all bugs, even in the ideal case of 100% code / functionality coverage. I think that is not to be expected.

If the tested contract changes, I (the developer) should use my brains to update all code (including test code!) accordingly. If I fail to update some mocks which therefore still produce the old behaviour, that is my fault, not of the unit tests.

It is similar to the case when I fix a bug and produce a unit test for, but I fail to think through (and test) all similar cases, some of which later turns out to be buggy as well.

So yes, unit tests need maintenance just as well as the production code itself. Without maintenance, they decay and rot.

Péter Török
I totally agree that the one who is responsible, is the developer. But the whole idea with unit tests is also that you get notified when something breaks. This time, you don't get notified, because you mock/stub the class. I am not sure, but I would be helped if somehow I could get notified what classes are affected (mock/stub) wise if i change my contract.
Stefan Hendriks
@Stefan, I believe with a bit of thinking, it is possible to build some safety checks into the code. E.g. an assert that the value returned from the function is never -1 after the contract change. That would fail with old mocks.
Péter Török
A: 

I look at it this way, when your contract changes, you should treat it like a new contract. Therefore, you should create a whole new set of UNIT test for this "new" contract. The fact that you have an existing set of test cases is besides the point.

Syd
Yes, but what would you do with all those unit tests that are using your class and are mocking/stubbing it? Do you rewrite them as well? That would be probably more expensive than fixing them all.
Stefan Hendriks
If you versonalised your tests, then you would have to create new stubs and mocks. There is no two way about it. However, if you are spending a lot of time writing your stubs, then you have to rethink your strategy. Maybe you should "data driven" your stubs instead.
Syd
+10  A: 

It's better to have to fix unit test that fail due to intentional code changes than not having tests to catch the bugs that are eventually introduced by these changes.

When your codebase has a good unit test coverage, you may run into many unit test failures that are not due to bugs in the code but intentional changes on the contracts or code refactoring.

However, that unit test coverage will also give you confidence to refactor the code and implement any contract changes. Some test will fail and will need to be fixed, but other tests will eventually fail due to bugs that you introduced with these changes.

Bruno Rothgiesser
I am convinced about using unit tests. But at a certain point unit tests may cost you more than you get a benefit from. I believe, when that point comes, you perhaps need to throw away tests and start with new ones.Code coverage is an indicator to know what you not cover, not what you really do cover. (it is a negative metric in that sense ;))
Stefan Hendriks
Yes, as part of the test harness maintenance it's expected that you throw away tests that are no longer valid and write new ones. Sometimes it's worth rewriting tests from scratch if the code that they tested changed much.
Bruno Rothgiesser
@Stefan: great insight about code coverage!
David Schmitt
+3  A: 

I have similar experiences with unit tests - when you change the contract of one class often you need to change loads of other tests as well (which will actually pass in many cases, which makes it even more difficult). That is why I always use higher level tests as well:

  1. Acceptance tests - test a couple or more classes. These tests are usually aligned to user stores that need to be implemented - so you test that the user story "works". These don't need to connect to a DB or other external systems, but may.
  2. Integration tests - mainly to check external system connectivity, etc.
  3. Full end-to-end tests - test the whole system

Please note that even if you have 100% unit test coverage, you are not even guaranteed that your application starts! That is why you need higher level tests. There are so many different layers of tests because the lower you test something, the cheaper it usually is (in terms of development, maintaining test infrastructure as well as execution time).

As a side note - because of the problem you mentioned using unit tests teaches you to keep your components as decoupled as possible and their contracts as small as possible - which is definitely a good practise!

Grzenio
Good answer, I agree with the levels of tests. I do however wonder when the maintanance of tests are exceeding a certain point where it becomes too expensive? Or is this maintenance cost just 'part of the whole' ?
Stefan Hendriks
@Stefan, I don't think there's a point where the maintanance becomes too expensive. If many of your tests are failing often, it's because you're changing your code often. If you change your code often, you need the tests to catch as many new bugs introduced by your changes as possible.
Bruno Rothgiesser
+2  A: 

One of the rules for unit tests code (and all other code used for testing) is to treat it the same way as production code - no more, no less - just the same.

My understanding of this is that (beside keeping it relevant, refactored, working etc. like production code) it should be looked at it the same way from the investment/cost prospective as well.

Probably your testing strategy should include something to address the problem you have described in the initial post - something along the lines specifying what test code (including stubs/mocks) should be reviewed (executed, inspected, modified, fixed etc) when a designer change a function/method in production code. Therefore the cost of any production code change must include the cost of doing this - if not - the test code will become "third-class citizen" and the designers' confidence in the unit test suite as well as its relevance will decrease... Obviously, the ROI is in the timing of bugs discovery and fix.

ratkok
+73  A: 

The first issue you raise is the so-called "fragile test" problem. You make a change to your application, and hundreds of tests break because of that change. When this happens, you have a design problem. Your tests have been designed to be fragile. They have not been sufficiently decoupled from the production code. The solution is (as it it in all software problems like this) to find an abstraction that decouples the tests from the production code in such a way that the volatility of the production code is hidden from the tests.

Some simple things that cause this kind of fragility are:

  • Testing for strings that are displayed. Such strings are volatile because their grammar or spelling may change at the whim of an analyst.
  • Testing for discrete values (e.g. 3) that should be encoded behind an abstraction (e.g. FULL_TIME).
  • Calling the same API from many tests. You should wrap the API call in a test function so that when the API changes you can make the change in one place.

Test design is an important issue that is often neglected by TDD beginners. This often results in fragile tests, which then leads the novices to reject TDD as "unproductive".

The second issue you raised was false positives. You have used so many mocks that none of your tests actually test the integrated system. While testing independent units is a good thing, it is also important to test partial and whole integrations of the system. TDD is not just about unit tests.

Tests should be arranged as follows:

  • Unit tests provide close to 100% code coverage. They test independent units. They are written by programmers using the programming language of the system.
  • Component tests cover ~50% of the system. They are written by business analysts and QA. They are written in a language like FitNesse, Selenium, Cucumber, etc. They test whole components, not individual units. They test primarily happy path cases and some highly visible unhappy path cases.
  • Integration tests cover ~20% of the system. They tests small assemblies of components as opposed to the whole system. Also written in FitNesse/Selenium/Cucumber etc. Written by architects.
  • System tests cover ~10% of the system. They test the whole system integrated together. Again they are written in FitNesse/Selenium/Cucumber etc. Written by architects.
  • Exploratory manual tests. (See James Bach) These tests are manual but not scripted. They employ human ingenuity and creativity.
Robert C. Martin
What overall test methodology would you recommend? TDD covers as far as the development process, but should not the project as a whole incorporate a test methodology covering for other phases as well?
Dipper
Bob, I agree with everything you said, but still, the fact is we have test design failures and need efficient ways of fixing it. It takes years (as you surely know) for this kind of mistakes to become rare. This is what Gary Bernhardt and J.B Rainsberger talk about here http://bit.ly/bBX6qW , and me here http://bit.ly/cuuihY (shameless plug).
abyx
Someone asked on twitter how to wrap api calls in a test function. Imagine you have this api: void api(int n); You can wrap this in a test function as follows: void tapi(int n) {api(n);}. Later, if api changes to void api(double n) throws X; then only tapi is broken, and it can be fixed simply without affecting all the existing tests. void tapi(int n) {try {api((double)n);}catch (X x) {...}}
Robert C. Martin
Bob, thanks a lot for your answer. In this case I do have a tests that test the contract (so i have a tapi test). And it breaks, and I fix it. However, due fragmentation (as you point out) other classes that use my calculator will still succeed. For now I do a search/find all within my IDE to know what tests are affected. I will evaluate them manually to see if the mock/stub still behaves as the contract describes. Thx for the enlightment.
Stefan Hendriks
I think you actually missed one thing in the original question: When the same kind of mock gets created over and over again in a test suite it really should get factored out into a single place. After all code duplication in tests is still code duplication and is still bad.
Jens Schauder
Ah I see. I am used to EasyMock, and when stubbing I most of the time extend the original class overriding one method. Most of the time you only want to override one method and give a different outcome. But in some cases you do have several methods returning the same outcome, and only one of those methods returning a different outcome. Perhaps it could be abstracted into a Stub class that is used at several places, reducing duplication. I'll toy with that.
Stefan Hendriks
In fact, this sentence says it all i think: "Test design is an important issue that is often neglected by TDD beginners. This often results in fragile tests, which then leads the novices to reject TDD as "unproductive"."I consider myself a beginner at TDD :)
Stefan Hendriks
Did you really see "They are written by business analysts and QA" or "Written by architects"? it sounds good in theory, but QA are .. they think its too complicated for them and others think QA should write it and dont want to be testers.
01
+1  A: 

One principle that I rely on here is removing duplication. I generally don't have many different fakes or mocks implementing this contract (I use more fakes than mocks partly for this reason). When I change the contract it is natural to inspect every implementation of that contract, production code or test. It bugs me when I find I'm making this kind of change, my abstractions should have been better thought out perhaps etc, but if the test codes is too onerous to change for the scale of the contract change then I have to ask myself if these also are due some refactoring.

Brett
A: 

I use Contract Tests to solve this problem. http://bit.ly/b8y1s8

J. B. Rainsberger
Can you expand on that answer a bit rather than linking to your blog?
Andrew Grimm
I don't want to duplicate content.
J. B. Rainsberger