I have been perpetually intrigued by test-driven development, but I can never follow through with it when I try it on real projects. I have a couple of philosophical questions that continually arise when I try it:

  1. How do you handle large changes? When it comes to testing single functions (some parameters, a result value, few side effects), TDD is a no-brainer. But what about when you need to thoroughly overhaul something large, e.g. switching from a SAX parsing library to a DOM parsing library? How do you keep to the test-code-refactor cycle when your code is in an intermediate state? Once you start making the change, you will get a bevy of failing tests until you've fully finished the overhaul (unless you maintain some kind of mongrel class that uses both DOM and SAX until you're done converting, but that's pretty weird). During this whole process you will no longer be moving in small, fully-tested steps. There must be some way people deal with this.
  2. When testing GUI or database code with mocks, what are you really testing? Mocks are built to return exactly the answer you want, so how do you know that your code will work with the real-world database? What is the benefit of automated tests for this kind of thing? It improves confidence somewhat, but a) it doesn't give you the same level of confidence that a complete unit test ought to, and b) to a certain extent, aren't you simply verifying that your assumptions work with your code rather than that your code works with the DB or GUI?

Can anyone point me to good case studies on using test-driven development in large projects? It's frustrating that I can basically only find TDD examples for single classes.

Thanks!

+2  A: 

When testing GUI or database code with mocks, what are you really testing? Mocks are built to return exactly the answer you want, so how do you know that your code will work with the real-world database? What is the benefit of automated tests for this kind of thing? It improves confidence somewhat, but a) it doesn't give you the same level of confidence that a complete unit test ought to, and b) to a certain extent, aren't you simply verifying that your assumptions work with your code rather than that your code works with the DB or GUI?

This is my approach: for the database access layer (DAL), I don't use mocks in my unit tests. Instead, I run the tests against a real database, albeit a different one than the production database. So in this sense you could say I don't unit test the database. For NHibernate applications, I maintain two databases with the same schema but different database engines (an ORM makes this easy): SQLite for automated testing, and a real MySQL or SQL Server database for ad-hoc testing.
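
Sketched in Java terms (the idea is the same regardless of stack), this might look roughly like the following. It is a minimal sketch, not a definitive recipe: it assumes the xerial sqlite-jdbc driver is on the classpath, and UserDao/User are invented names standing in for the real DAL code.

// JUnit 4 test that runs the real DAL code against a throwaway
// in-memory SQLite database instead of the production server.
import java.sql.Connection;
import java.sql.DriverManager;
import org.junit.After;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;

public class UserDaoTest {
  private Connection conn;
  private UserDao dao; // hypothetical DAL class under test

  @Before
  public void setUp() throws Exception {
    // A fresh, empty database for every test method.
    conn = DriverManager.getConnection("jdbc:sqlite::memory:");
    conn.createStatement().execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)");
    dao = new UserDao(conn);
  }

  @After
  public void tearDown() throws Exception {
    conn.close(); // closing the connection discards the in-memory DB
  }

  @Test
  public void savedUserCanBeReadBack() throws Exception {
    dao.save(new User("fred"));
    Assert.assertEquals("fred", dao.findByName("fred").getName());
  }
}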

Only once did I use mocks for unit testing the DAL, and that was when I was using strongly typed datasets as the ORM (a big mistake!). The way I did it was to have Typemock return a mocked copy of the complete table so that I could perform SELECT * on it. Looking back, I wish I had never done it; but that was a long time ago, and I wish I had used a proper ORM.

As for the GUI, it is possible to unit test GUI interaction. The way I do it is to use the MVP pattern to separate the Model, View, and Presenter. For this type of application I only test the Presenter and the Model, using Typemock (or dependency injection) to isolate the layers so that I can concentrate on one layer at a time. I don't test the View, but I do test the Presenter (where the majority of the interaction, and of the bugs, lives) heavily.
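
A rough sketch of the shape this separation takes (hypothetical names; any mocking tool can stand in for Typemock):

// The View is reduced to a dumb interface, so the Presenter holds the
// logic and can be unit tested against a mocked View and Model.
public interface UserView {
  String getUserName();
  void showError(String message);
}

public class UserPresenter {
  private final UserView view;
  private final UserModel model; // hypothetical Model class

  // Dependencies are injected, so tests can substitute mocks.
  public UserPresenter(UserView view, UserModel model) {
    this.view = view;
    this.model = model;
  }

  public void saveClicked() {
    String name = view.getUserName();
    if (name == null || name.isEmpty()) {
      view.showError("User name may not be empty");
    } else {
      model.saveUserName(name);
    }
  }
}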

Ngu Soon Hui
+1  A: 

In terms of handling large changes... the purpose of TDD is to test the behaviors of your code and how it interacts with the services it depends on. If you were using TDD while moving from a DOM parser to a SAX parser, and you were writing the SAX parser yourself, then you would write tests that verified the behavior of the SAX parser based on a known input, i.e. an XML document. The SAX parser may depend on a collection of helper objects that could initially be mocked out for the purposes of testing the parser's behavior. When you were ready to write the implementation code for the helper objects, you could then write tests around their expected behavior based on a known input. In the SAX parser example, you would write separate classes to implement this behavior so as not to interfere with the existing code that depends on the DOM parser. In fact, what you could do is create an IXMLParser interface that both the DOM parser and the SAX parser implement, so that you could switch them out at will.
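
For example, that seam might look something like this. A sketch only: IXMLParser is the interface named above, while OrderSummary and SaxOrderParser are made-up stand-ins for your own representation and implementation.

// Callers depend only on the interface, so the DOM and SAX
// implementations can be switched out at will.
public interface IXMLParser {
  OrderSummary parse(java.io.InputStream xml) throws Exception;
}

// Behavior test for the new implementation, driven by a known input.
public class SaxOrderParserTest {
  @org.junit.Test
  public void parsesKnownDocumentIntoExpectedResult() throws Exception {
    IXMLParser parser = new SaxOrderParser();
    OrderSummary summary = parser.parse(new java.io.ByteArrayInputStream(
        "<order><item/><item/></order>".getBytes("UTF-8")));
    // The assertions pin down behavior, not parsing strategy, so the
    // same test would pass unchanged against the DOM implementation.
    org.junit.Assert.assertEquals(2, summary.itemCount());
  }
}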

As far as mocks or stubs are concerned, the reason you use a mock or a stub is that you are not interested in testing the inner workings of the mock or the stub; you are interested in testing the inner workings of whatever depends on it, and that is what you are truly testing from a unit perspective. If you are interested in writing integration tests, then you should write integration tests, not unit tests. I find writing code in a TDD fashion useful for helping me define the structure and organization of my code around the behavior I am being asked to provide.

I am not familiar with any case studies off-hand, but I am sure they are out there.

Michael Mann
A: 

As for the database angle, as Ngu Soon Hui mentioned, you should (IMHO) use something like DBUnit, which will set up the database in a known configuration (so you can test for expected results) while still using the real database that the real application will use.
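
A DBUnit setup looks roughly like this (a sketch: the schema must already exist, and the database URL, credentials, and file names here are invented):

// DBUnit puts the test database into a known state before each test,
// so assertions about query results are deterministic.
import org.dbunit.IDatabaseTester;
import org.dbunit.JdbcDatabaseTester;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSetBuilder;
import org.junit.Before;

public class UserQueryIntegrationTest {
  private IDatabaseTester tester;

  @Before
  public void setUp() throws Exception {
    // Same engine as production, but a dedicated test database.
    tester = new JdbcDatabaseTester("com.mysql.jdbc.Driver",
        "jdbc:mysql://localhost/myapp_test", "test", "test");
    // users.xml holds the known rows; the default setup operation,
    // CLEAN_INSERT, wipes the listed tables and reloads them.
    IDataSet dataSet = new FlatXmlDataSetBuilder()
        .build(getClass().getResourceAsStream("/users.xml"));
    tester.setDataSet(dataSet);
    tester.onSetup();
  }

  // ...tests then run real queries against this known state...
}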

For large changes, I would recommend creating a branch, and allowing the tests to fail. This will give you a TODO list of areas that need to be changed, and it could be argued that this is where TDD really shines, even more than with the small, isolated functions.

pkaeding
+1  A: 

When testing GUI or database code with mocks, what are you really testing?

I usually try to separate my business logic, display logic, and database access. Most of my GUI unit tests deal with business logic. Here's a pseudocode example:

// Production code in class UserFormController:

void changeUserNameButtonClicked() {
  String newName = nameTextBox.getText();
  if (StringUtils.isEmpty(newName)) {
    errorBox.showError("User name may not be empty !");
  } else {
    User user = engine.getCurrentUser();
    user.setName(newName);
    engine.saveUser(user);
  }
}

// Test code in UserFormControllerTest:

void testValidUserNameChange() {
  nameTextBox = createMock(TextBox.class);
  expect(nameTextBox.getText()).andReturn("fred");
  engine = createMock(Engine.class);
  User user = createMock(User.class);
  user.setName("fred");
  expectLastCall();
  expect(engine.getCurrentUser()).andReturn(user);
  engine.saveUser(user);
  expectLastCall();
  replay(user, engine, nameTextBox);

  UserFormController controller = new UserFormController();
  controller.setNameTextBox(nameTextBox);
  controller.setEngine(engine);
  controller.changeUserNameButtonClicked();  

  verify(user, engine, nameTextBox);
}

void testEmptyUserNameChange() {
  nameTextBox = createMock(TextBox.class);
  errorBox = createMock(ErrorBox.class);
  expect(nameTextBox.getText()).andReturn("");
  errorBox.showError("User name may not be empty!");
  expectLastCall();
  replay(nameTextBox, errorBox);

  UserFormController controller = new UserFormController();
  controller.setNameTextBox(nameTextBox);
  controller.setErrorBox(errorBox);
  controller.changeUserNameButtonClicked();  

  verify(nameTextBox, errorBox);
}

This ensures that, regardless of how broken my database and GUI code may be, at least the logic that controls the user name change works correctly. If you organize your GUI code into a set of individual controls (or widgets or form elements or whatever they're called in your GUI framework), you can test them in a similar way.

But ultimately, like you said, these unit tests won't give you the whole picture. To get that, you need to do what others have suggested: create a real database, with a "golden set" of data, and run integration/functional tests against it. But, IMO, such tests are out of scope for TDD, because setting them up is usually pretty time-consuming.

Bugmaster
A: 

Handling Large Changes

In my experience, these are relatively infrequent. When they do happen, updating the tests is a minor hassle. The trick is to pick the right granularity for the tests. If you test the public interface, updates will go quickly. If you test the private implementation code, changing from a SAX to a DOM parser will sax big time and you'll feel dom. ;-)

Testing GUI Code

In general I don't. I keep my UI layer as thin as possible. The idea is to test what might break.

Testing Database Code

When possible I prefer to place data-access code behind interfaces and mock that out when testing the business logic. As others have mentioned, at some point you may want to run integration tests against the DAL to ensure that it works against a test database in a known state. You may want other integration tests of the entire system; having layers of different kinds of tests is a good thing. TDD is primarily about design, and does not eliminate the need for integration or acceptance tests.

It's very possible to abuse mocks and stubs, writing tests that do nothing but test mock objects. It takes a lot of experience to write good tests; I'm still learning.

My suggestion would be to keep practicing TDD, perhaps on smaller projects initially. Read as much as you can on it, talk to other practitioners, and use what works for you.

Continuous Integration really helps with testing, as it ensures that tests are run and makes broken tests visible. Highly recommended.

EDIT: To be frank, I have trouble decoupling the data-access code in many cases and end up using test deck databases. Even integration tests like these have proved valuable, although they're slower and more fragile. As I said, I'm still learning.

TrueWill
+1  A: 
  1. How do you handle large changes?

    • Step-by-step. I have worked on quite a few non-trivial programs, and have always been able to break things down into small changes (requiring hours, maybe days). For example, rewriting a 30Mpv website broke down into doing one page at a time -- this was moving from one language to another, writing (small) tests as we went, and keeping the site up with frequent deployments. On another project, we converted a GUI web app into a headless back-end server. This involved many small steps over a month or two of work, and then eventually discarding much of the web code. But we were able to keep all the tests working as we went. We did this not because we were trying to prove something, but because it was the best way to reuse the code and tests.

    • Bigger steps can be aided by tests with wider scope. For example, your SAX->DOM example would have a high-level integration test that would verify the ultimate behavior. When I did something similar, though, I wrote much smaller behavior tests around the different types of node processing, and these could be converted one by one.

  2. When testing GUI or database code with mocks, what are you really testing?

    • You always need to make sure you are writing valuable tests. This can be hard; it's easy, even when you're paying attention, to write some pretty redundant tests.
    • Mocks don't make sense when you are trying to test database queries. They are useful when you are trying to "mock out" a layer below what you are testing -- so they are useful in a controller test where you mock out the behavior of the service layer, which you then test independently. For testing database queries, you need to load the database with appropriate rows, either from fixture files or from careful test set-up code (see the sketch after this list). This takes some thought to get right, so it's good to have a well-designed set of fixture data that enables you to write good database query tests, covering as many important cases as possible.

    • Yes, you are verifying your assumptions with mocks -- but you also have to test those assumptions separately. The alternative -- testing everything together -- is fine, but more brittle: each test covers more code, and therefore can break more easily.
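
Here is the kind of careful test set-up code point 2 refers to, as a sketch (an H2 in-memory database; UserDao and its query method are invented names):

// Insert exactly the rows the query test needs, so the expected
// result is obvious from reading the test itself.
public class ActiveUserQueryTest {
  private java.sql.Connection conn;

  @org.junit.Before
  public void insertFixtureRows() throws Exception {
    conn = java.sql.DriverManager.getConnection("jdbc:h2:mem:test");
    java.sql.Statement s = conn.createStatement();
    s.execute("CREATE TABLE users (name VARCHAR(50), active BOOLEAN)");
    s.execute("INSERT INTO users VALUES ('fred', TRUE)");    // should match
    s.execute("INSERT INTO users VALUES ('barney', FALSE)"); // should not
  }

  @org.junit.Test
  public void findsOnlyActiveUsers() throws Exception {
    java.util.List<String> names = new UserDao(conn).findActiveUserNames();
    org.junit.Assert.assertEquals(java.util.Arrays.asList("fred"), names);
  }
}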

ndp
+1  A: 

My 2 cents...

  1. If your tests break because you switched the type of XML parser, it indicates that the tests are fragile. Tests should specify the what, not the how. In this case the tests somehow know that you're using a SAX parsing engine (an implementation detail), which they should not. Fix that problem and you should fare better with large changes.
  2. When you abstract GUIs or databases away from your tests via an interface, you're ensuring that your test subject, which uses the mocks (as doubles for the actual collaborators), works as intended. You get to isolate bugs in your code from bugs in your collaborators, and mocks help you keep your test suite fast. You also need tests that verify that your real collaborators conform to the interface AND tests that your real collaborators are wired up correctly.
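
A common way to get those last guarantees is a shared "contract" test that every implementation of the interface must pass, sketched here with invented names:

// The narrow interface both the fake and the real collaborator implement.
public interface UserStore {
  void save(String name);
  boolean exists(String name);
}

// The same assertions run against every implementation, so the interface
// the mocks stand in for cannot silently drift away from what the real
// collaborator actually does.
public abstract class UserStoreContractTest {
  protected abstract UserStore createStore(); // each subclass supplies one

  @org.junit.Test
  public void reportsSavedUsersAsExisting() throws Exception {
    UserStore store = createStore();
    store.save("fred");
    org.junit.Assert.assertTrue(store.exists("fred"));
  }
}

// Fast suite: runs the contract against the in-memory fake...
public class InMemoryUserStoreTest extends UserStoreContractTest {
  protected UserStore createStore() { return new InMemoryUserStore(); }
}

// ...slower suite: the same contract against the real collaborator.
public class JdbcUserStoreTest extends UserStoreContractTest {
  protected UserStore createStore() {
    return new JdbcUserStore(TestDatabase.connection()); // test DB helper
  }
}
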
Gishu
+4  A: 

How do you handle large changes?

As small as needed.

Sometimes refactorings have a big surface but are trivial in detail. These can be done in quite big steps; putting too much effort into breaking them down would be wasted effort.

I would argue that an XML library change is in this category. You put XML in and get some representation out. As long as your representation does not change (say, from a graph representing the state to an event stream), the library switch is easy.

Most of the time, refactorings are not trivial and have to be broken down. The problem is knowing when to take bigger steps and when smaller ones. My observation is that I'm quite bad at estimating the impact of a change. Most software is complicated enough that you may think a change is easily manageable, but then there is all the fine print that has to work again. So I start with an amount of change, but I'm prepared to roll back everything if it starts to get unpredictable. I would say this happens in one out of ten refactorings, but that one will be hard: you have to track down the part of the system that does not behave as you expect, and the problem then has to be split into multiple smaller problems. I solve one problem at a time and check in when it's done. (Multiple iterations of reverting and splitting are not uncommon.)

If you change both the XML parser and the representation in your code, this should definitely be at least two separate refactorings.

Mock testing

You're testing a communication protocol between objects/layers with mock objects.

The whole mock approach can be thought of as a communication model like the OSI model: when layer X gets a call with parameter x, it will call layer Z with parameters a and b. Your test specifies this communication protocol.

As useful as mock tests can be, test as little functionality with them as possible. The best options are state-based tests (set up a fixture, call the system under test, check the state of the system under test) and purely functional tests (as in functional programming: a call with x returns a).
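
A quick contrast of those two preferred styles (ShoppingCart, Item, and PriceCalculator are made-up examples):

public class PreferredStyleExamples {
  // State-based: set up a fixture, call the system under test, then
  // assert on its resulting state -- no mocks involved.
  @org.junit.Test
  public void addingAnItemIncreasesTheTotal() {
    ShoppingCart cart = new ShoppingCart();       // fixture
    cart.add(new Item("book", 10.0));             // call system under test
    org.junit.Assert.assertEquals(10.0, cart.total(), 0.001); // check state
  }

  // Purely functional: the same input always yields the same output.
  @org.junit.Test
  public void taxedPriceIsAPureFunctionOfItsInputs() {
    org.junit.Assert.assertEquals(
        110.0, PriceCalculator.withTax(100.0, 0.10), 0.001);
  }
}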

Try to design your system in a way that most of its functionality is loosely coupled. Some of the functionality has to be tested with mock tests (a fully decoupled system is useless).

Integration tests are not the way to test your system as a whole. They should only be used to test aspects of the system that can break with the integration of multiple units. If you try to cover your whole system with integration tests, you'll enter the permutation casino.

So your strategy for GUI testing should be clear: the parts of the GUI code that cannot be tested in isolation should be tested with mock tests (when this button is pressed, service X is called with parameter y).

Databases muddy the water a bit. You cannot mock a database, unless you're going to reimplement the behavior of every database you would like to support. But that is not a unit test, as you're integrating an external system. I've made peace with this conceptual problem and think of the DAO and database as one inseparable unit that can be tested with a state-based approach. (Sadly, this unit behaves differently on its Oracle day than on its MySQL day. And it may break in the middle and tell you that it cannot talk to itself.)

Thomas Jung