Recently I wrote a suite of unit tests that relied on a large set of test data. The set contained twelve elements, and while that does not sound like a lot, it was in the context of these tests.

Each element required several properties to be set with unique values. The problem with this approach was that the factory method that created this set of data was huge.

What are the best practices regarding this issue? My application actually reads the data in from a file, but for the tests I used mock data from an in-memory store.
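
For illustration, here is a condensed sketch of the kind of setup being described, in Python with unittest. The Node type and its properties are hypothetical stand-ins, since the question doesn't show the real graph elements; the point is that deriving the unique values in a loop keeps the factory small.

    import unittest
    from dataclasses import dataclass

    @dataclass
    class Node:
        # Hypothetical element type; the real graph elements and their
        # properties are not shown in the question.
        node_id: int
        label: str
        weight: float

    def make_test_nodes():
        """In-memory stand-in for the file the application normally reads."""
        # Deriving each unique property value from the index keeps this
        # factory a few lines long, even as the data set grows.
        return [Node(node_id=i, label=f"node-{i}", weight=i * 0.5)
                for i in range(12)]

    class NodeFactoryTest(unittest.TestCase):
        def test_factory_produces_twelve_unique_elements(self):
            nodes = make_test_nodes()
            self.assertEqual(12, len(nodes))
            self.assertEqual(12, len({n.node_id for n in nodes}))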

Any advice?

A: 

Couldn't you programmatically create a subset of items from a controlled dataset of real production data? That's what we do, and if the data model has changed, we have scripts that update that real data to the new model before using it in the tests.
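
A rough sketch of that idea, assuming the controlled dataset is a JSON snapshot on disk; the migrate() hook and the field names are hypothetical:

    import json
    import random

    def migrate(record):
        # Hypothetical migration step: bring an old production record up
        # to the current data model before tests consume it.
        record.setdefault("weight", 1.0)
        return record

    def load_test_subset(path, size, seed=42):
        """Draw a reproducible subset from a controlled production snapshot."""
        with open(path) as f:
            records = [migrate(r) for r in json.load(f)]
        rng = random.Random(seed)  # fixed seed keeps the subset deterministic
        return rng.sample(records, size)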

Sergi
Would you not in turn have to test this process? In other words, I'd need unit tests to verify that I'm creating a valid list from the real data? The elements are part of a graph, hence the need for them to be valid.
Finglas
+5  A: 

What do your tests look like?

Are you sure that you are writing unit tests and not higher-level tests of multiple components of your code? A pure unit test should only be calling a single method, and that method will hopefully have few calls of its own to other methods (which can be stubbed out via mocking).

By focusing on the smallest unit possible, you can write code to test specific edge cases. Whereas if you are testing at a higher level, you will often have to write all sorts of permutations of edge cases. Once you have all the smallest units covered, you can write some higher-level integration tests to make sure that all those units are assembled correctly.

For example, if I had an application that reads in a CSV file of stock quotes and averages all the quotes for a given day, I would write several tests:

  • Unit tests around the CSV parsing
  • Unit tests around the date grouping
  • Unit tests around the averaging
  • Unit tests around the display of the answer
  • And a small number of integration tests that might take a very small CSV file and pass it through the entire process (a sketch of the unit-level tests follows this list).
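
To make that concrete, here is a minimal sketch of the first and third bullets in Python's unittest; parse_quotes and average_by_day are hypothetical names for the units under test:

    import csv
    import io
    import unittest
    from collections import defaultdict

    def parse_quotes(text):
        """Parse 'date,price' CSV rows into (date, float) pairs."""
        return [(row[0], float(row[1]))
                for row in csv.reader(io.StringIO(text))]

    def average_by_day(quotes):
        """Average all quoted prices for each day."""
        by_day = defaultdict(list)
        for day, price in quotes:
            by_day[day].append(price)
        return {day: sum(ps) / len(ps) for day, ps in by_day.items()}

    class ParsingTest(unittest.TestCase):
        def test_parses_one_row(self):
            self.assertEqual([("2009-11-20", 10.5)],
                             parse_quotes("2009-11-20,10.5"))

    class AveragingTest(unittest.TestCase):
        def test_averages_two_quotes_on_same_day(self):
            quotes = [("2009-11-20", 10.0), ("2009-11-20", 20.0)]
            self.assertEqual({"2009-11-20": 15.0}, average_by_day(quotes))

Each test needs only the one or two data points its edge case requires, rather than a shared twelve-element set.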

I apologize if I am making assumptions about your unit tests, but from my experience I find that what people call unit tests are often not real unit tests but rather integration tests (or whatever you prefer to call them, e.g. functional tests). I am personally very guilty of writing tests that were too broad, and whenever I write tests now I have to force myself to really test one unit at a time.

John Paulett
Sounds right, actually. One question though: I'm testing a search method whose internals use several classes that are in turn unit tested. How do I confirm the search method works? Write an integration test or two? It's just to ensure I wire up the inside of the method correctly.
Finglas
@Dockers, that sounds perfectly acceptable. I think integration tests still have an important place in making sure that all the units are wired correctly and play together well. I would probably test as much of the search method as possible using mocks, then write a few quick integration tests to make sure everything is wired properly. I don't mean to criticize your tests at all; it sounds like you have thoroughly tested your application, and I definitely applaud that!
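
One possible shape for the mocked part, using Python's unittest.mock; SearchService and its collaborators are hypothetical stand-ins for the classes described:

    import unittest
    from unittest import mock

    class SearchService:
        # Hypothetical method under test: it only wires together
        # collaborators that are already unit tested on their own.
        def __init__(self, index, ranker):
            self.index = index
            self.ranker = ranker

        def search(self, term):
            return self.ranker.rank(self.index.lookup(term))

    class SearchWiringTest(unittest.TestCase):
        def test_search_passes_lookup_results_to_the_ranker(self):
            index = mock.Mock()
            ranker = mock.Mock()
            index.lookup.return_value = ["a", "b"]
            ranker.rank.return_value = ["b", "a"]

            result = SearchService(index, ranker).search("term")

            # Only the wiring is asserted here; the collaborators' own
            # logic is covered by their own unit tests.
            index.lookup.assert_called_once_with("term")
            ranker.rank.assert_called_once_with(["a", "b"])
            self.assertEqual(["b", "a"], result)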
John Paulett
A: 

The method is large. It's horrible to manage and makes the test suite massive.

I have a separate program to generate the test data. The generated test data is stored on disk, version controlled, and available/used in unit tests. The size/complexity of this program (for example, it has its own UI) doesn't affect the size/complexity of the unit tests themselves.

This is a solution to "it's complicated to generate the data" (but I wouldn't recommend it for generating gigabytes of test data, which might well be better generated on-the-fly).

Also, I am doing this for integration tests (not unit tests).
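
As a sketch of the shape of such a generator (the file name, format, and fields are assumptions; the actual program described above is larger and has its own UI):

    # generate_test_data.py -- hypothetical standalone generator whose
    # output is committed to version control and read back by the tests.
    import json
    import os

    def generate(count=12):
        return [{"node_id": i, "label": f"node-{i}", "weight": i * 0.5}
                for i in range(count)]

    if __name__ == "__main__":
        os.makedirs("testdata", exist_ok=True)
        with open("testdata/nodes.json", "w") as f:
            json.dump(generate(), f, indent=2)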

ChrisW
A: 

I'd like to recommend xUnit Test Patterns: Refactoring Test Code by Gerard Meszaros.

This case resembles the General Fixture smell:

Possible Solution: We need to move to a Minimal Fixture to address this problem. This can best be done by using a Fresh Fixture for each test. If we must use a Shared Fixture, we should consider applying the Make Resource Unique refactoring to create a virtual Database Sandbox for each test. (Note that switching to an Immutable Shared Fixture (see Shared Fixture) does not fully address the core of this problem, since it does not help us determine which parts of the fixture are needed by each test; only the parts that are modified are so identified!)
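
In code, a Fresh Fixture built from a small builder function might look like this (a sketch only; make_node and its defaults are hypothetical):

    import unittest

    def make_node(node_id=1, label="node", weight=1.0):
        # Hypothetical one-line builder: sensible defaults for every
        # property, so each test overrides only what it cares about.
        return {"node_id": node_id, "label": label, "weight": weight}

    class FreshFixtureTest(unittest.TestCase):
        def test_heavy_node_is_flagged(self):
            # Minimal Fixture: exactly one node, built fresh for this
            # test, instead of a shared twelve-element data set.
            node = make_node(weight=99.0)
            self.assertGreater(node["weight"], 10.0)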

Ewan Todd
A: 

How many test scenarios does this test data set support?

Ideally, your test data should be broken up so that there are separate test data sets for each scenario. Otherwise your test scenarios are indirectly dependent on each other, which is evil anyway.

In other words, having multiple scenarios share the same data set creates the risk that modifying the shared data set for one scenario inadvertently makes the data incompatible with another scenario.
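
A sketch of that separation, with hypothetical graph scenarios standing in for the real ones:

    # Hypothetical per-scenario factories: each scenario owns its data
    # set, so changing one cannot silently invalidate another's tests.
    def edges_for_cycle_scenario():
        return [("a", "b"), ("b", "c"), ("c", "a")]

    def edges_for_acyclic_scenario():
        return [("a", "b"), ("b", "c")]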

Doug Knesek