In bigger projects my unit tests usually require some "dummy" (sample) data to run with. Some default customers, users, etc. I was wondering what your setup looks like.

  1. How do you organize/maintain this data?
  2. How do you apply it to your unit tests (any automation tool)?
  3. Do you actually require test data or do you think it's useless?

My current solution:

I differentiate between Master data and Sample data where the former will be available when the system goes into production (installed for the first time) and the latter are typical use cases I require for my tests to run (and to play during development).

I store all this in an Excel file (because it's so damn easy to maintain) where each worksheet contains a specific entity (e.g. users, customers, etc.) and is flagged either master or sample.

I have 2 test cases which I (mis)use to import the necessary data:

  1. InitForDevelopment (Create Schema, Import Master data, Import Sample data)
  2. InitForProduction (Create Schema, Import Master data)
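A rough sketch of what those two test cases could look like in MSTest (the class, helper names and workbook path here are placeholders, not my real implementation):

using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace MyProject.Tests
{
    // Mirrors the master/sample flag on each Excel worksheet.
    public enum DataKind { Master, Sample }

    [TestClass]
    public class DatabaseInit
    {
        private const string Workbook = @"TestData\SeedData.xlsx";

        [TestMethod]
        public void InitForProduction()
        {
            CreateSchema();
            ImportWorksheets(Workbook, DataKind.Master);
        }

        [TestMethod]
        public void InitForDevelopment()
        {
            CreateSchema();
            ImportWorksheets(Workbook, DataKind.Master);
            ImportWorksheets(Workbook, DataKind.Sample);
        }

        // Stubs standing in for the real DDL and Excel import code.
        private void CreateSchema() { /* run schema creation scripts */ }
        private void ImportWorksheets(string workbookPath, DataKind kind)
        {
            /* read every worksheet flagged with 'kind' and insert its rows */
        }
    }
}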
+3  A: 

I use the repository pattern and have a dummy repository that's instantiated by the unit tests in question; it provides a known set of data with examples that are both within and outside the valid range for various fields.

This means that I can test my code unchanged, supplying the fake repository from the test project during testing and the production repository at runtime (via dependency injection with Castle).

I don't know of a good web reference for this, but I learnt a lot from Steven Sanderson's Professional ASP.NET MVC 1.0 book published by Apress. The MVC approach naturally provides the separation of concerns necessary to allow your tests to operate with fewer dependencies.

The basic elements are that your repository implements an interface for data access; that same interface is then implemented by a fake repository that you construct in your test project.

In my current project I have an interface thus:

namespace myProject.Abstract
{
    public interface ISeriesRepository
    {
        IQueryable<Series> Series { get; }
    }
}

This is implemented both as my live data repository (using Linq to SQL) and as a fake repository, thus:

namespace myProject.Tests.Respository
{
    class FakeRepository : ISeriesRepository
    {
        private static IQueryable<Series> fakeSeries = new List<Series> {
            new Series { id = 1, name = "Series1", openingDate = new DateTime(2001,1,1) },
            new Series { id = 2, name = "Series2", openingDate = new DateTime(2002,1,30) },
            ...
            new Series { id = 10, name = "Series10", openingDate = new DateTime(2001,5,5) }
        }.AsQueryable();

        public IQueryable<Series> Series
        {
            get { return fakeSeries; }
        }
    }
}

Then the class that consumes the data is instantiated by passing the repository reference to its constructor:

namespace myProject
{
    public class SeriesProcessor
    {
        private ISeriesRepository seriesRepository;

        public SeriesProcessor(ISeriesRepository seriesRepository)
        {
            this.seriesRepository = seriesRepository;
        }

        public IQueryable<Series> GetCurrentSeries()
        {
            return from s in seriesRepository.Series
                   where s.openingDate.Date <= DateTime.Now.Date
                   select s;
        }
    }
}

Then in my tests I can approach it thus:

namespace myProject.Tests
{
    [TestClass]
    public class SeriesTests
    {
        [TestMethod]
        public void Meaningful_Test_Name()
        {
            // Arrange
            SeriesProcessor processor = new SeriesProcessor(new FakeRepository());

            // Act
            IQueryable<Series> currentSeries = processor.GetCurrentSeries();

            // Assert
            Assert.AreEqual(10, currentSeries.Count());
        }

    }
}

Then look at Castle Windsor for an inversion of control approach in your live project, so that your production code automatically instantiates your live repository through dependency injection. That should get you closer to where you need to be.
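As a minimal sketch of the Windsor wiring (SqlSeriesRepository is a placeholder name for the real Linq to SQL implementation of ISeriesRepository, not something from the code above):

using Castle.MicroKernel.Registration;
using Castle.Windsor;

namespace myProject
{
    public static class Bootstrapper
    {
        // Composition root for the live application.
        public static SeriesProcessor CreateSeriesProcessor()
        {
            var container = new WindsorContainer();
            container.Register(
                Component.For<ISeriesRepository>().ImplementedBy<SqlSeriesRepository>(),
                Component.For<SeriesProcessor>());

            // Windsor injects the live repository through the constructor.
            return container.Resolve<SeriesProcessor>();
        }
    }
}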

Lazarus
Sounds like an interesting approach, but somehow I don't fully understand it. Do you have any references on the web for this approach?
Michal
+1  A: 

In our company we have been discussing exactly this problem for weeks and months.

To follow the guidelines of unit testing:

Each test must be atomic and must not depend on any other test (no data sharing); that means each test must set up its own data at the beginning and clean it up at the end.

Our product is so complex (5 years of development, over 100 tables in the database) that it is nearly impossible to maintain this in an acceptable way.

We tried database scripts which create and delete the data before/after each test (there are automatic methods which call them).
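The pattern looked roughly like this (MSTest shown; the script names and connection string are placeholders):

using System.Data.SqlClient;
using System.IO;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class CustomerTests
{
    private const string ConnectionString = "Server=.;Database=TestDb;Integrated Security=true";

    // Executes a plain SQL script against the test database.
    private static void RunScript(string scriptPath)
    {
        using (var connection = new SqlConnection(ConnectionString))
        using (var command = new SqlCommand(File.ReadAllText(scriptPath), connection))
        {
            connection.Open();
            command.ExecuteNonQuery();
        }
    }

    [TestInitialize]
    public void CreateTestData() { RunScript(@"TestData\create_customer_data.sql"); }

    [TestCleanup]
    public void RemoveTestData() { RunScript(@"TestData\delete_customer_data.sql"); }
}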

I would say you are on the right track with Excel files.

Some ideas from me to improve it a little:

  • If you have a database behind your software, google for "NDbUnit". It's a framework for inserting and deleting data in databases for unit tests.
  • If you have no database, XML may be a little more flexible than a system like Excel.
Kovu
+1  A: 

Not directly answering the question, but one way to limit the number of tests that need to use dummy data is to use a mocking framework to create mocked objects that you can use to fake the behavior of any dependencies you have in a class.

I find that by using mocked objects rather than a specific concrete implementation you can drastically reduce the amount of real data you need, as mocks don't process the data you pass into them. They just behave exactly as you tell them to.
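For example, with a mocking framework such as Moq (my choice here; others work much the same way), the repository interface from the earlier answer could be faked inline instead of hand-writing a fake class:

using System;
using System.Linq;
using Moq;

// Inside a test method: stub the Series property with just enough data for this test.
var repository = new Mock<ISeriesRepository>();
repository.Setup(r => r.Series).Returns(new[]
{
    new Series { id = 1, name = "Series1", openingDate = new DateTime(2001, 1, 1) }
}.AsQueryable());

var processor = new SeriesProcessor(repository.Object);
Assert.AreEqual(1, processor.GetCurrentSeries().Count());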

I'm sure you'll still need dummy data in a lot of instances, so apologies if you're already using or are aware of mocking frameworks.

James Hay
A: 

Just to be clear, you need to differentiate between UNIT testing (testing a module with no implied dependencies on other modules) and app testing (testing parts of the application).

For the former, you need a mocking framework (I'm only familiar with Perl ones, but I'm sure they exist in Java/C#). A sign of a good framework would be the ability to take a running app, RECORD all the method calls/returns, and then mock the selected methods (e.g. the ones you are not testing in this specific unit test) using the recorded data. For good unit tests you MUST mock every external dependency - e.g., no calls to the filesystem, no calls to the DB or other data access layers unless that is what you are testing, etc...

For the latter, the same mocking framework is useful, plus the ability to create test data sets (which can be reset for each test). The data to be loaded for the tests can reside in any offline storage that you can load from - BCP files for Sybase DB data, XML, whatever tickles your fancy. We use both BCP and XML.
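As a rough illustration of the XML side in C# (the file, table and connection string are placeholders), a DataSet can load the offline data and SqlBulkCopy can push it into the test database:

using System.Data;
using System.Data.SqlClient;

// Typically called from a per-test or per-fixture setup method.
var data = new DataSet();
data.ReadXml(@"TestData\customers.xml");

using (var bulkCopy = new SqlBulkCopy("Server=.;Database=TestDb;Integrated Security=true"))
{
    bulkCopy.DestinationTableName = "Customers";
    bulkCopy.WriteToServer(data.Tables["Customers"]);
}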

Please note that this sort of "load test data into DB" testing is SIGNIFICANTLY easier if your overall company framework allows - or rather enforces - a "What is the real DB table name for this table alias" API. That way, you can cause your application to look at cloned "test" DB tables instead of real ones during testing - on top of such a table aliasing API's main purpose of enabling one to move DB tables from one database to another.
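A toy sketch of what such a table-aliasing lookup might look like (all names are invented for illustration):

using System.Collections.Generic;

// Maps logical table aliases to physical table names; test configuration
// points the aliases at cloned "test" tables instead of the real ones.
public class TableNameResolver
{
    private readonly IDictionary<string, string> map;

    public TableNameResolver(bool useTestTables)
    {
        map = useTestTables
            ? new Dictionary<string, string> { { "Customers", "test.Customers_Clone" } }
            : new Dictionary<string, string> { { "Customers", "dbo.Customers" } };
    }

    // The data access layer builds all SQL from the resolved physical name,
    // so tests can be redirected without touching application code.
    public string Resolve(string alias)
    {
        return map[alias];
    }
}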

DVK