Hi, we are facing a problem in managing test data (XML files used to create mock objects). The data we currently have has evolved over a long period of time: each time we add new functionality or a test case, we add new data to exercise it. The problem is that when a business requirement changes the format of a field (for example its length or layout), or introduces any change the existing test data doesn't support, we have to update the entire test data set, which is hundreds of MBs in size. Could anyone suggest a better method or process to overcome this problem? Any suggestion would be appreciated.
Personally, I would stay away from creating data for test cases anywhere other than within the test cases themselves. Instead of maintaining shared test data, write data generators (builders) that let each test case or before block quickly construct exactly the objects it needs; see the sketch after the list below.
This has two main advantages:
- It makes the tests much easier to read as the developer can see exactly what objects are being used, and
- It should greatly cut down on the amount of test data that you need to manage.
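As a rough sketch of what such a generator can look like (the `Customer`, `CustomerBuilder` and `OrderService` names here are hypothetical, not from your code base), a test data builder supplies sensible defaults so each test overrides only the fields it cares about. When a format changes, you fix the default in one place instead of editing hundreds of MBs of XML:

    import org.junit.Test;
    import static org.junit.Assert.assertFalse;

    // Hypothetical domain object, standing in for whatever the XML fixtures describe.
    class Customer {
        final String name;
        final String accountNumber;
        final int creditLimit;

        Customer(String name, String accountNumber, int creditLimit) {
            this.name = name;
            this.accountNumber = accountNumber;
            this.creditLimit = creditLimit;
        }
    }

    // Builder with sensible defaults; a format change (e.g. account number length)
    // is absorbed here rather than in every fixture file.
    class CustomerBuilder {
        private String name = "Default Name";
        private String accountNumber = "ACC-00000001";
        private int creditLimit = 1000;

        CustomerBuilder withName(String name) { this.name = name; return this; }
        CustomerBuilder withAccountNumber(String number) { this.accountNumber = number; return this; }
        CustomerBuilder withCreditLimit(int limit) { this.creditLimit = limit; return this; }

        Customer build() { return new Customer(name, accountNumber, creditLimit); }
    }

    // Trivial service, included only so the example compiles and runs on its own.
    class OrderService {
        boolean accept(Customer customer, int orderAmount) {
            return orderAmount <= customer.creditLimit;
        }
    }

    public class OrderServiceTest {
        @Test
        public void rejectsOrderOverCreditLimit() {
            // The test spells out only the field that matters; everything else is defaulted.
            Customer customer = new CustomerBuilder().withCreditLimit(0).build();
            assertFalse(new OrderService().accept(customer, 500));
        }
    }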
Reserve shared test data for functional and integration tests, and use a tool like DBDeploy to manage it. That data set should be kept intentionally small. Using DBDeploy and DBUnit lets you clean and re-seed the database before each test or test suite, which also limits the amount of data you need because it greatly increases data reuse.
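A minimal sketch of the DBUnit side, assuming an in-memory H2 database and a small hand-maintained `customers.xml` fixture (both names are placeholders): `CLEAN_INSERT` deletes the contents of the tables named in the dataset and re-inserts the fixture rows, so every test starts from the same small, known state.

    import java.sql.Connection;
    import java.sql.DriverManager;

    import org.dbunit.database.DatabaseConnection;
    import org.dbunit.database.IDatabaseConnection;
    import org.dbunit.dataset.IDataSet;
    import org.dbunit.dataset.xml.FlatXmlDataSetBuilder;
    import org.dbunit.operation.DatabaseOperation;
    import org.junit.Before;

    public class CustomerRepositoryIT {

        @Before
        public void seedDatabase() throws Exception {
            // JDBC URL and fixture path are illustrative.
            Connection jdbc = DriverManager.getConnection("jdbc:h2:mem:testdb", "sa", "");
            IDatabaseConnection connection = new DatabaseConnection(jdbc);

            // A deliberately small dataset: a few representative rows, not 100s of MB.
            IDataSet dataSet = new FlatXmlDataSetBuilder()
                    .build(getClass().getResourceAsStream("/fixtures/customers.xml"));

            // Clean the tables listed in the dataset and reload them before each test.
            DatabaseOperation.CLEAN_INSERT.execute(connection, dataSet);
            connection.close();
        }

        // ... integration tests run against the freshly seeded database ...
    }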
While this is not a complete solution to your problem, it would definitely help (especially in your case, since you have hundreds of MBs of data): write tests based on behavior verification instead of data verification.
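To illustrate the difference, here is a minimal sketch using Mockito with a made-up repository/service pair: instead of asserting against a large expected data set, the test verifies that the right interaction happened, so no bulky expected-state fixture is needed.

    import static org.mockito.Mockito.mock;
    import static org.mockito.Mockito.verify;

    import org.junit.Test;

    public class InvoiceServiceTest {

        // Hypothetical collaborator, standing in for whatever the XML-backed mocks simulate.
        interface InvoiceRepository {
            void save(String customerId, int amount);
        }

        static class InvoiceService {
            private final InvoiceRepository repository;
            InvoiceService(InvoiceRepository repository) { this.repository = repository; }
            void bill(String customerId, int amount) { repository.save(customerId, amount); }
        }

        @Test
        public void billingSavesAnInvoice() {
            InvoiceRepository repository = mock(InvoiceRepository.class);
            InvoiceService service = new InvoiceService(repository);

            service.bill("C-42", 100);

            // Behavior verification: assert the interaction, not the resulting data.
            verify(repository).save("C-42", 100);
        }
    }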
Martin Fowler has a very good article here.