I'm currently doing performance and load testing of a complex multi-tier system, investigating the effect of different changes, but I'm having problems keeping track of everything:
- There are many copies of different assemblies:
  - Originally released assemblies
  - Officially released hotfixes
  - Assemblies that I've built containing further additional fixes
  - Assemblies that I've built containing additional diagnostic logging or tracing
- There are many database patches; some of the above assemblies depend on certain database patches being applied.
- Many different logging levels exist in the different tiers (application logging, application performance statistics, SQL Server profiling).
- There are many different scenarios; sometimes it is useful to test only one scenario, while other times I need to test combinations of different scenarios.
- Load may be split across multiple machines or run on only a single machine.
- The data present in the database can change; for example, some tests might be done with generated data, and then later with data taken from a live system.
- There is a massive amount of potential performance data to be collected after each test, for example:
  - Many different types of application-specific logging
  - SQL Profiler traces
  - Event logs
  - DMVs
  - Perfmon counters
- The database(s) are several GB in size, so where I would have used backups to revert to a previous state, I instead tend to apply changes to whatever database is left over from the last test, which means I quickly lose track of its state.
I collect as much information as I can about each test I do (the scenario tested, which patches are applied, what data is in the database), but I still find myself having to repeat tests because of inconsistent results. For example, I just ran a test that I believed to be an exact duplicate of one I ran a few months ago, but with updated data in the database. I know for a fact that the new data should cause a performance degradation; however, the results show the opposite!
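To give an idea of what I mean by "information about each test", this is roughly the kind of per-run record I'm trying to keep. The sketch below is just a minimal Python illustration; all the field names, paths and values are invented placeholders, not anything my system actually uses:

```python
import json
import datetime
import pathlib

def record_test_run(results_dir, scenario, assemblies, db_patches,
                    data_set, logging_levels, notes=""):
    """Write a small JSON manifest describing one test run next to its collected output."""
    manifest = {
        "timestamp": datetime.datetime.now().isoformat(),
        "scenario": scenario,              # e.g. a single scenario or a combination
        "assemblies": assemblies,          # assembly name -> version / build description
        "db_patches": db_patches,          # database patches applied at the time of the run
        "data_set": data_set,              # e.g. "generated" vs. "copy of live data"
        "logging_levels": logging_levels,  # per-tier logging configuration
        "notes": notes,
    }
    out = pathlib.Path(results_dir) / "manifest.json"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(manifest, indent=2))
    return out

# Example usage (all values made up for illustration):
record_test_run(
    results_dir=r"results\run01",
    scenario=["scenario-A"],
    assemblies={"Orders.dll": "official hotfix + extra tracing build"},
    db_patches=["patch-017", "patch-018"],
    data_set="copy of live data",
    logging_levels={"app": "verbose", "sql_profiler": "on", "perfmon": "default counter set"},
)
```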
At the same time, I find myself spending a disproportionate amount of time recording all these details.
One thing I considered was using scripting to automate the collection of performance data etc., but I wasn't sure this was such a good idea: not only is it time spent developing scripts instead of testing, but bugs in my scripts could cause me to lose track of things even more quickly.
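For context, the level of scripting I had in mind is roughly the sketch below: copy the raw output of a run (application logs, Perfmon output, SQL traces) into a single timestamped folder. The paths and folder names are invented for illustration; the real log locations obviously differ:

```python
import shutil
import pathlib
import datetime

# Folders to harvest after a run; these paths are placeholders for illustration only.
SOURCES = {
    "app_logs":   r"C:\AppServer\Logs",
    "perfmon":    r"C:\PerfLogs",
    "sql_traces": r"C:\SqlTraces",
}

def collect_results(run_name, destination_root=r"D:\LoadTestResults"):
    """Copy the raw output of one test run into a single timestamped folder."""
    stamp = datetime.datetime.now().strftime("%Y-%m-%d_%H%M%S")
    run_dir = pathlib.Path(destination_root) / f"{stamp}_{run_name}"
    for label, source in SOURCES.items():
        src = pathlib.Path(source)
        if src.exists():
            shutil.copytree(src, run_dir / label)
    return run_dir

# e.g. collect_results("scenario-A_patch017_live-data")
```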
I'm after some advice/hints on how to better manage the test environment, in particular how to strike a balance between collecting everything and actually getting some testing done, without risking missing something important.