Our software manages a lot of data feeds from various sources: real-time replicated databases, files FTPed automatically, scheduled stored procedures that cache snapshots of data from linked servers, and numerous other methods of acquiring data.
We need to verify and validate this data:
- has an import even happened
- is the data reasonable (null values, number of rows, etc.)
- does the data reconcile with other values (perhaps we have multiple sources for similar data)
- is it out of date, so that the import needs manual prompting
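To make the checks concrete, here is a minimal sketch of what the four categories above might look like as plain functions. All names, thresholds, and data shapes are illustrative assumptions, not part of any existing system (sketched in Python purely as a stand-in for whatever .NET language you use):

```python
from datetime import datetime, timedelta

def check_import_happened(row_count):
    """Has an import even happened?"""
    return row_count > 0

def check_reasonable(rows, required_fields, min_rows=1):
    """Is the data reasonable: enough rows, no null values in key fields?"""
    if len(rows) < min_rows:
        return False
    return all(row.get(f) is not None for row in rows for f in required_fields)

def check_reconciles(total_a, total_b, tolerance=0.01):
    """Do two sources for similar data agree within a relative tolerance?"""
    return abs(total_a - total_b) <= tolerance * max(abs(total_a), abs(total_b), 1)

def check_fresh(last_import, max_age=timedelta(hours=24)):
    """Is the feed fresh, or does the import need manual prompting?"""
    return datetime.now() - last_import <= max_age
```

Each function returns a simple pass/fail, which is what makes the unit-testing analogy below tempting: any test runner already knows how to collect and report booleans like these.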
In many ways this is like unit testing: there are many types of check to make, new checks can simply be added to the list, and each class of test can be re-run in response to a particular event. There are already nice GUIs for running tests, and some can even schedule runs.
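Under that analogy, each feed gets a test class and each validation check becomes one test method. The sketch below uses Python's `unittest` purely as a stand-in (in a .NET shop the equivalent would be an NUnit or MSTest fixture); the feed data and the second-source total are made-up examples:

```python
import unittest

# Toy stand-in for a feed loaded from one of the sources.
FEED = [{"id": 1, "value": 10.0}, {"id": 2, "value": 12.5}]

class OrdersFeedValidation(unittest.TestCase):
    """One test per check; a runner GUI or scheduler can re-run the
    whole class in response to an event such as a completed import."""

    def test_import_happened(self):
        self.assertGreater(len(FEED), 0)

    def test_no_null_values(self):
        for row in FEED:
            self.assertIsNotNone(row["value"])

    def test_reconciles_with_other_source(self):
        other_source_total = 22.5  # hypothetical second source
        self.assertAlmostEqual(sum(r["value"] for r in FEED),
                               other_source_total, places=2)
```

The appeal of this pattern is that failure reporting, selective re-runs, and result history all come for free from the existing test tooling.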
Is this a good approach? Are there better, similarly generalised, patterns for data validation?
We're a .NET shop; would Windows Workflow (WF) be a better, more flexible solution?