Part of my work involves creating reports and data sets from SQL Server to support decision-making. Most of the data is aggregated: inventory, sales, and cost totals by department and other dimensions.

When I create the reports, and more specifically when I develop the SELECT statements that extract the aggregated data from the OLTP database, I worry about getting a JOIN or a GROUP BY wrong and returning incorrect results.

I try to follow some "best practices" to keep myself from "generating" wrong numbers:

  • When creating an aggregated data set, always run the same query without the aggregation (the "exploded" data set) and look for obvious errors.
  • Export the exploded data set to Excel and compare the SUM(), AVG(), etc. computed by SQL Server with the same figures computed in Excel.
  • Involve the people who will use the information and ask them to validate it (ask them to help identify mistakes in the numbers).
  • Never deploy these things in the afternoon. When possible, take another look at the T-SQL the next morning with a fresh mind. I have caught many bugs with this simple procedure.
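The first two checks can be scripted instead of done by hand. Below is a minimal sketch, assuming Python's built-in `sqlite3` as a stand-in for SQL Server; the `sales` and `shipments` tables and their values are hypothetical. It recomputes the totals from the un-joined detail rows and compares them with the aggregated query, which exposes a JOIN that silently duplicates rows.

```python
import sqlite3

# sqlite3 stands in for SQL Server; tables and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, dept TEXT, amount REAL);
    CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY, sale_id INTEGER);
    INSERT INTO sales VALUES (1, 'toys', 100.0), (2, 'toys', 50.0), (3, 'books', 30.0);
    -- Sale 1 shipped in two parcels, so joining to shipments repeats its row.
    INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2), (13, 3);
""")

# The aggregated report query, with a join that fans out.
report = dict(conn.execute("""
    SELECT s.dept, SUM(s.amount)
    FROM sales s
    JOIN shipments sh ON sh.sale_id = s.sale_id
    GROUP BY s.dept
""").fetchall())

# Independent recomputation from the detail rows (the "Excel" side of the check).
expected = {}
for dept, amount in conn.execute("SELECT dept, amount FROM sales"):
    expected[dept] = expected.get(dept, 0.0) + amount

# Any department whose report total differs from the recomputed total.
mismatches = {
    dept: (report.get(dept), total)
    for dept, total in expected.items()
    if report.get(dept) != total
}
print(mismatches)
```

A non-empty `mismatches` means the report and the detail rows disagree; here the toys total is inflated because sale 1 is counted once per shipment.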

Even with those procedures, I always worry about the numbers.

What are your best practices for ensuring the correctness of the reports?

+2  A: 

Have you considered filling your tables with test data that produces known results, and comparing your query results with the expected results?
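A minimal sketch of that idea, assuming Python's built-in `sqlite3` in place of SQL Server; the `sales` table, the `dept_totals` function, and the figures are made up for illustration. The test rows are small enough that the expected totals can be worked out by hand.

```python
import sqlite3

def dept_totals(conn):
    """The report query under test (hypothetical)."""
    return conn.execute(
        "SELECT dept, SUM(qty * unit_price) FROM sales GROUP BY dept ORDER BY dept"
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (dept TEXT, qty INTEGER, unit_price INTEGER)")

# Known test data: totals are easy to compute by hand.
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("books", 2, 10),  # 20
    ("books", 1, 5),   # 5
    ("toys", 3, 7),    # 21
])

expected = [("books", 25), ("toys", 21)]
actual = dept_totals(conn)
assert actual == expected, f"report returned {actual}, expected {expected}"
```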

Athens
+1  A: 
  • Signed, in writing

I've found that one of the best practices is that both the reader/client and the developers are on the same (documented) page. That way, when mysterious numbers appear (and they do), I can point to the specification in writing and say, "This is why you see this number. Would you like it to be different?".

  • Test, test, test

For seriously complicated reports, we went through the test data up and down with the client until all the numbers were correct and the client was satisfied.

  • Edge Cases

We discovered a seriously complicated case in our reporting system that turned everything upside down (on our end). What if the user generates a report (say, Year-End 2009), enters data for the new year, and then comes back to generate the same report? The data has changed, but that report should not. Thinking through and working out these cases can save a lot of heartache.
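One possible way to handle this case is to store a snapshot of the report the first time it is generated and serve the stored rows afterwards. This is only a sketch of that idea, not necessarily how our system does it; it assumes Python's built-in `sqlite3` in place of SQL Server, and the `report_snapshot` table, `get_report` function, and report key are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (dept TEXT, amount INTEGER);
    CREATE TABLE report_snapshot (report_key TEXT, dept TEXT, total INTEGER);
    INSERT INTO sales VALUES ('books', 25), ('toys', 21);
""")

def get_report(conn, report_key):
    """Return the stored snapshot for report_key, creating it on first use."""
    query = ("SELECT dept, total FROM report_snapshot "
             "WHERE report_key = ? ORDER BY dept")
    rows = conn.execute(query, (report_key,)).fetchall()
    if not rows:
        # First request: freeze the current totals under this report key.
        conn.execute("""
            INSERT INTO report_snapshot
            SELECT ?, dept, SUM(amount) FROM sales GROUP BY dept
        """, (report_key,))
        rows = conn.execute(query, (report_key,)).fetchall()
    return rows

first = get_report(conn, "year-end-2009")
conn.execute("INSERT INTO sales VALUES ('books', 100)")  # data entered for the new year
second = get_report(conn, "year-end-2009")
assert first == second  # the regenerated report is unchanged
```

The trade-off is that a snapshot also freezes any mistakes, so corrections to the old period need an explicit way to rebuild it.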

rlb.usa
A: 

Write some automated tests.

We have quite a lot of Reporting Services reports, and we test them using Selenium. We use a test data page to squirt some known data into an empty database, then run the report and assert that the numbers are as expected.

The builds run every time we check in, so we know we haven't done anything too stupid.
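The same pattern can be sketched without a browser. The example below is an assumption-laden illustration, not our actual Selenium setup: it uses Python's `unittest` and `sqlite3` instead of Reporting Services, tests the query directly rather than the rendered report, and the schema and `REPORT_SQL` are hypothetical. Each test starts from an empty database, loads known data, runs the report query, and asserts the numbers.

```python
import sqlite3
import unittest

# Hypothetical report query: total sales per department, zero if none.
REPORT_SQL = """
    SELECT d.name, COALESCE(SUM(s.amount), 0)
    FROM departments d
    LEFT JOIN sales s ON s.dept_id = d.dept_id
    GROUP BY d.name
    ORDER BY d.name
"""

class DeptTotalsReportTest(unittest.TestCase):
    def setUp(self):
        # Every test starts from an empty database with a known schema.
        self.conn = sqlite3.connect(":memory:")
        self.conn.executescript("""
            CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE sales (dept_id INTEGER, amount INTEGER);
        """)

    def tearDown(self):
        self.conn.close()

    def load(self, departments, sales):
        # Insert the known test data for one scenario.
        self.conn.executemany("INSERT INTO departments VALUES (?, ?)", departments)
        self.conn.executemany("INSERT INTO sales VALUES (?, ?)", sales)

    def run_report(self):
        return self.conn.execute(REPORT_SQL).fetchall()

    def test_known_totals(self):
        self.load([(1, "books"), (2, "toys")], [(1, 20), (1, 5), (2, 21)])
        self.assertEqual(self.run_report(), [("books", 25), ("toys", 21)])

    def test_department_without_sales_reports_zero(self):
        self.load([(1, "books"), (2, "toys")], [(1, 20)])
        self.assertEqual(self.run_report(), [("books", 20), ("toys", 0)])
```

Saved in a test module, this can be run on every check-in with `python -m unittest`.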

Jon Spokes