I'm currently in the process of testing our solution, which has the whole "gamut" of layers: UI, middle tier, and the omnipresent database.

Before my arrival on my current team, query testing was done by the testers manually crafting queries that should, in theory, return the same result set as the stored procedure under test, given the various relevancy rules, sorting, what have you.

This had the side effect of bugs being filed against the tester's query more often than against the actual query in question.

I proposed instead working with a known data set, from which you can simply infer what should be returned, since you control the data present -- previously, data was pulled from production, sanitized, and then loaded into our test databases.

People were still insistent on creating their own queries to test what the developers had created. I suspect that many still are. I have it in my mind that this isn't ideal at all, and that it just increases our testing footprint needlessly.

So, I'm curious, which practices do you use to test scenarios like this, and what would be considered ideal for the best end-to-end coverage you can get, without introducing chaotic data?

The issue I have is deciding where best to do which testing. Do I just poke the service directly and compare that data set to the one I can pull from the stored procedure? I have a rough idea and have been successful enough so far, but I feel like we're still missing something important here, so I'm looking to the community for any insights that might help me formulate my testing approach better.
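For illustration, here's roughly what my current approach looks like as a unit test. This is only a sketch: the DSN, the results table, the stored procedure usp_GetResults, and the SearchService class are all made-up placeholders for our real pieces, and pyodbc is just one common way to call a stored procedure from Python.

    import unittest
    import pyodbc  # one common way to call stored procedures from Python
    from myapp.services import SearchService  # hypothetical middle-tier class

    class ServiceVsStoredProcTest(unittest.TestCase):
        def setUp(self):
            # Hypothetical DSN pointing at an isolated test database.
            self.conn = pyodbc.connect("DSN=IsolatedTestDb")
            # Seed a known result set so the expectation can be stated outright.
            self.conn.execute("DELETE FROM results")
            self.conn.execute("INSERT INTO results VALUES (1, 'known row')")
            self.conn.commit()

        def test_service_agrees_with_stored_proc(self):
            # Pull the same data twice -- once via the stored proc, once via
            # the service layer -- and make sure they agree.
            via_proc = self.conn.execute("{CALL usp_GetResults(?)}", 1).fetchall()
            via_service = SearchService().get_results(1)
            self.assertEqual([tuple(r) for r in via_proc], via_service)

    if __name__ == "__main__":
        unittest.main()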

+3  A: 

Testing stored procs requires that each person who tests has a separate instance of the db. This is non-negotiable: if you share environments, you won't be able to rely upon the results of your tests. They'll be worthless.

You will also need to ensure that you roll back the db to its previous state after every test, so as to make the results predictable and stable. Because of this rollback after every test, these tests will take a lot longer to complete than standard unit tests, so they'll probably be something you want to run overnight.

There are a few tools out there to help you with this. DbUnit is one of them, and I also believe Microsoft had a tool, Visual Studio Team Edition for Database Professionals, that contained some support for DB testing.
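As a rough illustration of the roll-back-after-every-test idea, here is a minimal unittest sketch in Python. It uses sqlite3 purely for brevity; the database file name and the orders table are invented, and you'd substitute your real driver and schema:

    import sqlite3
    import unittest

    class StoredProcTest(unittest.TestCase):
        def setUp(self):
            # Each tester points at their own private database instance.
            self.conn = sqlite3.connect("my_private_copy.db")  # hypothetical
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS orders (id INTEGER, total REAL)")

        def test_insert_is_visible_inside_the_test(self):
            self.conn.execute("INSERT INTO orders VALUES (1, 9.99)")
            (count,) = self.conn.execute("SELECT COUNT(*) FROM orders").fetchone()
            self.assertEqual(count, 1)

        def tearDown(self):
            # Undo everything the test did, restoring the previous state.
            self.conn.rollback()
            self.conn.close()

    if __name__ == "__main__":
        unittest.main()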

Justin Bozonier
+1  A: 

As part of our continuous integration, we run our nightly 'build' of the database queries. This involves a suite of DB calls that is updated regularly from the real calls in the code, as well as any expected ad-hoc queries.

These calls are timed to ensure that:

  1. They don't take too long.

  2. Their timings don't differ wildly (in a bad way) from the previous night's.

In this way, we catch errant queries or DB changes quickly.
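A bare-bones sketch of what such a nightly timing check might look like, in Python. The query suite, the thresholds, the baseline file, and the database snapshot are all illustrative assumptions, not our actual tooling:

    import json
    import sqlite3
    import time

    # Hypothetical suite of static queries extracted from the code base.
    QUERIES = {
        "open_orders": "SELECT * FROM orders WHERE status = 'open'",
    }
    MAX_SECONDS = 5.0    # absolute ceiling per query
    MAX_SLOWDOWN = 2.0   # allowed ratio versus the previous night

    def run_suite(conn, baseline):
        timings = {}
        for name, sql in QUERIES.items():
            start = time.perf_counter()
            conn.execute(sql).fetchall()
            elapsed = time.perf_counter() - start
            timings[name] = elapsed
            assert elapsed <= MAX_SECONDS, f"{name} took {elapsed:.2f}s"
            if name in baseline:
                assert elapsed <= baseline[name] * MAX_SLOWDOWN, \
                    f"{name} differs wildly from last night"
        return timings

    if __name__ == "__main__":
        conn = sqlite3.connect("nightly_snapshot.db")  # hypothetical copy
        try:
            with open("timings.json") as f:
                baseline = json.load(f)
        except FileNotFoundError:
            baseline = {}
        with open("timings.json", "w") as f:
            json.dump(run_suite(conn, baseline), f)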

paxdiablo
These "suit of DB calls", are they clones of the real calls? And if so, do you maintain these along with any queries in code?
Steven Raybell
We extract each real call from the code and do some regex magic to make a static query from it, then add this static call to the suite. We also store the source line so we can detect changes and redo it.
paxdiablo
Interesting! Thanks!
Steven Raybell
+3  A: 

Here are some guidelines:

  1. Use an isolated database for unit testing (i.e. no other test runs or activity)
  2. Always insert all the test data you intend to query within the same test
  3. Write the tests to randomly create different volumes of data, e.g. a random number of inserts, say between 1 and 10 rows
  4. Randomize the data, e.g. for a boolean field, randomly insert true or false
  5. Keep a count in the test of the key variables (e.g. number of rows, number of trues)
  6. For the asserts, execute the query and compare the results against your local test variables
  7. Use Enterprise Services transactions to roll back the database to its previous state

See the link below for the Enterprise Services transaction technique:

http://weblogs.asp.net/rosherove/articles/DbUnitTesting.aspx
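A minimal sketch of guidelines 2 through 6 in Python's unittest, against a hypothetical flags table. sqlite3's in-memory database stands in for a real isolated instance, and a plain rollback stands in for the Enterprise Services transaction:

    import random
    import sqlite3
    import unittest

    class RandomizedQueryTest(unittest.TestCase):
        def setUp(self):
            self.conn = sqlite3.connect(":memory:")  # isolated per test run
            self.conn.execute("CREATE TABLE flags (id INTEGER, active INTEGER)")

        def test_count_of_true_rows(self):
            n_rows = random.randint(1, 10)      # random volume of data
            n_true = 0
            for i in range(n_rows):
                active = random.choice([0, 1])  # randomized boolean field
                n_true += active
                self.conn.execute("INSERT INTO flags VALUES (?, ?)", (i, active))
            # Assert: the query under test must agree with the local counters.
            (actual,) = self.conn.execute(
                "SELECT COUNT(*) FROM flags WHERE active = 1").fetchone()
            self.assertEqual(actual, n_true)

        def tearDown(self):
            self.conn.rollback()  # return the database to its previous state
            self.conn.close()

    if __name__ == "__main__":
        unittest.main()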

1-10 rows isn't going to test much. And especially not performance on 10M-1B rows.
Jonathan Leffler
+1  A: 

The query planner is your friend, especially in this case. It is always good practice to check to see that indexes are used when you expect them to be and that the query doesn't require extra work to be done. Even if you have stress tests included in your suite, it is still a good idea to catch expensive queries before your app starts grinding to a halt.
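For instance, a test can ask the planner for its plan and fail if an expected index goes unused. The sketch below uses SQLite's EXPLAIN QUERY PLAN with an invented table and index; the exact syntax differs per engine (EXPLAIN in MySQL/PostgreSQL, SET SHOWPLAN_ALL in SQL Server):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

    # Ask the planner how it would execute the query under test.
    plan = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42"
    ).fetchall()
    plan_text = " ".join(str(row) for row in plan)

    # Fail fast if the expected index is not being used.
    assert "idx_orders_customer" in plan_text, f"index not used: {plan_text}"
    print("plan OK:", plan_text)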

Dana the Sane
A: 

I find it useful to test the SQL being sent down to the database rather than the result of querying the database.

Not that I don't do the latter, but I find the former much faster to test, since it avoids having the database do the heavy lifting.
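One way to do this, sketched in Python: unit-test the statement your code builds, without any database round trip. The build_order_query function here is a hypothetical stand-in for whatever generates your SQL:

    import unittest

    def build_order_query(status, limit):
        """Hypothetical stand-in for the code that generates SQL."""
        sql = ("SELECT id, total FROM orders WHERE status = ? "
               "ORDER BY total DESC LIMIT ?")
        return sql, (status, limit)

    class GeneratedSqlTest(unittest.TestCase):
        def test_query_shape(self):
            sql, params = build_order_query("open", 10)
            # No database round trip: just inspect the statement we would send.
            self.assertIn("WHERE status = ?", sql)
            self.assertIn("ORDER BY total DESC", sql)
            self.assertEqual(params, ("open", 10))

    if __name__ == "__main__":
        unittest.main()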

Allain Lalonde
+1  A: 

We have a blank database set aside for each developer and tester.

When the tests are run, each test clears the database and loads the data it expects to use. This gives us a known state at all times.

We can then test several different scenarios on the same DB (one after the other) and we never step on another tester's toes.

That covers testing the data access itself. For service testing we do much the same thing, but we test the inside of the service only -- we don't actually hit the service; we create an instance of the service-processing class and pass in everything we need. That way we are testing the code and not the infrastructure (messaging, etc.).
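A sketch of both halves of that, under assumed names: OrderService stands in for the service-processing class, and sqlite3's in-memory database stands in for each tester's private DB. Each test clears and loads exactly the data it expects:

    import sqlite3
    import unittest

    class OrderService:
        """Hypothetical service-processing class; the real one would sit
        behind the service/messaging infrastructure we deliberately skip."""
        def __init__(self, conn):
            self.conn = conn

        def open_order_count(self):
            (n,) = self.conn.execute(
                "SELECT COUNT(*) FROM orders WHERE status = 'open'").fetchone()
            return n

    class ServiceTest(unittest.TestCase):
        def setUp(self):
            # Clear the database and load exactly the data this test expects,
            # giving a known state at all times.
            self.conn = sqlite3.connect(":memory:")
            self.conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
            self.conn.executemany("INSERT INTO orders VALUES (?, ?)",
                                  [(1, "open"), (2, "closed"), (3, "open")])

        def test_open_order_count(self):
            # Instantiate the processing class directly: no endpoints,
            # no messages, just the code under test.
            self.assertEqual(OrderService(self.conn).open_order_count(), 2)

    if __name__ == "__main__":
        unittest.main()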

Brody
+1  A: 

Django offers a database unit test capability. You can borrow their design ideas and reproduce it in other environments.

The Django folks offer a subclass of Python's standard unittest TestCase class that populates a database with a known fixture -- a known set of data rows.

In the case of Django (and Python) it's easiest to populate the database from a JSON data extract. Other file formats for the fixture can be used for other frameworks. For example, if you're working in Oracle, you might find CSV files easier to work with.

This TestCase subclass allows you to write a typical-looking test case that exercises the database with the known data fixture.

Additionally, the Django test runner creates a temporary schema for test purposes. This is easy for Django because it has a complete object-relational mapping component that includes DDL generation. If you don't have this available, you'll still need a DDL script so you can create and dispose of a test schema for unit-test purposes.
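In Django itself, that looks roughly like this; the app, the Order model, and the fixture file name are invented for the example:

    from django.test import TestCase
    from myapp.models import Order  # hypothetical app and model

    class OrderQueryTest(TestCase):
        # Django loads this JSON fixture into a fresh test schema before
        # each test, and restores the database state afterwards.
        fixtures = ["known_orders.json"]  # hypothetical fixture file

        def test_open_orders(self):
            # Because the fixture is a known set of rows, the expected
            # count can be stated outright.
            self.assertEqual(Order.objects.filter(status="open").count(), 2)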

S.Lott
A: 

So, I've seen several answers that basically boil down to the same end result:

  • Isolate (be it per individual or otherwise)
  • Each test run should zap all data and load whatever that test needs (isolate each test run)
  • Gather "Statistics" where appropriate (row counts, etc.)
  • Take advantage of tools (Enterprise Services Transactions, DbUnit, Django, etc.)

If I'm missing anything, or if anybody else has more experience to add, it'd be much appreciated. I'm going to try to build a case for some environment changes (resources permitting) to make this a bit more of a reality.

Thanks to everyone thus far for the responses. They provided some validation, as well as new angles on my approach that could help improve confidence in the test results.

Steven Raybell
+1  A: 

SQLServerCentral has an article here (you may have to register, but it's free and has no strings attached) about a T-SQL unit-testing framework called tsqlUnit. It is open source and follows in the tradition of the xUnit frameworks.

It follows the SEAT TDD pattern:

Setup - prepare the test conditions by manipulating the objects, tables, and/or data

Exercise - invoke the production code

Assert - check that the actual result equals the expected result

Teardown - return everything to the way it was before the test started. This is done by rolling back a transaction, which keeps everything nice and tidy.

Although I have not used it, it looks promising and is certainly something I will be looking at in more detail.

The framework can be downloaded here.

jheppinstall