views: 551

answers: 4

Our development databases (Oracle 9i) use a database link to a remote shared database.

This decision was made years ago when it wasn't practical to put some of the database schemas on a development machine - they were too big.

We have certain schemas on the development machines and we make the remote schemas look local by using Oracle's database links, together with some synonyms on the development machines.

The problem I have is that I would like to test a piece of SQL which joins tables in schemas on either side of the database link.

e.g. (a simplified case):

   select a.col, b.col
   from a, b
   where a.b_id = b.id
  • a is on the local database
  • b is on the remote database
  • I have a synonym on the local DB so that 'b' actually points at b@remotedb.

Running the query takes ages in the development environment because of the link. The queries run fine in production (I don't think the Oracle cost-based optimiser can cope very well with database links).

We have not been very good at writing unit tests for these types of queries in the past - probably due to the poor performance - so I'd like to start creating some tests for them.

Does anyone have any strategies for writing a unit test for such a query, so as to avoid the performance problems of using the database link?

I'd normally be looking at ways of mocking out the remote service, but since all this is in a SQL query, I can't see any way of easily mocking out the remote database.

+2  A: 

Copy the relevant data into your development database and create the tables locally.

Ideally, just build a test case which tells you:

  1. The SQL is correct (it parses)
  2. It operates correctly with a few rows of test data

Don't fall for the "let's copy everything" because that means you'll have no idea what you're testing anymore (and what you're missing).

If in doubt, create a table b with just a single record. If you get an error in this area, add more rows as you learn where it can fail.

If you want to take this to the edge, create the test table (with all data) in a unit test. This way, you can document the test data you're using.
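
A minimal fixture along those lines, assuming the synonym can be repointed for the duration of the test (names taken from the simplified example above):

    -- Stand-in for the remote table, with one well-known row.
    create table b_local (id number primary key, col varchar2(30));
    insert into b_local (id, col) values (1, 'expected');

    -- Point the existing synonym at the stand-in for the test run.
    create or replace synonym b for b_local;

    -- ... run the query under test; it now executes entirely locally ...

    -- Tear down: repoint the synonym and drop the stand-in.
    create or replace synonym b for b@remotedb;
    drop table b_local;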

[EDIT] What you need is a test database. Don't run tests against a database which can change. Ideally, the tests should tear down the whole database and recreate it from scratch (tables, indexes, data, everything) as the first step.

In this test database, only keep well-defined test data that changes only when new tests are defined (and not because someone was "just doing something"). If you can, try to run your tests against an in-memory database.

Aaron Digulla
Thanks. This would mean having to manage my synonyms between the local and remote databases, i.e. when running the tests, point the synonyms locally, and then point them back remotely when the tests end. I guess this would work, but I was hoping there'd be a more transparent way.
A_M
Why do you need to have the link in the dev DB anyway? Just drop it and always do things locally. Or create a second link so you can check that both still point at a table with the same columns. But never run the tests over the link.
Aaron Digulla
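A sketch of that cross-check, assuming the tables now exist locally and a link named remotedb is kept purely for comparison (names illustrative):

    -- Any rows returned indicate drift between the local and remote definitions.
    select column_name, data_type, data_length
      from user_tab_columns@remotedb
     where table_name = 'B'
    minus
    select column_name, data_type, data_length
      from user_tab_columns
     where table_name = 'B';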
We need the link so that our applications are functional in our development environments - the remote database is an Oracle applications suite, and all our core user data is stored in there. This is very static and it is shared across developers.
A_M
See my edits. Don't run applications against the JUnit test database.
Aaron Digulla
If you have a local synonym and a table with the same name, Oracle will use the table (it searches the local schema for a table first, then checks for a local synonym, then a public synonym). So in theory you could temporarily create the local tables and the application would continue to work, then drop the tables again, leaving just the synonyms. It's a bit of a hack, but no more so than the database links, probably!
Stephen ODonnell
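A sketch of Stephen's trick, with one caveat: a table and a private synonym cannot share a name within the same schema, so this assumes the synonym for b is a public one (link and column names illustrative):

    -- Copy the structure from the remote table, but no rows.
    create table b as select * from b@remotedb where 1 = 0;
    insert into b (id, col) values (1, 'test row');

    -- ... run the tests: 'b' now resolves to the local table ...

    drop table b;  -- the public synonym for b@remotedb is visible again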
@Stephen: That deserves its own answer (instead of hiding it here as a comment).
Aaron Digulla
@Aaron: Yes, ideally I'd have a test database. But the database is an Oracle Applications database (Financials, HR, Procurement, etc) - and they are enormous out of the box (my DBA told me around 300GB). We've tried cutting the size of this down in the past, but not had much luck since the applications suite is highly integrated. We are getting to a state where we could perhaps do this, using more modern laptops and a virtual server running the DB, but that'd be a lot of expense.
A_M
@Stephen: Didn't realise that. Thanks for the tip.
A_M
A: 

I would suggest materialized views. These store a copy of the remote data locally and can be refreshed on a schedule.
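
A sketch, assuming a link named remotedb and that the existing synonym is replaced by the materialized view (the daily refresh interval is illustrative):

    -- Local, periodically refreshed copy of the remote table.
    create materialized view b
      build immediate
      refresh complete
      start with sysdate next sysdate + 1  -- full refresh once a day
      as select * from b@remotedb;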

northpole
+3  A: 

You should create exact copies of all the schemas you need from production on development, but without all the data. Then populate the schemas with just enough data to do a proper test.

You can also make the optimizer on the test system behave like production by exporting the statistics from the production server and importing them into the development database for the schemas you are duplicating. That way the query runs against the small data set you've made, but is optimized with plans similar to production's, so you can estimate theoretically how it will scale on production.
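
A sketch of the statistics transfer using DBMS_STATS (the schema name APPS and staging table STATS_EXPORT are illustrative):

    -- On production: stage the schema statistics in an exportable table.
    exec dbms_stats.create_stat_table(ownname => 'APPS', stattab => 'STATS_EXPORT')
    exec dbms_stats.export_schema_stats(ownname => 'APPS', stattab => 'STATS_EXPORT')

    -- Ship STATS_EXPORT to development (e.g. via export/import), then load it:
    exec dbms_stats.import_schema_stats(ownname => 'APPS', stattab => 'STATS_EXPORT')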

MichaelN
Creating the schemas locally is a non-trivial exercise. It's an Oracle Applications database (http://www.oracle.com/applications/home.html - financials, HR, recruitment, etc). All the schemas and tables are highly coupled - there are dozens of schemas and thousands of database objects. In reality, I'm only interested in about 5 to 10 tables from this database.
A_M
Can you not export just the 5 to 10 tables you are interested in, with no data, and import them into the dev DB? You can choose to include all referencing tables for a more complete set, or just the tables you want, with or without data. Create the user owning the tables ahead of time in the dev DB, assign an appropriate tablespace and quota, then import your export file to create the schema. Follow that by populating the tables with the appropriate data, and then import the stats.
MichaelN
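A plain-SQL way to follow that recipe for a handful of tables, pulling structure and a controlled slice of data straight over the link (the table FND_USER, the link, and the predicates are illustrative; STATS_EXPORT is the staging table from the sketch above):

    -- Structure only: the predicate is never true, so no rows are copied.
    create table fnd_user as
      select * from fnd_user@remotedb where 1 = 0;

    -- A small, controlled slice of data for the tests.
    insert into fnd_user
      select * from fnd_user@remotedb
       where user_name in ('TEST_USER_1', 'TEST_USER_2');

    -- Production statistics so the optimizer plans as it would live.
    exec dbms_stats.import_table_stats(ownname => user, tabname => 'FND_USER', stattab => 'STATS_EXPORT')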
A: 

Hi A_M,

In theory, to do the unit testing you can work with any set of controlled data, created and designed around your test cases. It doesn't have to be your live or development system - that's assuming your unit is portable enough. You would test against your current databases/application when you come to integration testing, which might as well happen on the live system anyway (so no DB links would be required - I understand your live databases are in one place).

What I'm trying to say is that you can/should test your unit (i.e. your component, query, or whatever you define as a unit) on a controlled set of data that simulates the different use cases, and once you've completed that testing with satisfactory results, proceed to integration and run the integration tests.

Integration tests - you could run these in the live environment, but only after you've proved by unit testing that your component is 'bullet-proof' (if that's OK with your company's approach/philosophy :) - sysadmin's reaction: "Are you flippin' crazy?!").

If you are trying to go back in time and test already-implemented units, then why bother? If they've been in production use for some time without any incidents, I would argue that they're OK. However, there's always a chance that your unit/query has some 'slowly ticking time bomb' side effect (a cumulative effect over time). In that case, analysing the impact is the answer.

Anyway, sorry if I'm being a smartass - that's not my intention. I hope this will be of some use to your work.

Have a good day! Damo

damo_inc
That's what we currently do. However, the controlled dataset is in a shared remote database, since historically that database has been too big to put in a local database. It's not just a matter of creating a table or two and populating them with data. It's an Oracle Applications database, i.e. a complicated 3rd party DB. The point about going back over old queries is that we then have a set of automated regression tests, e.g. you might change a query for non-functional reasons (e.g. performance). It would be good to know that you haven't broken it.
A_M