ansaurus

Question

How can I check that two relational databases are identical regardless of primary keys?

Answer 1

+5 A:

Is there an easy way to do this?

No. It's going to take programming work.

Andy Lester 2010-10-13 17:29:47

I can write the program for the task; I just hope there is a trick unknown to me or a preexisting tool :-)

danatel 2010-10-13 17:34:24

Answer 2

+3 A:

If the database is pretty simple, you can get a long way by running a command-line query that dumps all the data, consistently formatted and sorted and without the ids, and then comparing the output with diff.

e.g.

sqlite3 test.db 'CREATE TABLE Country (id  integer, name varchar(20))'
sqlite3 test.db 'CREATE TABLE Customer (id  integer, name varchar(20), country integer)'
sqlite3 test.db "insert into country values (1, 'USA')"
sqlite3 test.db "insert into country values (2, 'Belgium')"
sqlite3 test.db "insert into customer values (1, 'Joe', 1)"
sqlite3 test.db "insert into customer values (2, 'Peter', 2)"

sqlite3 test.db 'select cust.name, c.name from customer cust, country c where cust.country = c.id order by c.name, cust.name'

Peter|Belgium
Joe|USA

sqlite3 test.db 'select cust.name, c.name from customer cust, country c where cust.country = c.id order by c.name, cust.name' >db1.txt

Put the last query in a bash script, run it against both databases, and diff the two output files. That shows which customers differ, without any real programming.

Of course, this breaks down when the data model is more convoluted.
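
For illustration, the same dump-and-diff idea can be sketched in Python with the standard sqlite3 and difflib modules instead of bash and diff. This is only a sketch under assumptions: the helper names (make_db, dump, diff) and the in-memory sample databases are made up for the demo, and the schema is the two-table example above.

```python
import difflib
import sqlite3

# Same join as the query above: customer name plus country name, no ids.
DUMP_SQL = """
    SELECT cust.name, c.name
    FROM customer cust JOIN country c ON cust.country = c.id
    ORDER BY c.name, cust.name
"""

def dump(conn):
    """Return the id-free, sorted dump as 'customer|country' lines."""
    return ["|".join(row) for row in conn.execute(DUMP_SQL)]

def make_db(countries, customers):
    """Build a throwaway in-memory database (hypothetical demo helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE country (id integer, name varchar(20))")
    conn.execute(
        "CREATE TABLE customer (id integer, name varchar(20), country integer)"
    )
    conn.executemany("INSERT INTO country VALUES (?, ?)", countries)
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", customers)
    return conn

def diff(a, b):
    """Unified diff of the two dumps; an empty list means no difference."""
    return list(difflib.unified_diff(dump(a), dump(b), lineterm=""))

# Same data, different primary keys.
db1 = make_db([(1, "USA"), (2, "Belgium")], [(1, "Joe", 1), (2, "Peter", 2)])
db2 = make_db([(7, "Belgium"), (9, "USA")], [(40, "Peter", 7), (41, "Joe", 9)])
# Joe now belongs to Belgium: a real difference.
db3 = make_db([(1, "USA"), (2, "Belgium")], [(1, "Joe", 2), (2, "Peter", 2)])

print(diff(db1, db2))             # empty: only the ids differ
print("\n".join(diff(db1, db3)))  # shows the changed customer line
```

As with the shell version, this only compares what the hand-written join exposes, so every table and relationship needs its own query.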

Peter Tillemans 2010-10-13 17:39:24

In particular this won't check that the relationships are the same, which is probably the hardest requirement. He doesn't care about the primary keys but he does care that they are referenced in the same structure.

mpeters 2010-10-15 14:47:53

Answer 3

A:

I started to work on the problem myself. The general strategy is to write an SQL script which creates an "identity map" file. For each record, the map contains its primary key value (the "artificial identity") and a "natural identity" key: a string which uniquely identifies the record. For some tables in the database there is a unique natural key (like the country name in my example). Other tables require an ordinal number in a sequence, and still others combine their own identity with the identity of their parent (maybe recursively, if the parent also has a parent).

A second SQL script dumps all records to a second text file, in a format which makes the artificial identities recognizable.

Then a Perl script replaces all artificial identities in the second file with their natural identities from the map.

Finally, the result is sorted and diffed.
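
A minimal sketch of this strategy, in Python rather than SQL scripts plus Perl, and done in memory rather than through two text files. Everything here is an assumption made for illustration: the country/customer schema is borrowed from the other answer, the country name is taken to be unique, and the helper names (identity_map, normalized_dump, make_db) are hypothetical.

```python
import sqlite3

def identity_map(conn):
    """Map (table, primary key) -> natural identity string."""
    ids = {}
    # country: the name is assumed to be a unique natural key.
    for pk, name in conn.execute("SELECT id, name FROM country"):
        ids["country", pk] = f"country:{name}"
    # customer: own name combined with the identity of the parent country.
    # A dangling country reference would raise KeyError here.
    rows = conn.execute("SELECT id, name, country FROM customer")
    for pk, name, country in rows:
        ids["customer", pk] = f"customer:{name}@{ids['country', country]}"
    return ids

def normalized_dump(conn):
    """Dump every record with ids replaced by natural identities, sorted."""
    ids = identity_map(conn)
    lines = []
    for pk, name in conn.execute("SELECT id, name FROM country"):
        lines.append(f"country|{ids['country', pk]}|{name}")
    rows = conn.execute("SELECT id, name, country FROM customer")
    for pk, name, country in rows:
        lines.append(
            f"customer|{ids['customer', pk]}|{name}|{ids['country', country]}"
        )
    return sorted(lines)

def make_db(countries, customers):
    """Build a throwaway in-memory database (hypothetical demo helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE country (id integer, name varchar(20))")
    conn.execute(
        "CREATE TABLE customer (id integer, name varchar(20), country integer)"
    )
    conn.executemany("INSERT INTO country VALUES (?, ?)", countries)
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", customers)
    return conn

# Same data and relationships, different primary keys.
db1 = make_db([(1, "USA"), (2, "Belgium")], [(1, "Joe", 1), (2, "Peter", 2)])
db2 = make_db([(7, "Belgium"), (9, "USA")], [(40, "Peter", 7), (41, "Joe", 9)])
# Joe is attached to a different country: the relationship differs.
db3 = make_db([(1, "USA"), (2, "Belgium")], [(1, "Joe", 2), (2, "Peter", 2)])

print(normalized_dump(db1) == normalized_dump(db2))  # same content
print(normalized_dump(db1) == normalized_dump(db3))  # relationship changed
```

Because foreign keys are rewritten to the natural identity of the row they point at, a changed relationship shows up as a changed line even when the primary keys themselves are ignored. Tables without a natural key would need the ordinal-number rule described above, which this sketch does not cover.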

danatel 2010-10-26 05:13:19
