I have a relational database with about 100 tables. Each table has a unique numeric primary key with synthetic values, and many foreign keys link the tables. The tables are not big (tens or hundreds of records). This is a SQLite database.
For testing purposes, I need to compare two copies of the database with a Linux script (simple bash scripts, perl, diff and sed are available). I need to validate that both databases contain the same number of records and that the records have the same content, and to dump any differences. The problem is that the key values are allowed to differ as long as the relations are the same.
For example:
There is a table "country" with primary key "ix_country" and field "name", and a table "customer" with field "name", primary key "ix_customer" and foreign key "ix_country".
These two databases are equal. First database:
country: name="USA" ix_country=1; customer: name="Joe" ix_customer=10 ix_country=1
second database:
country: name="USA" ix_country=1771; customer: name="Joe" ix_customer=27 ix_country=1771
Both copies have the same structure.
Is there an easy way to do this?
Update:
One more requirement: the script must be robust against changes in the structure. It must keep working if a table or a field is added or deleted.
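One way to stay robust against structure changes is to read the structure from the database itself instead of hard-coding it. The sketch below assumes Python with its standard sqlite3 module is available (the question only lists bash, perl, diff and sed) and that the foreign keys are actually declared in the schema; the function name describe_schema and the demo tables are made up for illustration.

```python
import sqlite3

def describe_schema(conn):
    """Return {table: {"cols": [...], "pk": [...], "fks": {col: (parent, parent_col)}}}."""
    schema = {}
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in names:
        if table.startswith("sqlite_"):
            continue  # skip SQLite's internal tables
        # table_info rows: cid, name, type, notnull, dflt_value, pk
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # foreign_key_list rows: id, seq, table, from, to, on_update, on_delete, match
        fk_rows = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
        schema[table] = {
            "cols": [r[1] for r in info],
            # pk > 0 marks primary key columns; the value is the position in the key
            "pk": [r[1] for r in sorted(info, key=lambda r: r[5]) if r[5] > 0],
            "fks": {r[3]: (r[2], r[4]) for r in fk_rows},
        }
    return schema

# Demo on the example from the question (in-memory database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (ix_country INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE customer (ix_customer INTEGER PRIMARY KEY, name TEXT, "
    "ix_country INTEGER REFERENCES country(ix_country))")
schema = describe_schema(conn)
for table, desc in schema.items():
    print(table, desc)
```

If the foreign keys are not declared in the schema, PRAGMA foreign_key_list returns nothing and the relations would have to be derived some other way, for example from a naming convention such as the "ix_" prefix.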
Update 2:
I started to work on the problem myself. The general strategy is to write an SQL script which creates an "identity map" file. For each record, the map contains its primary key value (the "artificial identity") and a "natural identity" key: a string which uniquely identifies the record. For some tables in the database there is a unique natural key (like the country name in my example). Other tables need an ordinal number in a sequence, and still others combine their own identity with the identity of their parent (possibly recursively, if the parent also has a parent).
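To make the idea concrete, here is a minimal sketch of such an identity map for the country/customer example. It is written in Python rather than SQL, on the assumption that Python is available. The three dictionaries are hypothetical hand-written configuration, and only the "unique natural key" and "combine with the parent" cases are covered, not the ordinal-number case and not cyclic foreign keys.

```python
import sqlite3

# Hypothetical configuration for the example tables.
PRIMARY_KEYS = {"country": "ix_country", "customer": "ix_customer"}
NATURAL_KEYS = {"country": ["name"], "customer": ["name", "ix_country"]}
FOREIGN_KEYS = {("customer", "ix_country"): "country"}  # (table, column) -> parent

def natural_id(conn, table, pk_value, cache):
    """Return the natural identity string of one record, resolving parents recursively."""
    key = (table, pk_value)
    if key not in cache:
        cols = NATURAL_KEYS[table]
        col_list = ", ".join(f'"{c}"' for c in cols)
        row = conn.execute(
            f'SELECT {col_list} FROM "{table}" WHERE "{PRIMARY_KEYS[table]}" = ?',
            (pk_value,)).fetchone()
        parts = []
        for col, value in zip(cols, row):
            parent = FOREIGN_KEYS.get((table, col))
            if parent is not None and value is not None:
                # Replace the artificial key by the parent's natural identity.
                value = natural_id(conn, parent, value, cache)
            parts.append(f"{col}={value}")
        cache[key] = f"{table}({', '.join(parts)})"
    return cache[key]

def build_identity_map(conn):
    """Map (table, primary key value) -> natural identity for every record."""
    cache = {}
    for table, pk in PRIMARY_KEYS.items():
        for (pk_value,) in conn.execute(f'SELECT "{pk}" FROM "{table}"').fetchall():
            natural_id(conn, table, pk_value, cache)
    return cache

def make_db(country_key, customer_key):
    """Build the example database with the given artificial key values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE country (ix_country INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE customer (ix_customer INTEGER PRIMARY KEY, name TEXT, "
        "ix_country INTEGER REFERENCES country(ix_country))")
    conn.execute("INSERT INTO country VALUES (?, 'USA')", (country_key,))
    conn.execute("INSERT INTO customer VALUES (?, 'Joe', ?)", (customer_key, country_key))
    return conn

map_a = build_identity_map(make_db(1, 10))
map_b = build_identity_map(make_db(1771, 27))
# Different artificial keys, same natural identities.
print(sorted(map_a.values()) == sorted(map_b.values()))
```

The natural identity of a customer embeds the natural identity of its country, so two records compare equal only if their whole chain of parents is equal too.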
A second SQL script dumps all records to a second text file, in a format that marks the artificial identities.
Then a Perl script replaces all artificial identities in the second file with their natural identities from the map.
Then the result is sorted and diffed.
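The replace, sort and diff steps can be sketched as follows, again assuming Python is available in place of the Perl script. The marker format "@table:key@" for artificial identities and the sample dump lines are invented for illustration; the real dump format would be whatever the second SQL script produces.

```python
import difflib
import re

# Hypothetical marker for an artificial identity in the dump: @table:key@
TOKEN = re.compile(r"@(\w+):(\d+)@")

def normalize(dump_lines, identity_map):
    """Replace every artificial identity by its natural identity, then sort the lines."""
    def swap(match):
        # A reference missing from the map raises KeyError, which flags a dangling key.
        return identity_map[(match.group(1), int(match.group(2)))]
    return sorted(TOKEN.sub(swap, line) for line in dump_lines)

# First database from the question.
dump_a = ["customer|@customer:10@|Joe|@country:1@", "country|@country:1@|USA"]
map_a = {("country", 1): "country[USA]", ("customer", 10): "customer[Joe/country[USA]]"}

# Second database: same content, different key values.
dump_b = ["country|@country:1771@|USA", "customer|@customer:27@|Joe|@country:1771@"]
map_b = {("country", 1771): "country[USA]", ("customer", 27): "customer[Joe/country[USA]]"}

norm_a = normalize(dump_a, map_a)
norm_b = normalize(dump_b, map_b)

# unified_diff yields no lines at all when the two inputs are identical.
diff = list(difflib.unified_diff(norm_a, norm_b, "first", "second", lineterm=""))
print("equal" if not diff else "\n".join(diff))
```

Because the lines are sorted after substitution, the order in which the two databases return their rows does not affect the comparison.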