We have a fairly straightforward client-server system - a Windows app pointing to a database (SQL Server or Oracle).
A separate database is set up for each customer and the customers are given several licenses of the Windows application to install on their computers.
Over the years, small customizations and fixes have been applied to one or another customer database. The fixes have not necessarily been propagated to the rest of the customer databases.
After five years of this, the group I am working with is now running into a large number of problems resulting from dropped primary keys, unenforced foreign keys, differing column types and widths, indexes that may or may not be clustered, views that return different datasets, and stored procedures that behave differently.
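To make the kind of drift concrete, here is a minimal sketch of comparing column definitions between two databases. It assumes each database's columns have already been pulled into a dictionary (for example from `INFORMATION_SCHEMA.COLUMNS` on SQL Server or `ALL_TAB_COLUMNS` on Oracle). The snapshot format, the `diff_columns` helper, and the sample tables are all made up for illustration, not taken from our actual system.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot format: (table, column) -> (data type, width)
Schema = Dict[Tuple[str, str], Tuple[str, int]]


def diff_columns(first: Schema, second: Schema) -> List[str]:
    """Return human-readable differences between two column snapshots."""
    report = []
    for key in sorted(set(first) | set(second)):
        name = ".".join(key)
        if key not in second:
            report.append(f"{name}: missing in second")
        elif key not in first:
            report.append(f"{name}: missing in first")
        elif first[key] != second[key]:
            report.append(f"{name}: {first[key]} vs {second[key]}")
    return report


# Made-up sample data standing in for two customer databases
acme: Schema = {
    ("Orders", "Id"): ("int", 4),
    ("Orders", "Note"): ("varchar", 50),
}
contoso: Schema = {
    ("Orders", "Id"): ("int", 4),
    ("Orders", "Note"): ("varchar", 255),
    ("Orders", "Region"): ("char", 2),
}

differences = diff_columns(acme, contoso)
for line in differences:
    print(line)
```

The same idea would have to be extended to keys, indexes, views, and stored procedures, which this sketch does not attempt.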
Since there is no "base-line" database, no one on the development team has a very high degree of confidence that they can safely code against the database. For example, Bill may be coding against the ACME database while Mary is coding against the CONTOSO database. The ACME and CONTOSO databases, while very similar, have subtle differences, and neither Bill nor Mary is quite sure which variant to rely on.
Multiply this by 75 customer databases.
There is a versioning system in place, in that each DB has a version field. However, ALL 75 of the databases have the exact same version, so the field says nothing about which variant a given database actually is.
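Because the version field cannot tell the databases apart, one option I can imagine is fingerprinting each schema and grouping databases whose fingerprints match. The following is only a sketch under assumptions: the `(table, column, type, width)` row format, the `schema_fingerprint` and `group_by_fingerprint` helpers, and the sample rows are hypothetical, and a real fingerprint would also need to cover keys, indexes, views, and procedures.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical row format: (table, column, data type, width)
Row = Tuple[str, str, str, int]


def schema_fingerprint(columns: List[Row]) -> str:
    """Hash the column rows in a canonical (lowercased, sorted) order."""
    lines = sorted("|".join(str(part).lower() for part in row) for row in columns)
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def group_by_fingerprint(databases: Dict[str, List[Row]]) -> List[List[str]]:
    """Group database names that share an identical schema fingerprint."""
    buckets = defaultdict(list)
    for name, columns in databases.items():
        buckets[schema_fingerprint(columns)].append(name)
    return sorted(sorted(names) for names in buckets.values())


# Made-up sample data; DEMO matches ACME apart from row order
databases = {
    "ACME": [("Orders", "Id", "int", 4), ("Orders", "Note", "varchar", 50)],
    "CONTOSO": [("Orders", "Id", "int", 4), ("Orders", "Note", "varchar", 255)],
    "DEMO": [("Orders", "Note", "varchar", 50), ("Orders", "Id", "int", 4)],
}

groups = group_by_fingerprint(databases)
print(groups)
```

The number of groups would give a rough count of how many distinct variants really exist among the customer databases, which may be far fewer than 75.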
Now I have been given the unenviable task of reconciling these 75 databases and coming up with a "gold standard" or "baseline" database which can be used as a template for developers to develop against. Additionally, the necessary scripts will be written so that, when applied to a customer database, they bring it up to date and into agreement with the baseline.
I have identified two "TRAINING" database templates which our release guy uses when the training guy begins a new training session. These are fairly close, and I have already created the scripts to bring them into alignment.
There is a "DEMO" database, which the development team more or less agrees is a good candidate for a base-line, as it is the one database that gets the most exercise. However, there are quite a few deltas between the DEMO and TRAINING dbs.
So, my question is...
How would YOU go about reconciling the differences between all of these databases into a single "baseline", "gold standard", "template", etc.?