Just a question about best practices when upgrading an existing database, assuming there will be all kinds of modifications to the data itself, the structure, the relations, additional columns, disappearing columns and whatever more.

My problem is a simple one. I'm working on a project that will use SQL Server. No problem there, since I'm enough of an expert to handle this. But this project will be upgraded later on, and I need to specify a protocol for the upgrade mechanism. Basically, these are the rules to follow when creating upgrade scripts...

Right now, I have these simple steps (sketched in the script below):

  1. Add the new columns to the tables.
  2. Add constraints to the new columns.
  3. Add new tables.
  4. Drop constraints where needed.
  5. Drop columns that need to be removed.
  6. Drop tables that need to be removed.
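
To make the ordering concrete, here is a minimal sketch of what a script following these steps might look like. All table, column, and constraint names are made up for illustration:

    -- Hypothetical upgrade script following steps 1-6 above;
    -- all object names are illustrative.

    -- 1. Add the new columns
    ALTER TABLE dbo.Customer ADD PhoneNumber varchar(20) NULL;

    -- 2. Add constraints to the new columns
    ALTER TABLE dbo.Customer
        ADD CONSTRAINT DF_Customer_PhoneNumber DEFAULT ('') FOR PhoneNumber;

    -- 3. Add new tables
    CREATE TABLE dbo.CustomerNote (
        NoteID     int IDENTITY(1,1) PRIMARY KEY,
        CustomerID int NOT NULL REFERENCES dbo.Customer (CustomerID),
        NoteText   nvarchar(max) NOT NULL
    );

    -- 4. Drop constraints where needed
    ALTER TABLE dbo.Customer DROP CONSTRAINT DF_Customer_LegacyFlag;

    -- 5. Drop columns that need to be removed
    ALTER TABLE dbo.Customer DROP COLUMN LegacyFlag;

    -- 6. Drop tables that need to be removed
    DROP TABLE dbo.ObsoleteLookup;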

Somehow, this list feels incomplete. Is there a more extended list somewhere describing the proper steps that need to be followed during an upgrade?

Also, is it always possible to do a complete upgrade within a single database transaction (with SQL Server), or are there breakpoints that need to be included in the protocol, where one transaction should end and another begin?


While automated tools would provide a nice, automated solution, I still can't really use them. The development team working on this system has 4 developers, each with their own database on their local system. Every developer keeps track of their own changes to the structure by generating both an Upgrade and a Downgrade script for their modifications, covering both structural changes and data changes. These scripts can then be used by the other developers to keep their own systems up to date. Whenever the system is going to be released, those scripts are all merged into one big script.
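
For illustration, the Downgrade counterpart to an upgrade like the sketch above simply reverses each step in the opposite order (again, all names are hypothetical). Note that a downgrade can restore structure, but not data that was dropped along the way:

    -- Hypothetical Downgrade counterpart: reverse each step in the
    -- opposite order (names match the upgrade sketch above).

    -- Recreate what the upgrade dropped...
    CREATE TABLE dbo.ObsoleteLookup (
        LookupID int PRIMARY KEY,
        Label    nvarchar(50) NOT NULL
    );
    ALTER TABLE dbo.Customer ADD LegacyFlag bit NULL;
    ALTER TABLE dbo.Customer
        ADD CONSTRAINT DF_Customer_LegacyFlag DEFAULT (0) FOR LegacyFlag;

    -- ...and remove what the upgrade added.
    DROP TABLE dbo.CustomerNote;
    ALTER TABLE dbo.Customer DROP CONSTRAINT DF_Customer_PhoneNumber;
    ALTER TABLE dbo.Customer DROP COLUMN PhoneNumber;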

The system does not include any stored procedures or other "special" features. The database is just that: a data store with just tables and the relations between them. No roles, no users, no stored procedures, no triggers, no complex datatypes...


The DB is used by an application whose users work 9-to-5, so shutting down for an upgrade is easy, including upgrades for the clients. We also add a version number to the database, and applications check that they're linked to the correct database version.
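
Since the version number lives in the database itself (as an extended property, as noted in the comments below), it can be maintained with the standard extended-property procedures. A minimal sketch; the property name 'SchemaVersion' and the version values are made up:

    -- Store the schema version as a database-level extended property.
    -- The property name 'SchemaVersion' and the values are illustrative.
    EXEC sys.sp_addextendedproperty
         @name = N'SchemaVersion', @value = N'2.1.0';

    -- Later upgrade scripts bump it:
    EXEC sys.sp_updateextendedproperty
         @name = N'SchemaVersion', @value = N'2.2.0';

    -- The application reads it at start-up (class 0 = the database itself):
    SELECT value
    FROM sys.extended_properties
    WHERE class = 0 AND name = N'SchemaVersion';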

During development, all developers use their own database instance, which they can fully control. Since we're not the ones who use the application, we tend to develop for the Express edition, not a more expensive one. To be honest, we don't develop our application to support a lot of users, but we'll inform our users that, since it uses SQL Server, they can install the system on a bigger SQL Server platform according to their own needs. They will need their own DBA for this, though. We do have a bigger SQL Server available for ourselves, which we also use for our own web interface, but this server is located in an external data center, where it is maintained for us, not by us.

The project previously used MS Access for its data storage and was intended for single-user use, but as it turned out, many users still decided to share their databases, and this showed that the data model itself is reliable enough for multi-user environments. So we migrated to SQL Server to support smaller offices with 3 or more users and some big organisations that will have 500 or more users at the same time.

Since we need to keep the cost of the software low, we don't have a big budget to spend on expensive tools or a more expensive server.

+1  A: 

In my opinion, it's an absolute bear doing these manually. For Microsoft SQL Server, I'd recommend using the Database edition of Team System, since it includes complete source control capabilities for your database and can automatically build your scripts for upgrading/downgrading versions.

Another option is SQL Compare from Redgate, which can also handle these kinds of upgrades/downgrades and will produce a very nice SQL script. I've used both, and keeping the historic scripts has helped us troubleshoot issues and resolve many a mystery.

If you are working with a manual script as above, don't forget to also account for stored procedure changes in your scripts. Also, any hand-edited script should be able to be executed multiple times on a database - i.e. if your script includes a table creation or drop, be sure to check for existence first; otherwise your script will fail if executed back to back.
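
For example, each create or drop can be guarded so a second run is harmless. A sketch of the usual pattern (object names are illustrative):

    -- Guarded create: only create the table if it isn't there yet.
    IF OBJECT_ID(N'dbo.CustomerNote', N'U') IS NULL
    BEGIN
        CREATE TABLE dbo.CustomerNote (
            NoteID   int IDENTITY(1,1) PRIMARY KEY,
            NoteText nvarchar(max) NOT NULL
        );
    END;

    -- Guarded drop: only drop the column if it still exists.
    IF COL_LENGTH(N'dbo.Customer', N'LegacyFlag') IS NOT NULL
        ALTER TABLE dbo.Customer DROP COLUMN LegacyFlag;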

Again, while it's possible to build a manual protocol, I'd still fall back on using one of the purpose-built tools out there; both Team System and SQL Compare can output scripts that you could include as part of an installation/upgrade package.

Bob Palmer
Unfortunately, the project requires the developers to be a bit dynamic in making structural changes to the database. Thus a lot of upgrade scripts are generated, which are collected and then refactored before the final build. Since structural changes depend heavily on code changes, we prefer to provide every developer with their own database, which they maintain themselves.
Workshop Alex
Hey Alex, the Database edition of Team System should still solve that issue. You can have multiple disparate databases built against a central schema, and by building it into source control it can help in solving any contention problems (it's definitely worth a look-see). If you're in a position where you cannot get these kinds of tools, let me know and I can talk about some less ideal but workable solutions I've used in that situation (with several coders working on their own local database changes).
Bob Palmer
Unfortunately, the development team can only work with the Express or Developer edition. And although 2 developers also do .NET development in VS2008, the project itself is developed in Delphi 2007/Win32 and uses the raw ADO API to connect to the database. (Thus not even the ADO controls of Delphi itself!)
Workshop Alex
Ok - and any reason why they are all working on different servers with their own schema updates? I.e., could they not coordinate off of a single development server? At this point your best bet would be to have a version-controlled final script in source control that they all work with - i.e. make changes locally, verify, then modify the shared script, which can then be rolled out as part of a build and have your unit tests hit it. It's a bear, but doable.
Bob Palmer
+1  A: 

With database updates I always believe it should be all or nothing. If any of the DB updates fail, your application will be left in an unknown state that could be harmful to the data, so I think it is best practice to either apply them all or none (one transaction around them all).
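
A minimal sketch of that all-or-nothing pattern in T-SQL. (Most DDL in SQL Server is transactional, though a few statements such as CREATE/ALTER DATABASE, BACKUP, and full-text operations cannot run inside a user transaction, which is where the "breakpoints" from the question would come in.)

    -- Run the whole upgrade as one transaction:
    -- any error rolls everything back.
    SET XACT_ABORT ON;
    BEGIN TRY
        BEGIN TRANSACTION;

        -- ...all upgrade statements go here (illustrative example)...
        ALTER TABLE dbo.Customer ADD PhoneNumber varchar(20) NULL;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;

        -- Re-raise so the caller sees the failure.
        DECLARE @msg nvarchar(2048);
        SET @msg = ERROR_MESSAGE();
        RAISERROR(@msg, 16, 1);
    END CATCH;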

I also like to back up the database before applying updates, so that if anything does go wrong the database can be rolled back (this has saved me numerous times when working with live data).
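
That backup can itself be scripted as the first step of the rollout; a minimal example (database name and path are made up):

    -- Full backup before any upgrade script runs
    -- (database name and path are illustrative).
    BACKUP DATABASE MyAppDb
    TO DISK = N'C:\Backups\MyAppDb_PreUpgrade.bak'
    WITH INIT, NAME = N'Pre-upgrade full backup';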

Hope this helps.

Burt
+3  A: 

Check out Red-Gate's SQL Compare (structure comparison), SQL Data Compare (data comparison), and SQL Packager (for packaging up update scripts into a C# project or a .NET executable).

They provide a nice, clean, fully functional and easy-to-use solution for all your database upgrade needs. They're well worth their license fees - they pay for themselves in a few weeks or months.

Highly recommended!

marc_s
+1  A: 

Best practices for upgrading a production database schema actually look pretty bad on the surface. Unless you can completely shut down your system for the upgrade, which is often not possible, your changes all need to be backwards compatible. If you have many clients accessing the database, you can't update them all simultaneously, so any schema changes you make need to allow old code to run.

That means never renaming a column, and making all new columns nullable. This doesn't mean you leave it like that forever. You write two scripts: one for the initial change, which is backwards compatible, and another to clean things up after all clients have been updated.
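
As an illustration (with hypothetical names), "renaming" a column this way becomes two scripts: the first runs while old clients are still live, the second only after every client has been updated:

    -- Script 1 (backwards compatible): add the new column alongside
    -- the old one. Old clients keep using dbo.Customer.Phone.
    ALTER TABLE dbo.Customer ADD PhoneNumber varchar(20) NULL;
    GO
    -- Backfill (the GO above makes the new column visible to this batch).
    UPDATE dbo.Customer SET PhoneNumber = Phone WHERE PhoneNumber IS NULL;
    GO

    -- Script 2 (cleanup, run only after all clients are upgraded):
    ALTER TABLE dbo.Customer ALTER COLUMN PhoneNumber varchar(20) NOT NULL;
    ALTER TABLE dbo.Customer DROP COLUMN Phone;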

Automated tools are great for validating schemas, but they are not so good when it comes to actually modifying a complex system. You should break your changes up into many small, discrete change scripts so each can be run manually. If there's a failure, it's easier to pinpoint the cause and fix it. Basically, each feature gets its own script. Give each a unique name, and then store that name in the database itself when you run the script, so you can query the database to find out what's been run and what hasn't. This is invaluable when you have instances on developers' machines, test servers, production, etc.
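
A sketch of that bookkeeping (the table and script names are made up): a small log table records every script that has been applied, and each script checks the log before doing anything:

    -- One-time setup: a log of every change script already applied.
    IF OBJECT_ID(N'dbo.SchemaChangeLog', N'U') IS NULL
        CREATE TABLE dbo.SchemaChangeLog (
            ScriptName nvarchar(200) NOT NULL PRIMARY KEY,
            AppliedOn  datetime      NOT NULL DEFAULT GETDATE()
        );
    GO

    -- Pattern inside each change script:
    IF NOT EXISTS (SELECT 1 FROM dbo.SchemaChangeLog
                   WHERE ScriptName = N'0042_add_customer_phone')
    BEGIN
        -- ...the actual change goes here...
        ALTER TABLE dbo.Customer ADD PhoneNumber varchar(20) NULL;

        INSERT INTO dbo.SchemaChangeLog (ScriptName)
        VALUES (N'0042_add_customer_phone');
    END;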

Eric Z Beard
In this case, the server can be shut down during the upgrade. That's because the upgraded software will not be backwards compatible with older DB versions. This actually makes the use of automated tools much harder for us. The clients check the database version on start-up and will ask to be upgraded when the database is a different version; they won't work unless upgraded. (The version number is an extended property of the database we use.)
Workshop Alex
Ok, if you can shut things down that simplifies the process. I would still recommend leaving old columns and tables around for at least one iteration just in case you find a problem with migration scripts - it's never fun to drop a column and then realize you missed some data, especially when some time has gone by and it's not practical to restore a backup.
Eric Z Beard