Every shop at which I've worked has had its own cobbled-together, haphazard, poorly understood, and poorly maintained method for updating production databases.

I've never seen a consistent method for doing this.

So, in the most recent versions of SQL Server, what is the best practice for deploying schema changes and migrating data from a development or test server to a production server?

Is there a third-party tool which handles this painlessly?

I'd imagine the ultimate tool would be able to:

  • detect schema changes between two DBs and generate the DDL to update one to match the other;
  • include the ability to run custom code that performs custom data migration steps;
  • allow versioning, so a v1 DB could be updated all the way to a v99 DB, running all scripts and migration steps in order (a minimal sketch of this versioning pattern follows the list).
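
To make that last point concrete, here's a minimal hand-rolled sketch (in T-SQL; the SchemaVersion table and all object names are invented for illustration) of the kind of versioning I have in mind: a version table records where the database is, and each numbered script is guarded so the whole series can be run in order from any starting version.

    -- Hypothetical version-tracking table, created once per database.
    IF OBJECT_ID('dbo.SchemaVersion') IS NULL
    BEGIN
        CREATE TABLE dbo.SchemaVersion
            (Version   int      NOT NULL PRIMARY KEY,
             AppliedOn datetime NOT NULL DEFAULT GETDATE());
        INSERT INTO dbo.SchemaVersion (Version) VALUES (1);  -- v1 baseline
    END
    GO

    -- Script 002: applies only when the database is at exactly version 1.
    IF EXISTS (SELECT * FROM dbo.SchemaVersion WHERE Version = 1)
       AND NOT EXISTS (SELECT * FROM dbo.SchemaVersion WHERE Version = 2)
    BEGIN
        ALTER TABLE dbo.Customer ADD Email varchar(100) NULL;  -- schema change
        -- custom data-migration steps for v2 would go here
        INSERT INTO dbo.SchemaVersion (Version) VALUES (2);
    END
    GO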
+2  A: 

The three things I've used are:

For schemas

Visual Studio Database Projects. Meh. They are okay, but you still have to do a lot of the work yourself.

Red Gate's SQL Compare and the entire SQL Toolbelt. They've worked pretty hard to make this something you can version control. In practice I've found that with databases you are usually trying to get from point A in the version timeline to point B. With binaries, you often just clobber whatever is there with point B (an oversimplification, I know, but often true).

http://www.red-gate.com/

xSQL is a good place to start if your system is small and perhaps will remain small:

http://www.xsqlsoftware.com/LiteEdition.aspx

I don't work for, know anyone who works for, or get any money from these people. I'm just telling you what I've done in the past.

For data

Red Gate has SQL Data Compare.

However, if you want something "free" (or included with SQL Server), I've actually had a lot of success just using BCP and writing a small system that injects and extracts data. Generally when I find myself doing this I ask myself, "Why? If I am changing data, does that mean I am really changing something that is configuration? Can I use a different method here?" But sometimes you can't (maybe it's a legacy system where the original devs thought databases were for everything).

The problem with BCP extracts is that they don't version control very well. There are tricks I've used, like extracting in character mode and stuffing an ORDER BY into the extract query, to try to pull rows out in an order that makes them somewhat more palatable for version control.
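
For example (the server, database, and table names below are placeholders), a character-mode extract with a stable row order, and the matching load on another instance, look something like this:

    rem Character-mode (-c) extract with a deterministic ORDER BY so the file diffs cleanly:
    bcp "SELECT Code, Description FROM MyDb.dbo.StatusCode ORDER BY Code" queryout StatusCode.dat -c -S DEVSERVER -T

    rem Load the same file into another instance:
    bcp MyDb.dbo.StatusCode in StatusCode.dat -c -S TESTSERVER -T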

Chris Gomez
+1  A: 

For small projects I have used Red Gate to manage schema and data migrations with a lot of success. It's very easy to use and works for most cases.

For larger enterprise systems, schema and data changes are normally saved as SQL scripts in text files and run in sequence. We also include a rollback script to run in case something goes wrong during the migration. Run the scripts on a UAT server, then on the test/staging/pre-production server, then on production. Saving a copy of all these files plus their rollback scripts should allow you to move between multiple versions of a DB.
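
As a rough sketch of the pattern (the script numbers and object names here are invented), each numbered upgrade script is saved next to a matching rollback script:

    -- 023_add_customer_email.sql (upgrade)
    BEGIN TRANSACTION;
    GO
    ALTER TABLE dbo.Customer ADD Email varchar(100) NULL;
    GO
    UPDATE dbo.Customer SET Email = '';  -- example data-migration step
    GO
    COMMIT TRANSACTION;
    GO

    -- 023_add_customer_email_rollback.sql (run if the migration goes wrong)
    ALTER TABLE dbo.Customer DROP COLUMN Email;
    GO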

There is also http://code.google.com/p/migratordotnet/ if you're using .NET; it allows you to define these scripts in code (see the sketch below). Very useful if you want to deploy across multiple DBs in an automated way, and it makes it easy to say "set my DB to version 23" or "revert my DB to version 5", etc. It works for schema and data, but I would only really use it for a few rows of data.
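
Roughly, a migration looks like this (a minimal C# sketch based on MigratorDotNet's attribute-driven API; the version number and table/column names are invented): each class carries a version in its attribute and an Up/Down pair, and the runner applies or reverts them in order.

    using System.Data;
    using Migrator.Framework;

    // The attribute's version number determines the order migrations run in.
    [Migration(23)]
    public class AddCustomerEmail : Migration
    {
        public override void Up()
        {
            // Applied when upgrading to version 23 or later.
            Database.AddColumn("Customer", new Column("Email", DbType.String, 100));
        }

        public override void Down()
        {
            // Applied when reverting to a version below 23.
            Database.RemoveColumn("Customer", "Email");
        }
    }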

Daveo
MigratorDotNet is the most advanced solution to this problem I've seen so far, so I will mark this answer as accepted. Even though the other people who said "it depends on your use case" are entirely correct, this tool seems like it would fit a lot of use cases, except the most demanding.
Clever Human
+2  A: 

First, you have to consider that the requirements vary a lot between scenarios:

  1. Customers purchase v1 of the product at Costco and install it in their home office or small business. When v2 comes out, the customer purchases a box of the product and installs it on a new computer, exports the data from the v1 installation, and imports it into the v2 installation. Even though behind the scenes both v1 and v2 use a SQL Express instance, there is no supported in-place upgrade. Schema changes on the deployed databases are not expected (hidden database, non-technical user) and definitely not supported. The only 'upgrade' path supported is an explicit export/import, which probably uses an XML file or something similar.

  2. A business purchases v1 of the product with a support contract. It installs it on its departmental SQL Server instance, from where the data is accessed by the purchased product and by many other integration services, reports, etc. When v2 is released, the customer runs the prescribed upgrade procedure; if it runs into problems, it calls the product vendor's customer support line, which walks the customer through steps specific to its deployment. Database schema customizations are expected and often supported, including in upgrade scenarios, but the schema changes are made by the customer (and so are not known at v2 design time).

  3. A web startup has a database that backs its site. Developers make changes on their personal instances and check in those changes. An automated build with continuous integration picks up the changes, deploys them against a test instance, and runs build validation tests. The main branch build can, at any moment, be deployed into production. Production is the one database that backs the site. The structure of the production database is documented and understood 100%; every single change to the production database schema occurs through the build system and QA process. On a side note, this is the scenario most SO users asking your question have in mind, minus the part about '100% documented and understood'. I give the example of a database backing a web site, but the deployment can really be anything. The gist of it is that there is only one production database (it may include HA/DR copies, and it may consist of multiple actual SQL Server databases), and it is the only database that has to be upgraded.

  4. A successful web startup. Same as above, but the production database has 5 TB of data, and 5 minutes of downtime makes the CNN headlines. Schema changes may involve setting up replicas and copying the data into new schemas with continuous updates, followed by an online switch of operations to the replica. Schema changes are designed by MCM-level experts, and deploying a schema change can be a multi-week process.

I could go on with more scenarios. The point is that the requirements of each of these cases are so vastly different that no single 'state of the art' can answer all of them. Some scenarios will be perfectly OK with a schema-diff deployment tool like vsdbcmd or SQL Compare. Other scenarios will be much better served by explicit versioning scripts. Others might have such specific requirements (e.g. zero downtime) that each upgrade is a project of its own and has to be specifically custom tailored.

One thing is clear across all scenarios, though: if your shop treats the development database MDF file as 'source' and makes changes to it using the management tools, that is always a major #fail. All changes should be captured explicitly as some sort of source control artifact, which is why I most favor explicit version scripts, as in Version Control and your Database. But I reckon that the VSDB project's support for compile-time schema validation and its ease of refactoring schema objects make a pretty powerful proposition, and VSDB schema compare deployment may be OK.

Another important approach that has to be addressed is code-first schema modeling from tools like EF or LINQ to SQL. It works brilliantly for deploying v1, but fails miserably at any subsequent version. I strongly discourage these approaches.

But to sum up and answer in brief: as of today, the state of the art sucks.

Remus Rusanu
Great summary, Remus. And it seems you are right, the state of the art *does* suck. With the exception of your last (number 4) list item, I'd think the other scenarios could be handled with the same mechanism (if one existed).
Clever Human
+1  A: 

At Red Gate we'd recommend one of two approaches, depending on your requirements and how formal you need your processes to be. If you have a development database and simply want to push changes to production, SQL Compare is the tool for the job. A level of versioning can be achieved by using schema snapshots.

However, if you want full source control benefits, such as team collaboration, sandboxed environments, an audit trail, compliance, history, rollback, etc., you should consider SQL Source Control. This links development databases to Team Foundation Server or Subversion.

David Atkinson
On the Red Gate site, I have not seen where SQL Compare is able to handle scenarios more complex than adding or removing fields from a table. There wasn't any mention of being able to handle normalization scenarios (moving a column in a table to its own separate table and joining back to the original) or of handling the data migration inherent in that. Nor have I seen where you could add custom code to handle something more complex than what T-SQL could do. (That said, I have heard very good things about Red Gate; the tools just don't seem to fit here.)
Clever Human
You're right that this isn't in SQL Compare. The normalization feature is Split Table, and it's in SQL Refactor (http://www.red-gate.com/products/sql_refactor/index.htm). However, these refactorings have been moved to the upcoming SQL Prompt 5, which is currently in early access (http://www.red-gate.com/MessageBoard/viewtopic.php?t=11980). This will generate a script to normalize, and it will also preserve your data. Please let me know if this doesn't meet your requirements!
David Atkinson