Background

I've come from several years working in a company where all the database objects were stored in source control, one file per object. We maintained a list of all the objects, updated whenever new items were added (so that the scripts could be run in order and dependencies handled), and a VB script that concatenated them into one big script to run against the database.

All the tables were 'create if not exists' and all the SPs etc. were drop-and-recreate.
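For reference, that pattern looked roughly like this in T-SQL (the table and procedure names here are made up for illustration):

    -- Tables: create only if they don't already exist
    IF NOT EXISTS (SELECT 1 FROM sys.tables WHERE name = 'Customer' AND schema_id = SCHEMA_ID('dbo'))
    BEGIN
        CREATE TABLE dbo.Customer
        (
            CustomerId INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
            Name       NVARCHAR(100)     NOT NULL
        );
    END
    GO

    -- Stored procedures: drop and recreate every time
    IF OBJECT_ID('dbo.GetCustomer', 'P') IS NOT NULL
        DROP PROCEDURE dbo.GetCustomer;
    GO

    CREATE PROCEDURE dbo.GetCustomer
        @CustomerId INT
    AS
        SELECT CustomerId, Name
        FROM dbo.Customer
        WHERE CustomerId = @CustomerId;
    GO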

Fast forward to the present: I'm now working in a place where the database is the master and there is no source control for DB objects, but we do use Red Gate's SQL Compare for updating our production database, which is very handy and requires little work.

Question

How do you handle your DB objects? I like to have them under source control (and, as we're using Git, I'd like to be able to handle merge conflicts in the scripts rather than in the DB), but I'm going to be hard pressed to get past the ease of using SQL Compare to update the database.

I don't really want us updating scripts in Git and then using SQL Compare to update the production database from our DEV DB, as I'd rather have 'one version of the truth', but I don't really want to get into writing a custom bit of software to bundle the whole lot of scripts together.

I think that Visual Studio Database Edition may do something similar to this, but I'm not sure whether we will have the budget for it.

I'm sure that this has been asked to death, but I can't find anything that seems to quite have the answer I'm looking for. Similar to this, but not quite the same:

http://stackoverflow.com/questions/340614/what-are-the-best-practices-for-database-scripts-under-code-control


Started a bounty, as I'm interested in canvassing for a few more opinions - the answers here are sound, but I feel that there should really be an easier way.

Thanks for all the great answers - all have their merits, so I'm going to take the highest vote, but cheers for all the input.

+5  A: 

We have all our database objects under source control using Visual Studio Database Edition (DBPro). It is a wonderful tool that version controls our schema, does builds, validations, allows code analysis, schema comparisons, deployments, data comparisons, refactoring etc. It was designed from the ground up to be a DB management and version control system. Highly recommended.

This is the blog site of the lead architect for DBPro: http://blogs.msdn.com/gertd/archive/2008/11/25/visual-studio-team-system-2008-database-edition-gdr-rtm.aspx

Randy Minder
I must take another look at this and test the water for budget for licenses...
Paddy
@Paddy - DBPro is free if you are currently using Visual Studio Team Suite or Visual Studio Developer Edition.
Randy Minder
A: 

I currently maintain a database design in a modelling tool (DeZine for Databases) and store that under source control. Within my table design I add a table with two rows holding the version number of the schema and of the reference data; this is updated each time the database is changed/released (users do not access this table).
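A minimal sketch of what such a version table might look like (the names and values here are my own illustration, not taken from the answer):

    -- One row holds the schema version, the other the reference data version
    CREATE TABLE dbo.DatabaseVersion
    (
        Component NVARCHAR(32) NOT NULL PRIMARY KEY,  -- 'Schema' or 'ReferenceData'
        Version   NVARCHAR(16) NOT NULL
    );

    INSERT INTO dbo.DatabaseVersion (Component, Version) VALUES ('Schema', '2.3.0');
    INSERT INTO dbo.DatabaseVersion (Component, Version) VALUES ('ReferenceData', '2.3.0');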

Reference data is maintained in an Excel spreadsheet (also under source control) which can generate a SQL script of INSERT statements to populate new databases.

When a new release is required the schema script, reference data script and an installer package are sent out. The installer package renames the old database, creates a new database from the script and imports the new reference data (which may also have changed). The user's data is then copied from the old (renamed) database to the new one.

This has the advantage that when things go wrong you can revert to the original database as it has not been modified.
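A rough T-SQL illustration of that upgrade sequence, assuming a hypothetical MyApp database and Customer table (the actual installer package does the scripting and renaming):

    -- 1. Rename the existing database out of the way (it is kept as a fallback)
    ALTER DATABASE MyApp MODIFY NAME = MyApp_Old;
    GO

    -- 2. The installer then creates a fresh MyApp database from the schema script
    --    and runs the generated reference-data INSERT script against it.

    -- 3. Copy the users' data across, transforming it where the structure has changed
    --    (if CustomerId is an identity column, SET IDENTITY_INSERT is also needed)
    INSERT INTO MyApp.dbo.Customer (CustomerId, Name)
    SELECT CustomerId, Name
    FROM MyApp_Old.dbo.Customer;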

Tony
This is very similar to what I had previously done (apart from the DB rename and migration). Is it not just quicker to back up the DB and apply changes to the existing one, rather than maintain all the data migration scripting?
Paddy
Yes, it would be faster to do that but sometimes the structure of the database changes in such a way that your existing data cannot "fit", so the action of porting it from one database to another allows for transformations to happen. Now I realise that databases do not always have dramatic changes with each release but it's good to have one upgrade process that can account for all situations. YMMV
Tony
+1  A: 

We have a system where the database is nominally the master. Inside our source control system we maintain a sequence of "schema change" scripts (.sql files), each of which is responsible for idempotently rolling back its change and then applying it. Each script is simply numbered, so we have 000.sql (which creates the database and sets up standard objects), 001.sql and so on.

During development, a developer writes a schema change script and runs it against the development database. Each change is required to add a row to a dba.change_info table, containing the change number and a brief description. To roll back a change, one can just run the first section of it. For SQL Server, the idempotence of the rollback section is handled by examining sysobjects etc. before issuing DROP commands, similar to "drop ... if exists" constructs. Schema changes may need to migrate data if a model is being changed rather than simply added to, and they are also used to maintain reference data.
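As a rough sketch, one of those numbered change scripts might look like this (the change number, the table and the column names on dba.change_info are invented for illustration):

    -- 042.sql: rollback section first, written so it can be re-run safely
    IF OBJECT_ID('dbo.OrderStatus', 'U') IS NOT NULL
        DROP TABLE dbo.OrderStatus;
    DELETE FROM dba.change_info WHERE change_number = 42;
    GO

    -- apply section
    CREATE TABLE dbo.OrderStatus
    (
        StatusId INT          NOT NULL PRIMARY KEY,
        Name     NVARCHAR(50) NOT NULL
    );

    INSERT INTO dba.change_info (change_number, description)
    VALUES (42, 'Add OrderStatus reference table');
    GO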

During the release process, a DBA (we're a small company, so this is a role taken on by one of the developers anyway) applies the schema changes for the release to the production database between stopping the old version of the applications and starting the updated ones.

This is all quite a manual process, but it satisfies requirements such as migrating data from one model to another: e.g. expanding a boolean flag into a set of options, or converting a many-to-one association to a many-to-many. This typically isn't something that can be generated with simple schema-comparison tools anyway. It also allows for role separation: although in practice we all have full access to production, there is enough decoupling there that the "DBA" can read and review the .sql files to be applied to production.
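For instance, expanding a boolean flag into a set of options might involve a data-migration step along these lines (the table and column names are invented):

    -- Replace an IsArchived bit with a multi-valued Status column
    ALTER TABLE dbo.Document ADD Status NVARCHAR(20) NULL;
    GO

    UPDATE dbo.Document
    SET Status = CASE WHEN IsArchived = 1 THEN 'Archived' ELSE 'Active' END;

    ALTER TABLE dbo.Document ALTER COLUMN Status NVARCHAR(20) NOT NULL;
    ALTER TABLE dbo.Document DROP COLUMN IsArchived;
    GO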

In theory, at least, a complete database (containing only reference data) could be built by simply running all the schema changes in order from 000.sql onwards. In practice we don't regularly do this, but rather copy our production database to dev and then apply the change scripts before running regression tests prior to a release. This serves to test the change scripts themselves, but is only practical with a medium-sized production database.

araqnid
Sounds thorough anyway! I think I'd rather be able to recreate the current production database from a set of 'master' scripts than rely on a lot of change update scripts - I can see how this could get complicated...
Paddy
@Paddy as mentioned, recreating the database from scratch simply isn't something we've got a need to do; although that's based on us being able to simply copy the prod database to dev and apply the outstanding changes. By contrast, having to evolve models and migrate data is something we have to do a lot, and you'll always need to write some sort of script to do that if you're making any non-trivial changes.
araqnid
True. We don't have the option of getting our hands on a copy of the production database(s), so this might not be for us.
Paddy
+2  A: 

If you're already using Red Gate tools, you might consider using SQL Source Control, which works side by side with SQL Compare and SQL Data Compare to allow one version of the truth to exist in source control. It's in early access at the moment, but most of the functionality is in there to be tried out. You can download it from http://www.red-gate.com/Products/SQL_Source_Control/index.htm. However, it only supports SVN and TFS for the moment. Have you standardized on Git?

David (Product Manager at Red Gate)

David Atkinson
That looks very interesting (particularly if it's as easy to use as the other tools). We are currently very happy with GIT and the ease of branching, but we may be taking another look at TFS with the release of VS 2010.
Paddy
Just by the by, do you have any plans to extend the range of SCM software that you will support? I see a lot more people using Git around here.
Paddy
We're starting off with SVN and TFS and we'll consider the others on a case by case basis. Git seems to be increasingly popular and we've already had a number of requests. If we see an accelerated uptake of this, of course we'll want to support it. Have you tried our Early Access build? If you don't have SVN or TFS you can request a login using our hosted SVN server. We'd love to get your thoughts.
David Atkinson
+4  A: 

Have a look at this five-part series on the principles and practices of database version control (by K. Scott Allen):

  1. Three rules for database work
  2. The Baseline
  3. Change Scripts
  4. Views, Stored Procedures and the Like
  5. Branching and Merging

All five parts are important, but the basic idea is to have a baseline and then change scripts (tracked with a version table). Updating a database means applying the change scripts numbered "above" its current version. This strategy is also very VCS-friendly (no conflicts).
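As a rough illustration of the version-table idea (the names here are invented, not taken from the articles):

    -- The baseline script creates this log; every change script adds a row to it
    CREATE TABLE dbo.SchemaChangeLog
    (
        ScriptNumber INT      NOT NULL PRIMARY KEY,
        AppliedOn    DATETIME NOT NULL DEFAULT GETDATE()
    );

    -- A deployment step applies only the change scripts numbered above this value
    SELECT MAX(ScriptNumber) AS CurrentVersion
    FROM dbo.SchemaChangeLog;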

Pascal Thivent
A: 

There's a special tool for this exact thing. It's called Wizardby:

...database continuous integration & schema migration framework

Wizardby Workflow

Anton Gogolev
Looks interesting, must download for some more perusal - this is an alpha release however, so not sure how it would go down in here.
Paddy
A: 

I am not very familiar with the Red Gate toolkit, but if it is anything like DBGhost, there must be a utility that allows you to script the database objects out to files, one per object. In that case I would suggest the following:

  • add a daily job (or a build step) to reverse-engineer the DEV database into the directory structure
  • then compare it to what you have in the repository (by means of a simple diff), and FAIL the build job and report the diff if there is any. This indicates that the structure of the DEV database has changed and is not reflected in source control,
  • which tells the developer to add the changes to source control (the reported .diff file can even be used for this)

If you have many DEV databases (one per user or development branch) and this is too cumbersome, then probably a better combination would be to perform such a task on the STAGE (TEST just before release) version of the database. In that case you would store the PROD schema in the repository and update it from STAGE only during the pre-release testing phase, where you ensure that your schema changes are also in the repository.

This way developers can still work in the usual way, changing the schema on the DEV database first, and hopefully you get the balance between flexibility and the 'one version of the truth' that you would like.

In my team we add changes to VCS as soon as we change the DEV database, but we still have such a task to compare the schemas of the different databases (DEV, STAGE and PROD). Basically, we follow what I once answered in 'How should you build your database from source control?'.

van
A: 

Assuming that you use the .NET Framework, have a look at Fluent Migrator and also the Herding Code podcast episode that talks about the project.
The main aim, as I see it, is to write the migrations easily alongside your normal coding, using a fluent interface and a database-agnostic approach.

It is built on top of the .NET Framework and works with a number of databases, including SQL Server, SQLite and MySQL.

The advantage of this approach is that it lives with the rest of your code and can therefore be managed by your SCM.

Example:

   // Each migration is numbered; the runner applies them in order.
   [Migration(1)]
   public class CreateProjectsTable : Migration
   {
       public override void Up()
       {
           Create.Table("Projects")
               .WithIdColumn()   // helper extension from the sample that adds an identity Id column
               .WithColumn("Name").AsString().NotNullable()
               .WithColumn("Position").AsInt32().NotNullable()
               .WithColumn("Done").AsBoolean().NotNullable();
       }

       public override void Down()
       {
           Database.RemoveTable("Projects");
       }
   }
Nathan Fisher
A: 

At work we make heavy use of a powerful tool called Migrations, which comes as part of ActiveRecord (the default ORM that ships with the Rails web framework).

A basic migration would look like the following:

class AddSystems < ActiveRecord::Migration
  def self.up
    create_table :systems do |t|
      t.string  :name
      t.string  :label
      t.text    :value
      t.string  :type
      t.integer :position, :default => 1
      t.timestamps
    end
  end

  def self.down
    drop_table :systems
  end
end

A migration is created for every database change, and they are ordered sequentially by timestamp. You can run pre-defined tasks to apply these migrations in the proper order, so your database can always be created and/or rolled back. Some of those tasks are below:

rake db:migrate #run all outstanding migrations
rake db:rollback #roll back the last run migration
rake db:seed #fill the database with your seed data

Migrations have methods for creating tables, dropping tables, updating tables, adding indexes, etc. The full suite. The migrations also automatically add an id column, and the t.timestamps section automatically generates a "created_at" field and an "updated_at" field.

Most languages have ORM facilities such as these, and they allow the database to be maintained in a code-like state, which is easy for developers to understand as well as simple enough for DBAs to use and maintain.

Mike Trpcic