views:

193

answers:

6

There are several questions on SO about version control for SQL and lots of resources on the web, but I can't find something that quite covers what I'm trying to do.

First off, I'm talking about a methodology here. I'm familiar with the various source control applications out there and I'm familiar with tools like Red Gate's SQL Compare, etc. and I know how to write an application to check things in and out of my source control system automatically. If there is a tool which would be particularly helpful in providing a whole new methodology or which have a useful and uncommon functionality then great, but for the tasks mentioned above I'm already set.

The requirements that I'm trying to meet are:

  • The database schema and look-up table data are versioned
  • DML scripts for data fixes to larger tables are versioned
  • A server can be promoted from version N to version N + X where X may not always be 1
  • Code isn't duplicated within the version control system - for example, if I add a column to a table I don't want to have to make sure that the change is in both a create script and an alter script
  • The system needs to support multiple clients who are at various versions for the application (trying to get them all up to within 1 or 2 releases, but not there yet)

Some organizations keep incremental change scripts in their version control and to get from version N to N + 3 you would have to run scripts for N->N+1 then N+1->N+2 then N+2->N+3. Some of these scripts can be repetitive (for example, a column is added but then later it is altered to change the data type). We're trying to avoid that repetitiveness since some of the client DBs can be very large, so these changes might take longer than necessary.

Some organizations will simply keep a full database build script at each version level then use a tool like SQL Compare to bring a database up to one of those versions. The problem here is that intermixing DML scripts can be a problem. Imagine a scenario where I add a column, use a DML script to fill said column, then in a later version that column name is changed.

Perhaps there is some hybrid solution? Maybe I'm just asking for too much? Any ideas or suggestions would be greatly appreciated though.

If the moderators think that this would be more appropriate as a community wiki, please let me know.

Thanks!

+1  A: 

for this kind of issue use Visual studio team system 2008 for version controlling of your sql database.

In tsf there are no. of feature avialbe like

  • Datacompare
  • Schemacompare
  • version controlling

about database version control : http://www.codinghorror.com/blog/2006/12/is-your-database-under-version-control.html for more detail check : http://msdn.microsoft.com/en-us/library/ms364062(VS.80).aspx

Pranay Rana
+3  A: 

You've got quite a rigorous set of requirements, I'm not sure whether you'll find something that puts checks in all the boxes, especially the multiple concurrent schemas and the intelligent version control.

The most promising tool that I've read about that kind of fits is Liquibase.
Here are some additional links:

crowne
Thanks for the suggestion. Yes, definitely a tough list of requirements, but I'm hopeful :) I'll see what LiquiBase can do.
Tom H.
+2  A: 

Yes, you're asking for a lot, but they're all really pertinent points! Here at Red Gate we're moving towards a complete database development solution with our SQL Source Control SSMS extension and we're facing similar challenges.

http://www.red-gate.com/products/SQL_Source_Control/index.htm

For the upcoming release we're fully supporting schema changes, and supporting static data indirectly via our SQL Data Compare tool. All changes are saved as creation scripts, although when you're updating or deploying to a database, the tool will ensure that the changes are applied appropriately as an ALTER or CREATE.

The most challenging requirement that doesn't yet have a simple solution is version management and deployment, which you describe very clearly. If you make complex changes to the schema and data, it may be inevitable that a handcrafted migration script is constructed to get between two adjacent versions, as not all of the 'intent' is always saved alongside a newer version. Column renames are a prime example. The solution could be for a system to be devised that saves the intent, or if this is too complex, allows the user to supply a custom script to perform the complex change. Some sort of version management framework would manage these and "magically" construct deployment scripts from two arbitrary versions.

As a product manager working on this very issue, I'd be happy to discuss this further. Contact me at David dot Atkinson at red-gate.com

David Atkinson
+2  A: 

I struggled with this for several years before recently adopting a strategy that seems to work pretty well. Key points I live by:

  • The database doesn't need to be independently versioned from the app
  • All database update scripts should be idempotent

As a result, I no longer create any kind of version tables. I simply add changes to a numbered sequence of .sql files that can be applied at any given time without corrupting the database. If it makes things easier, I'll write a simple installer screen for the app to allow administrators to run these scripts whenever they like.

Of course, this method does impose a few requirements on the database design:

  • All schema changes are done through script - no GUI work.
  • Extra care must be taken to ensure all keys, constraints, etc.. are named so they can be referenced by a later update script, if necessary.
  • All update scripts should check for existing conditions.

Examples from a recent project:

001.sql:

if object_id(N'dbo.Registrations') is null 
begin
    create table dbo.Registrations
    (
        [Id]                    uniqueidentifier not null,
        [SourceA]               nvarchar(50)     null,
        [SourceB]               nvarchar(50)     null,
        [Title]                 nvarchar(50)     not null,
        [Occupation]            nvarchar(50)     not null,
        [EmailAddress]          nvarchar(100)    not null,
        [FirstName]             nvarchar(50)     not null,
        [LastName]              nvarchar(50)     not null,
        [ClinicName]            nvarchar(200)    not null,
        [ClinicAddress]         nvarchar(50)     not null,
        [ClinicCity]            nvarchar(50)     not null,
        [ClinicState]           nchar(2)         not null,
        [ClinicPostal]          nvarchar(10)     not null,
        [ClinicPhoneNumber]     nvarchar(10)     not null,
        [ClinicPhoneExtension]  nvarchar(10)     not null,
        [ClinicFaxNumber]       nvarchar(10)     not null,
        [NumberOfVets]          int              not null,  
        [IpAddress]             nvarchar(20)     not null,
        [MailOptIn]             bit              not null,
        [EmailOptIn]            bit              not null,
        [Created]               datetime         not null,
        [Modified]              datetime         not null,
        [Deleted]               datetime         null
    );
end

if not exists(select 1 from information_schema.table_constraints where constraint_name = 'pk_registrations')
    alter table dbo.Registrations add
        constraint pk_registrations primary key nonclustered (Id);

if not exists (select 1 from sysindexes where [name] = 'ix_registrations_created')
    create clustered index ix_registrations_created
        on dbo.Registrations(Created);

if not exists (select 1 from sysindexes where [name] = 'ix_registrations_email')
    create index ix_registrations_email
        on dbo.Registrations(EmailAddress);

if not exists (select 1 from sysindexes where [name] = 'ix_registrations_email')
    create index ix_registrations_name_and_clinic
        on dbo.Registrations (FirstName,
                              LastName,
                              ClinicName);

002.sql

/**********************************************************************
  The original schema allowed null for these columns, but we don't want
  that, so update existing nulls and change the columns to disallow 
  null values
 *********************************************************************/

update dbo.Registrations set SourceA = '' where SourceA is null;
update dbo.Registrations set SourceB = '' where SourceB is null;
alter table dbo.Registrations alter column SourceA nvarchar(50) not null;
alter table dbo.Registrations alter column SourceB nvarchar(50) not null;

/**********************************************************************
  The client wanted to modify the signup form to include a fax opt-in
 *********************************************************************/

if not exists 
(
    select 1 
      from information_schema.columns
     where table_schema = 'dbo'
       and table_name   = 'Registrations'
       and column_name  = 'FaxOptIn'
)
alter table dbo.Registrations 
    add FaxOptIn bit null 
        constraint df_registrations_faxoptin default 0;

003.sql, 004.sql, etc...

At any given time I can run the entire series of scripts against the database in any state and know that things will be immediately brought up to speed with the current version of the app. Because everything is scripted, it's much easier to build a simple installer to do this, and it's adding the schema changes to source control is no problem at all.

Chris
Thanks. This is similar to what "pdc" posted in the link from Paddy. It's what I've been trying to sell for the past day now. Some of the people here are still tied to the idea of a model database with extra metadata and then using a compare tool along with additional, but maybe I can sway them. My approach actually still keeps scripts for a single object in one file as much as possible and as new changes happen they go in that file instead of constantly adding new scripts. Ordering between versions is the only challenge and I think I have a solution for that as well.
Tom H.
+1  A: 

We are using SQL Examiner for keeping database schema under version control. I've tried the VS2010 also, but in my opinion VS approach is too complex for small and mid-size projects. With SQL Examiner I mostly work with SSMS and use SQL Examiner to check-in updates to SVN (TFS and SourceSafe is supported also, but I never tried it).

Here is description of SQL Examiner's approach: How to get your database under version control

SQLDev
Thanks for the suggestion. Unfortunately, that only works in getting scripts into your source control system. My problem is at the time of deployment - automating that across many clients who are on different versions of the software.
Tom H.
SQL Examiner compares database scripts stored in the SVN with a target database and produces schema migration script. Also, you can compare version 10 in SVN with version 12 in SVN and generate script to migrate schema from version 10 to version 12.
SQLDev
A: 

Hey mate,

Try DBSourceTools. (http://dbsourcetools.codeplex.com)
Its open source, and specifically designed to script an entire database - tables, views, procs to disk, and then re-create that database through a deployment target.
You can script all data, or just specify which tables to script data for.
Additionally, you can zip up the results for distribution.
We use it for source control of databases, and to test update patches for new releases.
In the back-end it's built around SMO, and thus supports SQL 2000, 2005 and 2008.
DBDiff is integrated, to allow for schema comparisons.
Have fun, - Nathan.

blorkfish