views:

232

answers:

4

We have a few customers with large data sets and during our upgrade procedure we need to modify the schema of various tables (adding some columns, renaming others, occasionally changing data types, but that's rare).

Previously we've been going via a temporary table with the new schema, and then dropping the original and renaming the temp table but I'm hoping to speed that up dramatically by using ALTER table ... instead.

My question is what data integrity and error handling issues do I need to consider? Should I enclose all changes to a table in a transaction (and if so, how?) or will the DBMS guarantee atomicity and integrity over an ALTER operation?

We already heavily recommend customers backup their data before starting the upgrade so that should always be a fall back option.

We need to target SQLServer 2005 and Oracle, but obviously I can add conditional code if they require different approaches.

+1  A: 

If you're using SQL Server then ddl statements are transactional, so wrap in a transaction (I don't think this applies to Oracle though).

We split upgrades into individual patches that go with a particular feature. Which patches are applied go in a database_patch_history table, and it's easy to see which patches were applied and how to roll them back.

As you say, taking a backup before you start is important.

Neil Barnwell
A: 

If you dramatically change any data types of columns, for instance change a VARCHAR to an INT, the DBMS will panic and you will probably loose that data. Luckily, nowadays DBMSs are intelligent enough to do some data type conversions without loosing the data, but you don't want to run the risk of damaging any of it when making the alterations.

You shouldn't loose any data by renaming columns and definitely won't by adding new columns, it's when you move the data about that you have to be concerned.

Firstly, backup the entire table, both the schema and data, so at a second's notice you can roll back to the previous schema. Secondly, look at the alterations you are trying to make, see how drastic they are - try to figure out exactly what needs to change. If you're making datatype conversions push that data to an intermediatery table first with 3 columns, the foreign key (id or whatever so you can locate the row), the old data and the new column. Then either push the old data to the new column directly, or convert it at the application-level.

When it's all in the correct types and everything's been successful, run the ALTER statements and repopulate the database! It's simple enough to do, just needs a logical thought process.

Jamie Rumbelow
I wouldn't worry about losing data on datatype conversions too much. Just try it on a test database and see if it goes wrong. These type of scripts should get as much testing attention as the app, so if something doesn't work, you'll find out long before deploying to a live app.
Neil Barnwell
Oracle never panics :) ... but it will report ORA-01439 "column to be modified must be empty to change datatype" ... to change datatype you have to add a new column, update it, drop the old column, then rename the new column.
Jeffrey Kemp
+3  A: 

Comments for Oracle only:

  • Table alterations are DDL, so the concept of a transaction doesn't apply - every DDL statement locks the table for the duration of the operation and either succeeds or fails.

  • Adding (nullable!) columns or renaming existing columns is a relatively lightweight process and shouldn't present any problems if the table lock can be acquired.

  • If you're adding/modifying constraints (either NOT NULL or other more complex check constraints) Oracle will check existing data to validate the constraints unless you add the ENABLE NOVALIDATE clause to the constraint DDL. The validation of existing data can be a lengthy process for large tables.

  • If you're scripting the upgrade to be run as a SQL*Plus script, save yourself a lot of headaches by using the "whenever sqlerror exit sql.sqlcode" directive to abort the script on the first failure to make the review of partially implemented upgrades easier.

  • If the upgrade must be performed on a live system where you can neither control transactions or afford to miss them, consider using the Oracle DBMS_REDEFINITION package, which automatically creates a temporary configuration of temp tables and triggers to capture in-flight transactions while redefining the table in the "background". Warning - lots of work and a steep learning curve for this option.

dpbradley
+1 for DBMS_REDEFINITION
Jeffrey Kemp
+1  A: 

I have had to do changes like this in the past and have always been very paranoid about data loss. To help mitigate that risk I have always done tons of testing against "sandbox" databases that mirrored the target databases in schema and data as closely as possible. Test out the process as much as possible before rolling it out, just like you would any other area of the application.

Jesse Taber