tags:

views:

139

answers:

4

Scenario:

I have an application (C#) that expects a SQL database and login, which are set by a user. Once connected, it checks for the existence of several table and creates them if not found.

I'd like to expand on this by having the program be capable of adding columns to those tables if I release a new version of the program which relies upon the new columns.

Question:

What is the best way to programatically check the structure of an existing SQL table and create or update it to match an expected structure?

I am planning to iterate through the list of required columns and alter the existing table whenever it does not contain the new column. I can't help but wonder if there's an approach that is different or better.

Criteria:

Here are some of my expectations and self-imposed rules:

  • Newer versions of the program might no longer use certain columns, but they would be retained for data logging purposes. In other words, no columns will be removed.
  • Existing data in the table must be preserved, so the table cannot simply be dropped and recreated.
  • In all cases, newly added columns would allow null data, so the population of old records is taken care of by having default null values.

Example:

Here is a sample table (because visual examples help!):

id  datetime         sensor_name  sensor_status  x1    x2    x3    x4
1   20100513T151907  na019        OK             0.01  0.21  1.41  1.22
2   20100513T152907  na019        OK             0.02  0.23  1.45  1.52

Then, in a new version, I may want to add the column x5. The "x-columns" are all data-storage columns that accept null.

Edit:

I updated the sample table above. It is more of a log and not a parent table. So the sensors will repeatedly show up in this logging table with the values logged. A separate parent table contains the geographic and other logistical information about the sensor, making the table I wish to modify a child table.

+2  A: 

This is a very troublesome feature that you're thinking about implementing. i would advise against it and instead consider scripting changes using a 3rd party tool such as Red Gate's Sql Compare: http://www.red-gate.com/products/SQL_Compare/index.htm

If you're in doubt, consider downloading the trial version of the software and performing a structure diff script on two databases with some non-trivial differences. You'll see from the result that the considerations for such operations are far from simple.

The other way around this type of issue is to redesign your database using the EAV model: http://en.wikipedia.org/wiki/Entity-attribute-value_model (Pivots to dynamically add rows thus never changing the structure. It has its own issues but it's very flexible.)

(To utilize a diff tool you would have to have a copy of all of your db versions and create diff scripts which would go out and get executed with new releases and upgrades. That's a huge mess of its own to maintain. EAV is the way for a thing like this. It wrongfully gets a lot of flak for not being as performant as a traditional db structure but i've used it a number of times with great success. In fact, i have an HIPAA-compliant EAV db (Sql Server 2000) that's been in production for over six years with several of the EAV tables containing tens or millions of rows and it's still going strong w/ no big slow down. Of course we don't do heavy reporting against that db. For reports we have an export that flattens the data into a relational structure.)

Paul Sasik
Troublesome is right! I'd like to be able to release new versions to our customer so that they don't have to modify their database. I'm not sure (yet) how I'd utilize Sql Compare (as I am not really able to compare the *old* tables to the new ones (the new ones won't exist on their server). I will read up on EAV.
JYelton
Converting this project to EAV is probably the best solution. It will be a bit more long-term but it appears to deal with precisely this type of application.
JYelton
@JYelton - I disagree that an EAV is your best choice. Note the key piece near the end of Paul Sasik's post: "We don't do heavy reporting". That is a critical piece to the puzzle. Gleaning information, as opposed to just pulling data, is orders of magnitude more difficult in an EAV (along with data integrity and a host of other problems). If all you are doing is storing a wad of data that you never report against and never query for a specific attribute but instead always pull the entire wad of data, then an EAV is fine. Otherwise, you are probably asking for a maintenance headache.
Thomas
A: 

If you have to build something in such a way as to rely on the application making table changes, your design is flawed. You should have a related table for the sensor values (x1, x2, etc.), then you can just add another record rather than having to create a new column.

Suggested child table structure

READINGS ID int Reading_type varchar (10) Reading_Value int

Then data in the table would read:

ID Reading_type Reading_value 1 x1 2 1 x2 3 1 x3 1 2 x1 7

HLGEM
The example table represents a sensor with 4 values, however it is not a single entry. The same sensor will be queried regularly and its values logged. In reality there are hundreds of sensors and about 60 values. I will update my example to show this, but I'm not sure there's a way I can create a one-to-many relationship from this. I'd like to simply be able to add a new column to hold a new value if management comes up with a new attribute they wish to track.
JYelton
+1  A: 

The common solution i see would be to store in your database somewhere version information. maybe have a really small table:

CREATE TABLE DB_PROPERTIES (key varchar(100), value varchar(100));

then you could add a row:

    key | value
version | 12

Then you could just create a sql update script (or set of scripts) which updates the db from version 12 to version13.

declare v  varchar(100)
select v=value from DB_PROPERTIES where key='version'
if v ='12'
    #do upgrade from 12 to 13
elsif v='11'
    #do upgrade from 11 to 13

...and so on

depending on what upgrade paths you wanted to support you could add more cases. You could also obviously move this upgrade logic into C# and or whatever design works for you. But having the db version information stored in the database will make it much easier to figure out what is already there, rather than querying for all the db structures individually.

luke
This seems as difficult (or moreso) as my intention of checking each column against a list I already maintain in the program itself. Nevertheless, you made me think of storing what columns are added with which version, so there may be a way to avoid the iterative steps.
JYelton
A: 

Hey mate,
Try Microsoft.SqlServer.Management.Smo
These are a set of C# classes that provide an API to SQL Server database objects.
The Microsoft.SqlServer.Management.Smo.Table has a Columns Collection that will allow you to query and manipulate the columns.
Have fun.

blorkfish