views:

246

answers:

3

Hi Guys

I have a SQL Server 2005 data-mart star schema with the usual fact and dimensions tables. This is deployed and being populated via an SSIS based ETL package. All fine so far.

I have just been approached by the customer with a new requirement. Never, I hear you say! This requirement will mean I need to add a new dimension table to the data-mart to measure a new aspect of the incoming facts which happen to be financial.

To be able to 'slice' the facts by this new dimension I need to add a new foreign key column in the fact table linking to the new dimension.

I am unclear on the best way to do this. What should I do with the data that has already been captured? Just make the new column null-able and accept that old fact will have a NULL? Actually, as I am typing, its dawned on me that I could update old facts as well. Or maybe I should create a separate (child?) fact table which would just contain a link to each new (parent) fact and a link to the new dimension.

I've not been able to find any information on a best practice for this type of change.

Any help would be much appreciated.

By the way. No Analysis Services used yet.

Thanks, Martin

+5  A: 
  1. Add the new dimension table.

  2. Populate it.

  3. Add the nullable reference on your existing facts.

    Not all facts may be joinable to the new dimension. This is common when you have new information. If you have all of your original files, you may have all the information required to update all facts.

    If all facts cannot be mapped to the new dimension, add a "N/A" row to the dimension. Sometimes there's a good reason to have several N/A rows depending on what you know about your facts.

  4. Update your facts so they all reference the new dimension -- either the proper dimension value or the special N/A row.

Ideally, you'll modify your fact table to make the column not-nullable. Sometimes this takes a painfully long time, and it's easier to export the data, redefine the table, and reload the data.

S.Lott
This is pretty comprehensive.
Cade Roux
Many thanks for this. This is probably where I had been heading, but it is always good to have a sanity check on these things. I've found BI to be have a steep learning curve. I have been using an N/A column in other dimensions and like this approach rather than having Nulls. I like the suggestion of exporting the data and reloading.Cheers!
Octoplasm
@Octoplasm: Nulls are a bad thing because they defeat any sensible roll-up behavior. Rows just "sort of vanish" from result sets. You almost always want the "other" or "N/A" even if a report filters it out from display to users.
S.Lott
+2  A: 

Well S.Lot answered most of it, I would just add that what to do with the old facts is a business decision. Make sure you ask them and get it in writing.

Damir Sudarevic
+1  A: 

S.Lott has a great answer, if you combine that with your suggestion of a "child" fact table, you could have another fact table as a "bridge table" with the new dimension and all the other same dimensions, but think about it - if you are going to that trouble, you could just make a new fact table design identical to the old, but with a non-nullable new dimension (the only expense over your bridge table would be the space of the facts anyway). Populate the history appropriately into the new table - linking either to a valid dimension or the N/A dimension for history as S.Lott suggested. Then start populating new data. You never have to go back and change the nullable to non-nullable. Then you can just deprecate the old fact table - or swap your presentation layer views (if you have those, this gives you a lot mofr freedom underneath with your facts and dimensions).

Cade Roux