I'm setting up Fact and Dim tables and trying to figure out the best way to set up my time values. AdventureWorksDW uses a timekey (UID) for each time entry in the DimTime table. Is there any reason I shouldn't just use a time value instead, e.g. 0106090800? (My granularity is hourly.)

+3  A: 

"Intelligent keys" (in this case, a coded date and hour number) can lead to problems when you want to change definitions in your dimension. For example, your users might insist on a change from local time to UTC. Now your key is no longer a useful number: it still encodes the old local-time value.

Further, because of midnight roll-over, the date part of your intelligent key might not match the actual date once you switch between local time and UTC.
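To make the roll-over concrete, here is a minimal Python sketch. The `YYYYMMDDHH` key layout, the `intelligent_key` helper, and the fixed UTC-5 local offset are all assumptions chosen for illustration; none of them come from the question.

```python
from datetime import datetime, timedelta, timezone

# Assumption: local time is a fixed UTC-5 offset with no DST (illustration only).
LOCAL = timezone(timedelta(hours=-5))

def intelligent_key(dt: datetime) -> int:
    """Encode a datetime as a hypothetical YYYYMMDDHH 'intelligent' key."""
    return int(dt.strftime("%Y%m%d%H"))

# A fact recorded at 22:00 local time on 6 January 2009.
local_dt = datetime(2009, 1, 6, 22, 0, tzinfo=LOCAL)
utc_dt = local_dt.astimezone(timezone.utc)

local_key = intelligent_key(local_dt)  # 2009010622
utc_key = intelligent_key(utc_dt)      # 2009010703

# The date portion of the key changes when the definition moves to UTC.
print(local_key // 100, utc_key // 100)  # 20090106 20090107
```

The same instant ends up with two different "dates" baked into the key, so any report that parsed the date out of the old key would silently disagree with the new dimension rows.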

To prevent the key from becoming a problem, you can't use it for any calculation of any kind, in which case it's little better than a simple GUID or auto-increment number.

Auto-increment keys (or GUIDs) are fast and simple. Most importantly, they are trivially consistent across all dimensions.

Time happens to have a numeric mapping, but it helps to look at this as a weird coincidence, not a basis for a good design.
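As a rough sketch of the auto-increment approach, the following uses Python's built-in `sqlite3` module with an in-memory database. The table and column names (`dim_time`, `fact_reading`, `hour_start`, and so on) and the `time_key_for` helper are hypothetical, not taken from AdventureWorksDW.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dim_time (
        time_key    INTEGER PRIMARY KEY AUTOINCREMENT,  -- meaningless surrogate
        hour_start  TEXT NOT NULL UNIQUE,               -- e.g. '2009-01-06 08:00'
        hour_of_day INTEGER NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE fact_reading (
        time_key INTEGER NOT NULL REFERENCES dim_time(time_key),
        value    REAL NOT NULL
    )
""")

# Populate two days of hourly rows; the database assigns the keys.
start = datetime(2009, 1, 6, 0, 0)
rows = []
for h in range(48):
    ts = start + timedelta(hours=h)
    rows.append((ts.strftime("%Y-%m-%d %H:%M"), ts.hour))
conn.executemany(
    "INSERT INTO dim_time (hour_start, hour_of_day) VALUES (?, ?)", rows
)

def time_key_for(hour_start: str) -> int:
    """Look the surrogate key up by attribute; never compute it."""
    row = conn.execute(
        "SELECT time_key FROM dim_time WHERE hour_start = ?", (hour_start,)
    ).fetchone()
    return row[0]

conn.execute(
    "INSERT INTO fact_reading VALUES (?, ?)",
    (time_key_for("2009-01-06 08:00"), 42.0),
)

# Reports read time attributes from the dimension, not from the key.
result = conn.execute("""
    SELECT d.hour_start, d.hour_of_day, f.value
    FROM fact_reading AS f
    JOIN dim_time AS d ON d.time_key = f.time_key
""").fetchall()
```

Because the fact table only ever stores the opaque `time_key`, redefining `hour_start` (for instance, switching it to UTC) touches the dimension rows alone.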

S.Lott
+1  A: 

Here's Ralph Kimball's latest on the time dimension. It's dated 2004, but it's still good.

This one will help, too.

duffymo
A: 

The primary key should be surrogate, meaningless -- however, using YYYYMMDD for date dimension key is hard to resist, and also allows for easy table partitioning. The trick is that it should still be regarded as meaningless -- the fact that it looks like a date should be regarded as purely coincidental. This key should never be exposed to business users.
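A small Python sketch of why the date-looking key should still be treated as opaque. The `date_key` helper and the in-memory `dim_date` dictionary are hypothetical stand-ins for a real date dimension table.

```python
from datetime import date, timedelta

def date_key(d: date) -> int:
    """Build a YYYYMMDD-style integer key (used only when loading the dimension)."""
    return d.year * 10000 + d.month * 100 + d.day

# Hypothetical date dimension for 2009, keyed by the YYYYMMDD integer.
dim_date = {}
d = date(2009, 1, 1)
while d <= date(2009, 12, 31):
    dim_date[date_key(d)] = {
        "full_date": d,
        "month": d.month,
        "iso_weekday": d.isoweekday(),
    }
    d += timedelta(days=1)

key = date_key(date(2009, 1, 31))  # 20090131

# Arithmetic on the key itself breaks at month boundaries.
naive_next = key + 1               # 20090132, which matches no row
print(naive_next in dim_date)      # False

# Going through the dimension's attributes gives the right answer.
next_day = dim_date[key]["full_date"] + timedelta(days=1)
next_key = date_key(next_day)      # 20090201
```

The key sorts chronologically, which is what makes range partitioning convenient, but anything beyond ordering should come from the dimension's attributes.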

Damir Sudarevic