views:

103

answers:

1

I'm designing the Fact and Dimension tables for a data warehouse currently using SQL Server, SSIS, and SSAS. Will I get any real benefit from programming the relationships between the dimensions and the fact tables into SQL? Or am I better off just defining the relationships manually when it comes time to create the cubes?

It seems to be easier to load and transform the data if I do not have any constraints upon my data insertion into the tables and therefore leave out the relationships.

+4  A: 

I am interpreting "programming the relationships" as meaning to put foreign key constraints on the tables.

No, in a data warehouse you should not impose primary key or foreign key constraints on the fact tables.

You've mentioned some issues, and another problem is that these constraints place a performance overhead when inserting rows, which will make the ETL process more expensive.

To someone only experienced with transactional database design, this might go against everything they've learnt and experienced. Foreign key constraints are vital for databases where you have multiple processes modifying data at the same time. There's a definite risk of two processes screwing up the data somehow, in spite of the best efforts of developers. The constraints are a fundamentally vital safety net.

In a dimensional model, the database is only ever populated by the one ETL process, and in a highly controlled way. This significantly decreases the risk of the data getting corrupted, to the point where the extra cost of constraints just aren't worth it.

Andrew Shepherd
Perfect answer! Thanks!
Sangheili