Consider a dimensional model with fact tables like (fk_dim1value, fk_dim2value, ..., value)
where the fk_X
columns are foreign keys into corresponding trivial dimension tables dim1value (id, value), dim2value (id, value),
etc.
These fact-and-dimension tables are collected automatically from disparate sources, so there's a lot of them ... and they are redundant: all the dimension value tables are structurally identical, (id, value)
, representing simple collections of textual values with no further semantics (the only difference being the different foreign keys referencing them in the various fact tables). Less trivial dimension types will probably come up later, but the set of different types of dimensions will remain small.
So I want to merge the dimension tables into one table dimvalue (fk_dim, dimvalue_id, value)
where fk_dim
references a table dimension (dim_id, name)
, and dimvalue_id
is unique only within each dimension. The natural primary key is then composite: (fk_dim, dimvalue_id)
.
The fact table foreign-key columns now all reference the same table, dimvalue (fk_dim, dimvalue_id, value)
... but of course each column is associated with a particular dimension and thus should still be limited to referencing the values of that dimension specifically (a horizontal partition of the unified table dimvalue
).
Is there a (sensible) way to do this?
I mean something like a “half-composite” foreign key, i.e. a single-column reference to a “slice” of a composite PK, with a fixed value for the other column(s). A “fully-composite” FK would be FOREIGN KEY (col1, col2) REFERENCES dimvalue (fk_dim, dimvalue_id)
but here fk_dim
is fixed and so the “home” side of the key is just one column, referencing the second column of the dimvalue
primary key; something like FOREIGN KEY (fk_dim7value) REFERENCES dimvalue (fk_dim=7, dimvalue_id)
.
Is something like that possible? Or am I losing my way in this last paragraph? Should I give up and just foreign-key to the whole dimvalue
table and then add check constraints to limit by dimension? Or does referential integrity require me to give up even more and just accept all the separate identical tables?
(Impact of constraints on write performance is not important; read performance is a design goal.)