Option #1 ensures that each page
belongs to one and only one other
entity, although the relationship cant
actually be enforced as the parentId
field can point to more than one
place.
Right. In terms of relational theory, the problem is that your "parentId
" column violates third normal form, because its meaning varies per row, based on the value in parentType
(a non-key column).
You wouldn't have a properly designed database if a single column could contains someone's phone number or their birthdate, per row, depending on some other flag. Those are two different facts about the person, and they each deserve their own column. Likewise, storing both site_id or node_id in a single column would have the same problem.
Another clue that this is a flawed design is that you can't declare a foreign key constraint to point to either of two referenced tables.
Option #2 is cleaner, but it's
basically saying that the site
"belongs" to the two error and not
found pages, and that's probably bad
practice.
I see why you're saying that, because of the belongs to conventions in Rails-like frameworks. But these are conventions; they aren't necessarily the only relationship that foreign keys can model. You can make one entity refer to exactly one other entity, in a has one relationship. In this case, the foreign key reverses direction.
I would say it's logically true that the Error page and the Not Found page belong to the site, not the other way around. And the way to make them mandatory is to have another entity reference these pages, and apply the NOT NULL
constraint to these references. This is what you've described.
CREATE TABLE site (
. . .
error_page_id INT NOT NULL,
notfound_page_id INT NOT NULL,
FOREIGN KEY (error_page_id) REFERENCES pages (page_id),
FOREIGN KEY (notfound_page_id) REFERENCES pages (page_id)
);
This meets your immediate need, it's enforceable, and it's in Normal Form.
@NXC suggests making dummy nodes for Error and Not Found pages. Although this allows these nodes to be stored in the node hierarchy, it fails to enforce that a site must have these pages. That is, a site could be stored without references to these nodes.
@Tony Andrews suggests storing two columns in each page, site_id
and site_node_id
, and adding a CHECK constraint to ensure that exactly one of these is non-NULL. This seems better than the parent_id
/parent_type
option, but it still doesn't offer any enforcement that every site must have an Error and a Not Found page.