ansaurus

Question

Answer 1

+3 A:

Make dummy site nodes for the error or not found page. You can mark them as a specific type of node as per your first option. This will make it easier to make a generic handler mechanism. It will also make joins simpler which will help with database query performance. Additionally, it allows you to add more types of 'special' pages (perhaps a login screen) or make this configurable without having to modify your database schema.

ConcernedOfTunbridgeWells 2009-02-11 17:02:56

Answer 2

A:

Another option is to have 2 columns site_id and site_node_id like this:

create table pages
 ( page_id ... primary key
 , site_id references sites
 , site_node_id references site_nodes
 , ...
 , constraint site_or_node check (  site_id is null and site_node_id is not null
                                 or site_id is not null and site_node_id is null
                                 )
 );

Now you can use referential integrity to ensure that every page belongs to either a site or a node, and not both.

Tony Andrews 2009-02-11 17:16:30

Answer 3

+2 A:

Option #1 ensures that each page belongs to one and only one other entity, although the relationship cant actually be enforced as the parentId field can point to more than one place.

Right. In terms of relational theory, the problem is that your "parentId" column violates third normal form, because its meaning varies per row, based on the value in parentType (a non-key column).

You wouldn't have a properly designed database if a single column could contains someone's phone number or their birthdate, per row, depending on some other flag. Those are two different facts about the person, and they each deserve their own column. Likewise, storing both site_id or node_id in a single column would have the same problem.

Another clue that this is a flawed design is that you can't declare a foreign key constraint to point to either of two referenced tables.

Option #2 is cleaner, but it's basically saying that the site "belongs" to the two error and not found pages, and that's probably bad practice.

I see why you're saying that, because of the belongs to conventions in Rails-like frameworks. But these are conventions; they aren't necessarily the only relationship that foreign keys can model. You can make one entity refer to exactly one other entity, in a has one relationship. In this case, the foreign key reverses direction.

I would say it's logically true that the Error page and the Not Found page belong to the site, not the other way around. And the way to make them mandatory is to have another entity reference these pages, and apply the NOT NULL constraint to these references. This is what you've described.

CREATE TABLE site (
  . . .
  error_page_id     INT NOT NULL,
  notfound_page_id  INT NOT NULL,
  FOREIGN KEY (error_page_id)    REFERENCES pages (page_id),
  FOREIGN KEY (notfound_page_id) REFERENCES pages (page_id)
);

This meets your immediate need, it's enforceable, and it's in Normal Form.

@NXC suggests making dummy nodes for Error and Not Found pages. Although this allows these nodes to be stored in the node hierarchy, it fails to enforce that a site must have these pages. That is, a site could be stored without references to these nodes.

@Tony Andrews suggests storing two columns in each page, site_id and site_node_id, and adding a CHECK constraint to ensure that exactly one of these is non-NULL. This seems better than the parent_id/parent_type option, but it still doesn't offer any enforcement that every site must have an Error and a Not Found page.

Bill Karwin 2009-02-11 17:27:46

Thanks, that does make the most sense. As the pages cant be deleted via a cascade delete with that relationship (as the FK is reversed), would you suggest using a trigger to delete the linked error pages, or something else?

Jack Sleight 2009-02-11 17:40:36

Probably a trigger if you want to do it automatically. But you might want a given error page to be used by multiple sites, so in that case maybe you don't want it to be automatic.

Bill Karwin 2009-02-11 17:48:42

Thanks Bill, going to go with this option. Good point about the trigger, need to decide how I want to do that.

Jack Sleight 2009-02-11 17:55:04

What happens when you MUST have term of service page and a registration page and something else page? This sounds like you need a new table "required_pages".

jmucchiello 2009-02-11 18:24:32

@joe: Yes I qualified my answer with, "this meets your immediate need." If you have more mandatory pages, of course you need more columns. I don't see a need to store this in a separate table. The `sites` table is not likely to be large, or to be subject to frequent updates.

Bill Karwin 2009-02-11 18:40:53

Answer 4

A:

Option 2 makes much more sense and will save your brain later down the line if and when more complications spring up. The one-to-one relationship of site to error/notfound page makes it perfect for a foreign key constraint.

notnot 2009-02-11 17:37:48

Answer 5

A:

A modification of Option 1.

Include two separate columns, ParentNodeID and ParentSiteID. Leave one of these two columns NULL, depending on the case. Now you can still declare a foreign key (references) constraint for each foreign key.

I don't really understand the SiteNotFound case. Could you leave both foreign keys NULL in this case?

Your joins and searches will be simpler. You will also be adhering to 1NF. This is not a coincidence.

Your option 1 combines values drawn from different domains in a single field. This is bad field design and IIRC violates 1NF.

Walter Mitty 2009-02-11 18:22:06

ansaurus

tags:

views:

answers:

Tricky relational DB design question

related questions