views:

455

answers:

6

When I want to design database tables that model a finite parent-child relationship, for e.g., computer as parent, and the components inside as the child; or a simple organizational hierarchy, I am usually torn at which approach to use -

  1. A 'specialized table' approach, in which I create a table for each of the possible entity. In the computer component example, I would have a Computer table and a Component table, with the ComputerID as FK in the Component table referencing back to Computer.ComputerID. or
  2. A 'generalized table approach, in which I have one table table called Component, with ComponentID as PK, and a ParentComponentID as FK referencing to its parent's ComponentID.

Guys, what's your advice? Thanks

+1  A: 

I prefer the generalized table if I know I need the ability to extend the model to support new levels in a hierarchy.

The disadvantage with a generalized table is losing the strong typing, which I've usually solved by adding a 'Type' table (i.e. 'ComponentType' for the generic 'Component' table).

This allows for a model that can be extended, and still provides strong typing of each component in the hierarchy.

EDIT: Marcus brought up the question of Computer being the same as a Component. I should clarify that the generalized table approach should be used for like objects.

Jay S
Jay makes good points too. Basically, both ways are fine, but comes down to preference (or company policy).
Nick DeVore
+4  A: 

The reason I choose option 1 over 2 most of the time is because it is easier to pull my parent rows out of the database. Otherwise, you have to write the sql statement such that you get the rows from the table that have children rows, but only those. It gets messy quick.

If you're using any of the common relational databases, then use them the way they're designed to be used.

Nick DeVore
A: 

Having used both, the former is definitely easier to grasp by new programmers. It is also simpler to deal with in most simple cases.

Where the latter comes in handy, is where the hierarchy of things could change; or an object changes its hierarchical order or the hierarchy itself changes. It is also useful when the hierarchy is complex, as you don't have to model it as tables in the database.

Loki
+2  A: 

There's a great danger in relational design when you try to create generalisations that aren't actually true, but look elegant. If you have a general table with a type field, that introduces a form of indirection that has to be resolved in your code instead, which can hurt both performance and the clarity of the code.

On the other hand, such indirection can have its uses, and for certain purposes I use it quite a lot. The question you have to ask is whether Computer and Component really are aspects of the same thing, or whether you're just fudging them together because it sounds good.

Marcus Downing
thanks for your reply, check my own reply below.
chethong
Very good point Marcus. In this example, components are things inside of a Computer, and the computer itself is probably not a component. However, if you had several types of components (i.e. RAM, Processor, Fan, Motherboard) those could be typed easily.
Jay S
+1  A: 

In my experience, a Generalized table approach is tempting, because it gives you more power, but in the long term you will find it is complicated to follow and maintain.

Eduardo Molteni
A: 

Good answers from all. I can agree that the generalized approach can be tempting sometimes and will not be as easy for beginners to grasp and write proper SQL statement (think recursion). When I do use the generalized approach, I will have additional fields to give it strong typing. To Marcus, it is true that sometimes I may be just fudging them together because it sounds good, and I don't even realize it myself. Therefore I think the computer-component should use the specialized approach. However, for the organizational hierarchy, where those are just all people's record, I think the generalized approach could be better. So may be it really depends on situation, and I should not be expecting a general answer.

chethong
When in doubt, the answer is always "depends" ;)
Jay S
@chethong why don't you EDIT your question, append this answer and delete it?
Boris Pavlović
thanks Boris. I'm still finding it weird to see a comment to my own question, and having answers below the questions. Guess i'm still new here, will observe more.
chethong
Yes, Jay. Programming is fun because there's no right answer to anything.
chethong