views:

39

answers:

3

A supply farm can have a transportation document. If present it can be one of two types: internal or external. Both documents share some common data, but have different specialized fields.

I though of modeling this in a OO-ish fashion like this: alt text

In the document table, one of the two doc_*_id is null, the other is the foreign key with the corresponding table.

That is opposed to the other schema where the common data is redundant: alt text

I'm trying to discover pros&cons of both approaches.

How do I SELECT to know all the internal docs in both cases? We have a sort of mutually exclusive foreign keys, the JOINs are not so trivial.

Is the first approach completely junky?

+1  A: 

Both approaches are correct and their usage will totally depend on the use cases, the kind and volume of data you want to store and the type of queries you want to mostly fire. You can also think of combining these two strategies when the inheritance hierarchies are complex.

One use case where the first approach would be preferred I think is when you want to search through all the documents, for example, based on description or any common field.

This document (although specific to hibernate) can provide a little more insight on different inheritance modelling strategies.

Gopi
+1  A: 

Classical ER modeling doesn't include foreign keys, and the gist of your question revolves around how the foreign keys are going to work. I think that what you are really doing is relational modeling, even though you are using ER diagrams.

In terms of relational modeling, there is a third way to model inheritance. That is to use the same ID for the specialized tables as is used for the generalized table. Then the ID field of the doc_internal table is both the primary key for the doc_internal table and also a foreign key referencing the supply_farm table. Ditto for the doc_external table.

The ID field in the supply_farm table is both the primary key of the supply_farm table and also a foreign key that references either the doc_internal or the doc_external table, depending. The joins magically get the right data together.

It takes a little programming to set this up, but it's well worth it.

For more details I suggest you google "generalization specialization relational modeling". There are some excellent articles on this subject out there on the web.

Walter Mitty
@Walter: "I think that what you are really doing is relational modeling, even though you are using ER diagrams." I have always used ER as an abbreviation of Entity Relationship - ie. it *is* relational modelling. What do you use ER to stand for?
Mark Bannister
same for me, ER=relational.
vulkanino
The ER model developed by Peter Chen in 1976 is intentionally different from the relational model developed by Ed Codd in 1970. You can read about it in Wikipedia and elsewhere. The ER model was intended for conceptual data analysis, not for database design. It was also intended to be agnostic between the relational model and the graph models called hierarchical and network. Many people use ER diagrams today to express a relational model, but that's a deviant use of ER modeling. No big deal, except it can lead to confusion in cases like this question.
Walter Mitty
PS: Yes, ER stands for "Entity Relationship".
Walter Mitty
A: 

If I have understood this correctly, then supply farm corresponds to either 0 or 1 documents, which is always either an internal or external document (never both).

If so, then why not just use a single table, like so:

**SUPPLY_FARM_DOC**
ID Int (PK)
DOC_ID Int
INTERNAL_FLAG Boolean
DESCRIPTION Varchar(40)
SOME_DATA Varchar(40)
OTHER_DATA Varchar(40)
etc.
Mark Bannister
The point is that I wanted to avoid flags of any sort and data redundancy.
vulkanino
Unless you have massively differing internal and external document data structures, data redundancy is likely to be minimal. What do you have against flags?
Mark Bannister