views:

227

answers:

5

The database I'm designing has 3 major tables: BOOKS, ARTICLES, NOTES.
Each book or article can have multiple notes, my original design was just like that, which means both notes on books and notes on articles go the 'notes' table. Here are the columns for the NOTES table:

  • note_id
  • note_type
  • note_type_id
  • note_content

NOTE_TYPE can be either 'book' or 'article'; NOTE_TYPE_ID is the FK for a book_id if the note_type is 'book' OR an article id if the note_type is 'article'.

Now I start to wonder if that's the correct(or best normalized) design. An alternative approach is to use 5 tables

books / articles / notes / book_notes / article_notes

This way I can keep book notes and article notes separately, the columns are like

'notes' { note_id, note_content } 'book_notes' { book_id, note_id } 'article_notes' { articel_id, note_id }

Which one is correct or better?

+2  A: 

From a certain perspective it is much better in the long run to use

books / book_notes / articles /article notes

as a design principle for your database.

When you consider backups, data manipulation and data portability over time, having the attributes of a single entity in its own table starts to pay off.

Neither is really "better" in absolute terms, it depends on context. People are used to putting anything in a cupboard that fits, academic database designers tend to create a cupboard per toothbrush.

In your context, you may decide that the extra overhead of sql insert/select/update/delete for 3 notes tables instead of only one is not worth it. In the longer term, if you go with the "1 notes table" design initially and then decide you don't like it, splitting it into 3 is not like rewriting war and peace.

Steve De Caux
@Steve: Optimization isn't the top concern to me now, I'm looking for a more sensible schema, the one that's conceptually 'better'. That was how I meant 'better'. But still since book->notes * article->notes are both 1 to many relations, that's where I'm not quite sure about, since for 1->many relations, one table seems enough to hold it.
Shawn
@Shawn: "Conceptually" can be a bit tricky, as you can see from the other answers you have received. In terms of a 'more sensible schema' - if you mean a schema where the data model is quite evident, then the way to go is probably books, book_notes, articles, article_notes where the ?_notes tables are FKd to the PK of their corresponding entity. I know many people consider this a duplication of tables, but it does make the data model very easy to understand.
Steve De Caux
+5  A: 

Maybe a bit different approach -- supertype/subtype is usually used when you have very specific columns for each subtype, like in Person supertype with Patient and Doctor subtypes. Person holds all data common to people and Patient and Doctor hold very specific columns for each one. In this example your book_notes and article_notes are not really that different.
I would rather consider having a supertype Publication with Book and Article as subtypes. Then you can have just one Note table with FK to Publication. Considering that a PK number in Publication is the same number as the [PK,FK] of Book (Article) you can do joins with notes on Publication, Book or Article. This way you can simply add another publication, like Magazine by adding a new sub-classed table and not changing anything regarding Note.

For example:

TABLE Publication (ID (PK), Title, ...more columns common to any publication)
TABLE Book (ID (PK) = FK to Publication, ISBN, ... more columns specific to books only)
TABLE Article (ID (PK) = FK to Publication, ... more columns specific to articles only)
TABLE Note (ID, PublicationID FK to Publication, NoteText)

Primary key for Book and Article table also serves as a foreign key to the Publication.

Now if we add another publication, Magazine:

TABLE Magazine (ID (PK) = FK to Publication, ... more columns specific to magazines only)

We do not have to modify Note in any way -- and we have added columns specific to magazines only.

alt text

Damir Sudarevic
+1  A: 

NOTE_TYPE can be either 'book' or 'article'; NOTE_TYPE_ID is the FK for a book_id IF the note_type is 'book' OR an article id if the note_type is 'article'.

This relationship is called an arc when represented on a Logical data model.

It's fine if you don't foresee any note duplication. Not just between books, but articles too.

OMG Ponies
A: 

It depends on what you want to do with sub-typing. In your main tables books and articles seem like subtypes of "publications". However, there is no table for "publications". Is that because you don't need to search for publications, or because you didn't think in terms of "generalization-specialization relational modeling"? If you look up this phrase on the web, you'll see some good articles on the subject.

Assuming you don't need a generalized "publications" table, then you probably don't need a generalized "notes" table either. Are you going to be searching for notes where it doesn't matter which kind of publication the note refers to? How long is it going to be before you want to add a third or fourth kind of publication?

All of these have an impact on which design is "conceptually better". If you want something conceptually better, then you are optimizing, whether you realize it or not. You might be optimizing with regard to a different measure of goodness than speed or simplicity.

Walter Mitty
A: 
Cape Cod Gunny