views:

130

answers:

5

I have question about normalization. Suppose I have an applications dealing with songs.

First I thought about doing like this:

Songs Table:
id | song_title | album_id | publisher_id | artist_id

Albums Table:
id | album_title | etc...

Publishers Table:
id | publisher_name | etc...

Artists Tale:
id | artist_name | etc...

Then as I think about normalization stuff. I thought I should get rid of "album_id, publisher_id, and artist_id in songs table and put them in intermediate tables like this.

Table song_album:
song_id, album_id

Table song_publisher
song_id, publisher_id

Table song_artist
song_id, artist_id

Now I can't decide which is the better way. I'm not an expert on database design so If someone would point out the right direction. It would awesome.

Are there any performance issues between two approaches?

Thanks

+2  A: 

That depends. If you can guarantee that a particular song always belongs to one single album, go for your first approach. If not, you have a n-to-n relationship and need a join table: that is your second approach. Both are completely ok in terms of normalization.

It is important that you design your database in a way you can map your data to it.

Dont worry about performance here. Performance depends more on how you optimized your indexes and how your queries look like, than on having to do one more join operation or not (your second approach, the join table, would need one more join in every query).

Philip Daubmeier
A: 

I'd stick with the first one, for two reasons:

  1. A song is only associated with one album, one publisher and one artist, so you don't need to create separate tables for them (if, for example, a song can have more than one artist, then create the song_artist table).
  2. It's more efficient. With the second approach you'll need to make some joins.
Yassin
Just out of curiosity, why -1?
Yassin
A: 

Songs can appear on multiple albums. Think a greatest hits release. Its important to zoom out of the technical muck and consider the real world use of an application (or database).

codingguy3000
Depends on the design. A possible design decision could be to dont let songs appear on more than one album, for some reasons... Maybe the song is remastered on the greatest hit release and you dont want to associate it with the original one. Maybe you want to ignore those very few songs that _really_ appear on different albums? The important thing is: one has to consider those things _before_ using the database, at design time.
Philip Daubmeier
+3  A: 

Forget about performance issues. The question is Does this model represent the data correctly?

The intermediate tables are called "junction tables" and they are useful when you can have a many-to-many relationship. For example, if you store the song "We Are the World" in your database, then you are going to have many artists for that song. Each of those artists are also responsible for creating many other songs. Therefor, to represent the data correctly, you will have to use junction tables, just as you did in the second version.

Jeffrey L Whitledge
+1  A: 

The first structure is mixing up the semantics (e.g. writing the publisher name for each single song). The second structure will allow you to put invalid data in the database (e.g. one song can belong to two albums). Here is what I understood from the problem domain and my suggestions for the design:

One album is published by only one publisher, thus you don't need to specify the publisher in every single song, you just need to put the publisher_ID in the Albums table. Also if you keep the artist_ID in the Songs table, each one of your songs can have only one artist at a time; but by putting the song_ID and artist_ID in a linkage table you can have multiple artists for one song (like the time when 2 singers sing one song together). The publisher_id goes to albums table as each album is published by one publisher. Also for table names it is always advised to use singular form.

Here is my suggested design:

Song Table:
id | song_title | album_id | ...

Album Table:
id | album_title | publisher_id | ...

Publisher Table:
id | publisher_name | ...

Artist Table:
id | artist_name | ...

Song_Artist Table:
song_id | artist_id | artist_role | ...