views:

72

answers:

2

Is there official guidance or a threshold to indicate when it is best practice to use a foreign key in a MySQL database?

Suppose you created a table for movies. One way to do it is to integrate the producer and director data into the same table. (movieID, movieName, directorName, producerName).

However, suppose most directors and producers have worked on many movies. Would it be best to create two other tables for producers and directors, and use a foreign key in the movie table?

When does it become best practice to do this? When many of the directors and producers are appearing several times in the column? Or is it best practice to employ a foreign key approach at the start? While it seems more efficient to use a foreign key, it also raises the complexity of the database.

So when does the trade off between complexity and normalization become worth it? I'm not sure if there is a threshold or a certain number of cell repetitions that makes it more sensible to use a foreign key.

I'm thinking about a database that will be used by hundreds of users, many concurrently.

Many thanks!

+7  A: 

there are some official guidelines for this. they're called normal forms, and the practice of putting your database into them is called normalization: http://en.wikipedia.org/wiki/Database_normalization

if you take a db class in college, they'll probably teach you 3nf or bcnf. i've always found those approaches to be a bit heavy-handed, but i have enough experience in db design that i find these questions to be basically intuitive at this point...

in your example, you definitely want to use foreign key constraints. a many-to-one relationship is best expressed that way. it will make selecting movies a bit slower, because you'll have to do a join on the 'people' table and the 'movies' table - possibly many joins depending on how many 'people' fields the movies table has.

but the advantage is that you can easily manage the people themselves. if you want to change the spelling of a person name, you don't have to scan the whole table looking for that person in each field. you can avoid having the same person in the db several times with slight differences in spelling. you can set actions to take if a person is deleted. you can easily count how many different roles a person has had.

don't forget, if you want to use foreign keys, you must make your tables innodb in mysql.

Igor
Good post. Just to be clear however, you can implement foreign keys in MyISAM. They just won't be ***enforced***.
webbiedave
+1  A: 

Suppose you created a table for movies. One way to do it is to integrate the producer and director data into the same table. (movieID, movieName, directorName, producerName).

That's far too unnormalized. You're repeating data.

However, suppose most directors and producers have worked on many movies. Would it be best to create two other tables for producers and directors, and use a foreign key in the movie table?

Also suppose that a person may work on one movie as a producer and the next as a director. A single person could also be credited as director, producer, writer and actor on a single movie!

When does it become best practice to do this? When many of the directors and producers are appearing several times in the column? Or is it best practice to employ a foreign key approach at the start? While it seems more efficient to use a foreign key, it also raises the complexity of the database.

You will want to master foreign keys, relationships (especially 1-to-many and many-to-many), and normal forms right from the start. They will become second nature in no time.

webbiedave