I have taken over a database that stores fitness information, and we have been debating whether a certain table should stay as one table or be broken up into three tables.

Today there is one table, called workouts, that has the following fields:

id, exercise_id, reps, weight, date, person_id

So if I did 3 sets of 2 different exercises on one day, I would have 6 records in that table for that day. For example:

id, exercise_id, reps, weight, date, person_id
1, 1, 10, 100, 1/1/2010, 10
2, 1, 10, 100, 1/1/2010, 10
3, 1, 10, 100, 1/1/2010, 10
4, 2, 10, 100, 1/1/2010, 10
5, 2, 10, 100, 1/1/2010, 10
6, 2, 10, 100, 1/1/2010, 10

So the question is: given that there is some redundant data (date, person_id, exercise_id) in multiple records, should this be normalized into three tables?

WorkoutSummary:
- id
- date
- person_id

WorkoutExercise:
- id
- workout_id (foreign key into WorkoutSummary)
- exercise_id

WorkoutSets:
- id
- workout_exercise_id (foreign key into WorkoutExercise)
- reps
- weight

I would guess the downside is that queries would be slower after this refactoring, since we would now need to join three tables to do the same query that needed no joins before. The benefit is that the refactoring allows us to add new fields at the workout summary level or the exercise level in the future without adding more duplication.
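
To make the proposal concrete, here is a minimal sketch of the three-table layout using Python's built-in sqlite3 module. The column types, the ISO date string, and the in-memory SQLite database are assumptions for illustration only; the real engine and types may differ. It loads the sample day shown earlier and rebuilds the original flat rows with a three-table join:

```python
import sqlite3

# In-memory SQLite database (an assumption; the real engine is not stated).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE WorkoutSummary (
    id        INTEGER PRIMARY KEY,
    date      TEXT    NOT NULL,   -- assumed ISO date string
    person_id INTEGER NOT NULL
);
CREATE TABLE WorkoutExercise (
    id          INTEGER PRIMARY KEY,
    workout_id  INTEGER NOT NULL REFERENCES WorkoutSummary(id),
    exercise_id INTEGER NOT NULL
);
CREATE TABLE WorkoutSets (
    id                  INTEGER PRIMARY KEY,
    workout_exercise_id INTEGER NOT NULL REFERENCES WorkoutExercise(id),
    reps                INTEGER NOT NULL,
    weight              INTEGER NOT NULL
);
""")

# One workout on 1/1/2010 for person 10 ...
conn.execute(
    "INSERT INTO WorkoutSummary (id, date, person_id) VALUES (1, '2010-01-01', 10)"
)
# ... covering exercises 1 and 2 ...
conn.executemany(
    "INSERT INTO WorkoutExercise (id, workout_id, exercise_id) VALUES (?, 1, ?)",
    [(1, 1), (2, 2)],
)
# ... with three sets of 10 reps at 100 for each exercise.
conn.executemany(
    "INSERT INTO WorkoutSets (workout_exercise_id, reps, weight) VALUES (?, 10, 100)",
    [(1,), (1,), (1,), (2,), (2,), (2,)],
)

# Rebuild the shape of the original workouts table with two joins.
flat_rows = conn.execute("""
    SELECT s.id, e.exercise_id, s.reps, s.weight, w.date, w.person_id
    FROM WorkoutSets s
    JOIN WorkoutExercise e ON e.id = s.workout_exercise_id
    JOIN WorkoutSummary  w ON w.id = e.workout_id
    ORDER BY s.id
""").fetchall()

for row in flat_rows:
    print(row)
```

The date and person_id are stored once per workout instead of once per set, and the join returns the same six rows the single table holds today.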

Any feedback on this debate?

+2  A: 

The new refactoring seems good, and performance should not be affected much if you have the appropriate indexes on the various tables. (Indexes can be created on all foreign keys.)

So YES, that seems like a perfectly normal refactoring.

astander
+6  A: 

Don't assume that queries will be slower after normalizing. Joins on a small number of tables are very cheap if the tables are indexed properly.

On the other hand, queries on a non-normalized table can easily end up being much slower. For example, in your original schema, simply trying to query the distinct dates on which a workout was done is far more expensive than it would be with the normalized version.
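
As a rough illustration of the distinct-dates example, the sketch below uses Python's sqlite3 module with made-up sample data (two workout days of six sets each; the second date is invented for the example). It assumes one WorkoutSummary row per person per workout. Both queries return the same dates, but the flat table has to scan and de-duplicate one row per set, while the normalized table holds one row per workout:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Original single-table layout.
CREATE TABLE workouts (
    id INTEGER PRIMARY KEY, exercise_id INTEGER, reps INTEGER,
    weight INTEGER, date TEXT, person_id INTEGER
);
-- Top-level table from the proposed normalized layout.
CREATE TABLE WorkoutSummary (
    id INTEGER PRIMARY KEY, date TEXT, person_id INTEGER
);
""")

# Illustrative data: two workout days, six sets per day, person 10.
for day in ("2010-01-01", "2010-01-03"):
    conn.execute(
        "INSERT INTO WorkoutSummary (date, person_id) VALUES (?, 10)", (day,)
    )
    conn.executemany(
        "INSERT INTO workouts (exercise_id, reps, weight, date, person_id) "
        "VALUES (?, 10, 100, ?, 10)",
        [(exercise, day) for exercise in (1, 1, 1, 2, 2, 2)],
    )

# Flat schema: every set row is read, then duplicates are collapsed.
flat_dates = conn.execute(
    "SELECT DISTINCT date FROM workouts WHERE person_id = ? ORDER BY date",
    (10,),
).fetchall()

# Normalized schema: one row per workout, so no DISTINCT is needed.
normalized_dates = conn.execute(
    "SELECT date FROM WorkoutSummary WHERE person_id = ? ORDER BY date",
    (10,),
).fetchall()

flat_row_count = conn.execute("SELECT COUNT(*) FROM workouts").fetchone()[0]
summary_row_count = conn.execute(
    "SELECT COUNT(*) FROM WorkoutSummary"
).fetchone()[0]
print(flat_dates, normalized_dates, flat_row_count, summary_row_count)
```

The gap between the two row counts grows with the number of sets per workout, which is where the extra cost on the flat table comes from.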

Definitely normalize it at this point. If you run into performance issues later on, then you can start selectively denormalizing certain parts of the data in addition to the already-normalized schema. But in all likelihood you will never reach that point with a small database.

Aaronaught
@Aaronaught - You say "if the tables are indexed properly". Which fields do you recommend indexing here?
ooo
@ooo: You should almost always index the foreign key fields (`workout_id` in `WorkoutExercise` and `workout_exercise_id` in `WorkoutSets`). Depending on the database engine, you would likely want to make some or all of those indexes covering. I'm not sure what that `exercise_id` field is; presumably it's the type of exercise being done? If so, and you plan to have queries based on the exercise type ("has John been keeping up with squats?"), then you'd probably want an index on that too.
Aaronaught
Add an index to anything that will appear in a WHERE clause, as well as to all primary, candidate, and foreign keys.
duffymo
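
The indexing suggestions in the comments above could be sketched roughly as follows, again with Python's sqlite3 module. The index names are hypothetical, the (person_id, date) index is an assumption about which columns show up in WHERE clauses, and the right set of indexes (including any covering indexes) depends on the actual engine and queries:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE WorkoutSummary (
    id        INTEGER PRIMARY KEY,
    date      TEXT    NOT NULL,
    person_id INTEGER NOT NULL
);
CREATE TABLE WorkoutExercise (
    id          INTEGER PRIMARY KEY,
    workout_id  INTEGER NOT NULL REFERENCES WorkoutSummary(id),
    exercise_id INTEGER NOT NULL
);
CREATE TABLE WorkoutSets (
    id                  INTEGER PRIMARY KEY,
    workout_exercise_id INTEGER NOT NULL REFERENCES WorkoutExercise(id),
    reps                INTEGER NOT NULL,
    weight              INTEGER NOT NULL
);

-- Foreign key columns used by the joins (index names are hypothetical).
CREATE INDEX idx_workout_exercise_workout_id
    ON WorkoutExercise (workout_id);
CREATE INDEX idx_workout_sets_workout_exercise_id
    ON WorkoutSets (workout_exercise_id);

-- Columns likely to appear in WHERE clauses (an assumption about usage).
CREATE INDEX idx_workout_exercise_exercise_id
    ON WorkoutExercise (exercise_id);
CREATE INDEX idx_workout_summary_person_date
    ON WorkoutSummary (person_id, date);
""")

# List the indexes that were created.
index_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)
print(index_names)

# Ask SQLite how it would run an exercise-type lookup.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM WorkoutExercise WHERE exercise_id = ?",
    (1,),
).fetchall()
for step in plan:
    print(step)
```

Checking the query plan for the queries you actually run is a reasonable way to confirm whether a given index is being used before adding more.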