Hi There,

I have noticed that when designing a database I tend to shift any repeating sets of data into a separate table. For example, say I had a table of people, with each person living in a state. I would then move these repeating states into a separate table and reference them with foreign keys.

However, what if I am not storing any more data about states? I would then have a table containing only StateID and State. Is this the right approach? State is dependent on the primary key of the users table, so does moving it into its own table help with anything?

Thanks,

+1  A: 

The State table should have no key relationship with the users table; it should only contain data about states.

To keep each table as simple as possible, you may want to keep the user data in a User table and the state data in a State table, and then build a join table that has foreign key relationships to both the User and State tables.

As for which form of normalization that is, I'm not sure.

Jacob Ewald
I see what you are saying, but I think there may be a little misunderstanding. I wouldn't link states to users in the states table. Essentially, I'm saying there would be a Users table, which would include a StateID that points to a state. But if I'm not storing any more info on states, is there a point? It just seems like a way to make queries more complex.
Sergio
I must have been confused by your last question: "State is dependant on the primary key of the users table". I took that to mean you had some relationship linking a state back to a user record. You're right that queries get more complex whenever you normalize the data. There's a definite case of diminishing marginal returns, and it's possible to over-normalize a database. One benefit you'll have if you do move the State data to a separate table is that you'll be able to add more columns to the State table easily if a future enhancement requires it.
Jacob Ewald
A: 

I believe that removing subsets of repeating data within a table and placing them in tables of their own is called for in the process of placing a table in Second Normal Form.

Moving the state abbreviation into a table of its own is how you would normalize your database. It protects your “user” table from update anomalies. Suppose, for example, that the abbreviation “KY” for Kentucky is for some reason changed to “KQ”. With a foreign key in the user table that references the primary key of the states table, you only have to make one update to the states table to correct this entry for all of your users.
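
A minimal sketch of that idea, using Python's built-in sqlite3 module and an in-memory database. The table and column names (states, users, state_id, abbreviation) and the sample rows are made up for illustration and are not from the question.

```python
import sqlite3


def build_normalized_db():
    """Create a hypothetical users/states schema linked by a foreign key."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE states (
            state_id     INTEGER PRIMARY KEY,
            abbreviation TEXT NOT NULL UNIQUE
        );
        CREATE TABLE users (
            user_id  INTEGER PRIMARY KEY,
            name     TEXT NOT NULL,
            state_id INTEGER NOT NULL REFERENCES states(state_id)
        );
        INSERT INTO states (state_id, abbreviation) VALUES (1, 'KY'), (2, 'OH');
        INSERT INTO users (name, state_id) VALUES ('Ann', 1), ('Bob', 1), ('Cy', 2);
    """)
    return conn


def users_with_state(conn):
    """Return (name, abbreviation) pairs; reading the state requires a join."""
    return conn.execute("""
        SELECT u.name, s.abbreviation
        FROM users AS u
        JOIN states AS s ON s.state_id = u.state_id
        ORDER BY u.user_id
    """).fetchall()


normalized = build_normalized_db()

# The abbreviation is stored once, so the correction is a single-row update.
cursor = normalized.execute(
    "UPDATE states SET abbreviation = 'KQ' WHERE abbreviation = 'KY'"
)
rows_changed = cursor.rowcount
after_update = users_with_state(normalized)
```

Here rows_changed counts the rows touched in states, while after_update shows that every user who referenced that state picks up the new abbreviation through the join.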

That being said, it seems quite obvious that state abbreviations do not change often. So if you know for a fact that your database will never need to store more information about a state, then it is logical and fundamentally sound to leave the state field in the user table. Denormalization of this kind is common. It will increase the readability of the data in your user table and remove the overhead of doing the join. It is, however, a matter of preference.
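
For comparison, a rough sketch of the denormalized layout, again with Python's built-in sqlite3 module. The users table, its columns, and the sample rows are hypothetical and chosen only to illustrate the trade-off.

```python
import sqlite3


def build_denormalized_db():
    """Create a hypothetical users table that stores the state inline."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (
            user_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL,
            state   TEXT NOT NULL
        );
        INSERT INTO users (name, state) VALUES ('Ann', 'KY'), ('Bob', 'KY'), ('Cy', 'OH');
    """)
    return conn


flat = build_denormalized_db()

# Reading is a plain single-table query with no join.
flat_rows = flat.execute(
    "SELECT name, state FROM users ORDER BY user_id"
).fetchall()

# The cost: changing an abbreviation has to touch every user row that holds it.
flat_cursor = flat.execute("UPDATE users SET state = 'KQ' WHERE state = 'KY'")
flat_rows_changed = flat_cursor.rowcount
```

With this layout the query is simpler, but flat_rows_changed grows with the number of users in the affected state rather than staying at one row.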

Gnatz