views:

165

answers:

3

I'm working with Ruby on Rails, but this question I think is broader than that and applies to database design generally.

When is it a good idea to split a single model up into multiple tables? For example, assume I have a User model, and the number of fields in the model is really starting to add up. For example, the User can enter his website, his birthday, his time zone, his etc etc.

Is there any advantage or disadvantage to splitting up the model, such that maybe the User table only has basic info like login and email, and then there is another table that every User has that is something like UserInfo, and another that is UserPermissions, and another that is UserPrivacySettings or something like that?

Edit: To add additional gloss on this, most of the fields are rarely accessed, except on pages specific to them. For example, things like birthday are only ever accessed if someone clicks through to a User's profile. Furthermore, some of the fields (which are rarely accessed) have the potential to be extremely large. Most of the fields have the potential to be either set to blank or nil.

+1  A: 

This would be a situation for analysis.

When you find that a lot of the fields in such a table are NULLs, and can be grouped together (eg. UserContactInfo), it is time to look at extracting the information to its own table.

You want to avoid having a table with tens/hundreds of fields with only sparsely entered data.

Rather try to group the data logically, and crete the main table containging the fields that are mostly all populated. Then you can create subsets of data, almost as you would represent them on the UI, (Contact Info, Personal Interest, Work Related Info, etc) into seperate tables.

astander
What are the disadvantages associated with a table with sparsely entered data?
WIlliam Jones
+1  A: 

Retrieving a row is more expensive if it has many columns, especially if you usually need just some of the fields. Also, hosting stuff such as the components of an address in a separate class is a case of DRY. On the other hand, if you do need all fields of an object, it takes longer to execute a compound query.

I would normally not bother to distribute classes over several tables just to make the code more readable (i.e. without actually reusable parts like addresses).

Kilian Foth
+2  A: 

Generally it is a good idea to put things which have a one-to-one relationship in the same table. Unless your userbase includes the Queen or Paddington Bear, a user has just one birthday, so that should be an attribute of the USERS table. Things which have a one-to-many relationship should be in separate tables. So, if a user can have multiple privacy settings by all means split them out.

Splitting one table into several tables can make queries more complicated or slower, if we want to retrieve all the user's information at once. On the other hand if we have a set of attributes which is only ever queried or updated in a discrete fashion then having a separate table to hold that data is a sound idea.

APC