views:

67

answers:

5

For instance, let's say I have a User model. Users have things like logins, passwords, e-mail addresses, avatars, etc. But there are two types of Users that will be using this site, let's say Parents and Businesses. I need to store some different information for the Parents (e.g. childrens' names, domestic partner, salaries, etc.) than for the Businesses (e.g. industry, number of employees, etc.), but also some of it is the same, like logins and passwords.

How do I correctly structure this in a SQL-based database? Thanks!

--UPDATE--

After digging a little bit more, I'm thinking that a polymorphic association might actually make more sense. But I don't know a lot about them, is there a reason why I wouldn't want to use them, or that STI is better? It seems like they're exactly the same except that the fields specific to Parents or Businesses are stored in separate tables, which I think is what I would want. Isn't it?

+3  A: 

I find the following to be the best approach in this situation:

  • create a table to hold both types of Users; this table would have a column for each property that the two types of Users share;
  • create an auxiliary table for each type, with a foreign key in each pointing back to the Users table; these tables would only have the extra properties for each type, plus the foreign key to Users;

This way you would avoid having data sparcity issues in your DB and will only have as many records in the extension tables are there are Users of that type. If you're using a code generation tool, you could generate for each extension table, which will then have a User as it's parent property, allowing you to access all the fields. You might also want to generate for the users table itself, for when you're processing all users regardless of their type.

I hope this makes sense, and that it helps.

Cheers!

Big Endian
+1  A: 

Typically you would have a Users table that has all of the fields that both Parents and Businesses have in common along with a unique ID called something like UserID. Then in the Parents table you would have the UserID (to link back to the Users table) and the fields that are unique to parents. Similarly you would have a Businesses table that has the UserID field and the fields that are unique to businesses. This is how you would setup a normalized database. The UserID fields in the Parents and Businesses tables would be a reference (foreign key) back to the Users table.

TLiebe
+1  A: 

In an object-oriented environment this would be modeled through inheritance. There are several ways to map inheritance into database tables. The most simple is Single Table Inheritance. If you have an object oriented environment consuming the database this is worth looking into.

Update: STI vs. other alternatives

If the number of fields that differ is small, I would go for STI as this is simple to implement and doesn't add the need for extra joins. The two other main alternatives are

  • Table per Class
  • Table per Concrete Class

Table per Class

If there are a lot of fields that are different between the types it is preferrable. The downside is that nearly all queries will require join operations betwen the base class table and one or more sub class tables.

Table per Concrete Class

To remedy the join problem of Table per Class a Table per Concrete Class is another way. However it requires common fields belonging to an abstract base class to be present in several tables. This violates the DRY principle and requires union queries to get the common properties from several concrete types.

Anders Abel
Well, I'm using Rails, and ActiveRecord does do Single Table Inheritance. I was actually looking at it, but the docs on it were a little sparse so I guess I misunderstood that it would work. But that link you gave is essentially what I want to do. I get how I would set up the Models, but I'm not sure I understand how the migration would work. I would just put all the columns for both Parents and Businesses in the Users table? How do I limit access to just Parent-related column in the Parent model?
Brandon Weiss
Yes, you put all columns needed by both types in the same table. To prevent incorrect usage you can add a constraint based on the type column that forces the unused columns to be NULL. If the access to the data is done through ActiveRecord it is up to that layer to only show the correct columns to the higher layers in the system.
Anders Abel
Well, I just spent some time reading up on STI and it looks like that's exactly what I need. Thanks for your (and everyone's) help! I may still go the NoSQL route, but it's good to know there isn't a too-painful way to do it with SQL.
Brandon Weiss
Ah, good to know. The number of fields that differ would actually be a lot. I'm not so worried about the joins because Rails makes it pretty easy. Looks like Polymorphic is the winner! Again, thanks for all your help :)
Brandon Weiss
+1  A: 

Modeling a relationship in the database adds a lot of complexity. For example, an auxilary table (aka a 1:1 relation) requires a join in each query:

select u.name, p.DomesticPartner
from users u
join users_partners p on p.userId = u.id

A dynamic column (1:many to name value pairs) adds one join for EVERY column you query:

select n.value as Name, p.value as DomesticPartner
from users u
join users_columns n on n.name = 'Name' and n.userid = u.id
join users_columns p on p.name = 'DomesticPartner' and p.userid = u.id

You gotta weigh this complexity against the added value the relation provides.

For myself, as a rule of thumb, I avoid database relations unless there is a compelling reason for them. Example compelling reasons:

  • Not knowing the column names at development time
  • Row size would exceed the maximum allowed by the database
  • The relation itself models information (f.e. Customers and Orders)

In your case, I'd just add all the columns to a user table. You can enforce unused columns must be zero using a check constraint:

CHECK (UserType = 'Parent' OR DomesticPartner IS NULL)
Andomar
A: 

Take a look at this question/answer with a similar problem, it describes supertype/subtype relationship.

Damir Sudarevic