I have Person, SpecialPerson, and User. Person and SpecialPerson are just people - they don't have a user name or password on a site, but they are stored in a database for record keeping. User has all of the same data as Person and potentially SpecialPerson, along with a user name and password as they are registered with the site.

How would you address this problem? Would you have a Person table which stores all data common to a person and use a key to look up their data in SpecialPerson (if they are a special person) and User (if they are a user) and vice-versa?


Personally, I would store all of these different user classes in a single table. You can then either have a field which stores a 'Type' value, or you can imply what type of person you're dealing with by what fields are filled in. For example, if UserID is NULL, then this record isn't a User.

You could link out to other tables using a one to one-or-none type of join, but then in every query you'll be adding extra joins.

The first method is also supported by LINQ-to-SQL if you decide to go down that route (they call it 'Table Per Hierarchy' or 'TPH').
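
A minimal sketch of the single-table idea, assuming SQLite via Python's sqlite3 module. The column names (PersonType, SpecialNote, UserName, PasswordHash) are made up for illustration and do not come from the question. It shows both ways of telling users apart: an explicit type column and an implied "user fields are filled in" test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Person (
        PersonID     INTEGER PRIMARY KEY,
        PersonType   TEXT NOT NULL,   -- 'Person', 'SpecialPerson' or 'User'
        Name         TEXT NOT NULL,
        SpecialNote  TEXT,            -- hypothetical SpecialPerson-only field
        UserName     TEXT,            -- NULL unless the row is a User
        PasswordHash TEXT
    )
""")
conn.executemany(
    "INSERT INTO Person (PersonType, Name, SpecialNote, UserName, PasswordHash) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("Person", "Ann", None, None, None),
        ("SpecialPerson", "Bob", "VIP", None, None),
        ("User", "Cat", None, "cat42", "hash"),
    ],
)

# Option 1: ask the explicit discriminator column.
users_by_type = [
    row[0]
    for row in conn.execute(
        "SELECT Name FROM Person WHERE PersonType = 'User' ORDER BY Name"
    )
]

# Option 2: infer the type from which fields are filled in.
users_by_null = [
    row[0]
    for row in conn.execute(
        "SELECT Name FROM Person WHERE UserName IS NOT NULL ORDER BY Name"
    )
]
```

Both queries hit one table with no joins, which is the main attraction of this layout.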

Chris Roberts

In the past I've done it exactly as you suggest -- have a Person table for common stuff, then SpecialPerson linked for the derived class. However, I'm re-thinking that, as Linq2Sql wants to have a field in the same table indicate the difference. I haven't looked at the entity model too much, though -- pretty sure that allows the other method.

+1  A: 

Yes, I would also consider a TypeID along with a PersonType table if it is possible there will be more types. However, if there are only three, that shouldn't be necessary.
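
A rough sketch of the TypeID plus PersonType lookup table, again assuming SQLite through Python's sqlite3. The table contents and the extra 'Guest' type are hypothetical. Adding a new type becomes an INSERT rather than a schema change, and the foreign key rejects unknown types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript("""
    CREATE TABLE PersonType (
        TypeID INTEGER PRIMARY KEY,
        Name   TEXT NOT NULL UNIQUE
    );
    CREATE TABLE Person (
        PersonID INTEGER PRIMARY KEY,
        TypeID   INTEGER NOT NULL REFERENCES PersonType(TypeID),
        Name     TEXT NOT NULL
    );
    INSERT INTO PersonType (TypeID, Name)
        VALUES (1, 'Person'), (2, 'SpecialPerson'), (3, 'User');
    INSERT INTO Person (TypeID, Name) VALUES (1, 'Ann'), (3, 'Cat');
""")

# A new kind of person is just a new row in the lookup table.
conn.execute("INSERT INTO PersonType (TypeID, Name) VALUES (4, 'Guest')")
type_names = [
    row[0] for row in conn.execute("SELECT Name FROM PersonType ORDER BY TypeID")
]

# A person with a TypeID that is not in the lookup table is rejected.
try:
    conn.execute("INSERT INTO Person (TypeID, Name) VALUES (99, 'Dan')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```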

Sara Chipps
+2  A: 

If the User, Person and Special person all have the same foreign keys, then I would have a single table. Add a column called Type which is constrained to be User, Person or Special Person. Then based on the value of Type have constraints on the other optional columns.

For the object code it doesn't make much difference whether you have a single table or multiple tables to represent polymorphism. However, if you have to do SQL against the database, it's much easier if the polymorphism is captured in a single table, provided the foreign keys for the subtypes are the same.
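
One way the constrained Type column and the dependent constraints on the optional columns could look, sketched with SQLite CHECK constraints through Python's sqlite3. The UserName and PasswordHash columns and the try_insert helper are assumptions for the example; other databases spell CHECK constraints slightly differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Person (
        PersonID     INTEGER PRIMARY KEY,
        Type         TEXT NOT NULL
                     CHECK (Type IN ('User', 'Person', 'Special Person')),
        Name         TEXT NOT NULL,
        UserName     TEXT,
        PasswordHash TEXT,
        -- Credentials are required for users and forbidden for everyone else.
        CHECK (
            (Type = 'User' AND UserName IS NOT NULL AND PasswordHash IS NOT NULL)
            OR
            (Type <> 'User' AND UserName IS NULL AND PasswordHash IS NULL)
        )
    )
""")


def try_insert(row):
    """Hypothetical helper: return True if the row satisfies the constraints."""
    try:
        conn.execute(
            "INSERT INTO Person (Type, Name, UserName, PasswordHash) "
            "VALUES (?, ?, ?, ?)",
            row,
        )
        return True
    except sqlite3.IntegrityError:
        return False


ok_user = try_insert(("User", "Cat", "cat42", "hash"))
ok_person = try_insert(("Person", "Ann", None, None))
user_without_login = try_insert(("User", "Dan", None, None))
person_with_login = try_insert(("Person", "Eve", "eve", "hash"))
unknown_type = try_insert(("Robot", "R2", None, None))
```

The first two inserts succeed; the last three violate a constraint and are rejected by the database rather than by application code.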

Mat Roberts
+1  A: 

I'd say that, depending on what differentiates Person and Special Person, you probably don't want polymorphism for this task.

I'd create a User table and a Person table that has a nullable foreign key field to User (i.e., the Person can be a User, but does not have to be).
Then I would make a SpecialPerson table which relates to the Person table with any extra fields in it. If a record is present in SpecialPerson for a given Person.ID, he/she/it is a special person.
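
A sketch of that layout, assuming SQLite through Python's sqlite3. The table is called SiteUser here only because USER is a reserved word in several databases, and the Note, UserName and PasswordHash columns are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript("""
    CREATE TABLE SiteUser (
        ID           INTEGER PRIMARY KEY,
        UserName     TEXT NOT NULL UNIQUE,
        PasswordHash TEXT NOT NULL
    );
    CREATE TABLE Person (
        ID     INTEGER PRIMARY KEY,
        Name   TEXT NOT NULL,
        UserID INTEGER UNIQUE REFERENCES SiteUser(ID)  -- NULL: not a user
    );
    CREATE TABLE SpecialPerson (
        PersonID INTEGER PRIMARY KEY REFERENCES Person(ID),
        Note     TEXT                                  -- extra fields go here
    );
    INSERT INTO SiteUser (ID, UserName, PasswordHash) VALUES (1, 'cat42', 'hash');
    INSERT INTO Person (ID, Name, UserID)
        VALUES (1, 'Ann', NULL), (2, 'Bob', NULL), (3, 'Cat', 1);
    INSERT INTO SpecialPerson (PersonID, Note) VALUES (2, 'VIP'), (3, 'VIP');
""")

# A person is a user if UserID is set, and special if a SpecialPerson row exists.
rows = conn.execute("""
    SELECT p.Name,
           p.UserID IS NOT NULL    AS IsUser,
           sp.PersonID IS NOT NULL AS IsSpecial
    FROM Person AS p
    LEFT JOIN SpecialPerson AS sp ON sp.PersonID = p.ID
    ORDER BY p.ID
""").fetchall()
```

Note that in this arrangement a person can be both a user and a special person at the same time, which a single Type column cannot express.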

Lars Mæhlum
+12  A: 

There are generally three ways of mapping object inheritance to database tables.

You can make one big table with all the fields from all the objects, with a special field for the type. This is fast but wastes space, although modern databases save space by not storing empty fields. And if you're only looking for the users in a table that holds every type of person, things can get slow. Not all OR mappers support this.

You can make different tables for all the different child classes, with all of the tables containing the base-class fields. This is OK from a performance perspective, but not from a maintenance perspective: every time your base class changes, all the tables change.

You can also make a table per class like you suggested. This way you need joins to get all the data. So it's less performant. I think it's the cleanest solution.

What you want to use depends, of course, on your situation. None of the solutions is perfect, so you have to weigh the pros and cons.


What I'm going to say here is going to send database architects into conniptions but here goes:

Consider a database view as the equivalent of an interface definition. And a table is the equivalent of a class.

So in your example, all 3 person classes will implement the IPerson interface. So you have 3 tables - one for each of 'User', 'Person' and 'SpecialPerson'.

Then have a view 'PersonView' or whatever that selects the common properties (as defined by your 'interface') from all 3 tables into the single view. Use a 'PersonType' column in this view to store the actual type of the person being stored.

So when you're running a query that can be operated on any type of person, just query the PersonView view.
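
As a sketch of the view-as-interface idea, assuming SQLite through Python's sqlite3: three independent tables, and a PersonView that exposes only the shared columns plus a computed PersonType. The Phone, Note, UserName and PasswordHash columns are made up, and the user table is named SiteUser to avoid the reserved word USER in other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table per concrete class; each repeats the common columns.
    CREATE TABLE Person        (Name TEXT NOT NULL, Phone TEXT);
    CREATE TABLE SpecialPerson (Name TEXT NOT NULL, Phone TEXT, Note TEXT);
    CREATE TABLE SiteUser      (Name TEXT NOT NULL, Phone TEXT,
                                UserName TEXT NOT NULL,
                                PasswordHash TEXT NOT NULL);

    -- The 'interface': only the common columns, tagged with the source type.
    CREATE VIEW PersonView AS
        SELECT 'Person' AS PersonType, Name, Phone FROM Person
        UNION ALL
        SELECT 'SpecialPerson', Name, Phone FROM SpecialPerson
        UNION ALL
        SELECT 'User', Name, Phone FROM SiteUser;

    INSERT INTO Person        VALUES ('Ann', '555-0100');
    INSERT INTO SpecialPerson VALUES ('Bob', '555-0101', 'VIP');
    INSERT INTO SiteUser      VALUES ('Cat', '555-0102', 'cat42', 'hash');
""")

# Queries that apply to any kind of person go through the view.
everyone = conn.execute(
    "SELECT PersonType, Name FROM PersonView ORDER BY Name"
).fetchall()
```

Every query against the view touches all three tables, which is where the performance concern raised in the comments comes from.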


@statictype - eeeeeeeerg! I guess if you have a low traffic site that is ok.

@Thomas - How many fields are uncommon to the three types?

Sara Chipps
those are comments not a response
+2  A: 

There are three basic strategies for handling inheritance in a relational database, and a number of more complex/bespoke alternatives depending on your exact needs.

  • Table per class hierarchy. One table for the whole hierarchy.
  • Table per subclass. A separate table is created for every sub class with a 0-1 association between the subclassed tables.
  • Table per concrete class. A single table is created for every concrete class.

Each of these approaches raises its own issues about normalization, data access code, and data storage, although my personal preference is to use table per subclass unless there's a specific performance or structural reason to go with one of the alternatives.

+1  A: 

At the risk of being an 'architecture astronaut' here, I would be more inclined to go with separate tables for the subclasses. Have the primary key of the subclass tables also be a foreign key linking back to the supertype.

The main reason for doing it this way is that it then becomes much more logically consistent and you do not end up with a lot of fields that are NULL and nonsensical for that particular record. This method also makes it much easier to add extra fields to the subtypes as you iterate your design process.

This does add the downside of adding JOINs to your queries, which can impact performance, but I almost always go with an ideal design first, and then look to optimise later if it proves to be necessary. The few times I have gone the 'optimal' way first I have almost always regretted it later.

So my design would be something like

PERSON (personid, name, address, phone, ...)

SPECIALPERSON (personid REFERENCES PERSON(personid), extra fields...)

USER (personid REFERENCES PERSON(personid), username, encryptedpassword, extra fields...)

You could also create VIEWs later on that aggregate the supertype and the subtype, if that is necessary.

The one flaw in this approach is if you find yourself heavily searching for the subtypes associated with a particular supertype. There is no easy answer to this off the top of my head; you could track it programmatically if necessary, or else run some global queries and cache the results. It will really depend on the application.
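
The PERSON / SPECIALPERSON / USER layout above, an aggregating view, and one possible way of asking which subtypes a given person has, might be sketched like this. It assumes SQLite through Python's sqlite3; the note column, the siteuser table name (USER is reserved in several databases), and the subtypes_of helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript("""
    CREATE TABLE person (
        personid INTEGER PRIMARY KEY,
        name     TEXT NOT NULL
    );
    -- In each subtype table the primary key is also the FK to the supertype.
    CREATE TABLE specialperson (
        personid INTEGER PRIMARY KEY REFERENCES person(personid),
        note     TEXT
    );
    CREATE TABLE siteuser (
        personid          INTEGER PRIMARY KEY REFERENCES person(personid),
        username          TEXT NOT NULL UNIQUE,
        encryptedpassword TEXT NOT NULL
    );
    -- A view that aggregates the supertype with one subtype.
    CREATE VIEW userview AS
        SELECT p.personid, p.name, u.username
        FROM person AS p
        JOIN siteuser AS u ON u.personid = p.personid;

    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cat');
    INSERT INTO specialperson VALUES (2, 'VIP');
    INSERT INTO siteuser VALUES (3, 'cat42', 'hash');
""")


def subtypes_of(personid):
    """Return the names of the subtype tables that hold a row for this person."""
    row = conn.execute(
        """
        SELECT EXISTS (SELECT 1 FROM specialperson WHERE personid = :id),
               EXISTS (SELECT 1 FROM siteuser      WHERE personid = :id)
        """,
        {"id": personid},
    ).fetchone()
    return [name for name, present in zip(("SpecialPerson", "User"), row) if present]


users = conn.execute("SELECT personid, name, username FROM userview").fetchall()
```

The lookup costs one primary-key probe per subtype table, so it stays cheap with a handful of subtypes but grows with every subtype added.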

+1  A: 

This might not be what the OP meant to ask, but I thought I might throw this in here.

I recently had a unique case of db polymorphism in a project. We had between 60 and 120 possible classes, each with its own set of 30 to 40 unique attributes, and about 10 to 12 common attributes on all the classes. We decided to go the SQL-XML route and ended up with a single table. Something like:

PERSON (personid, persontype, name, address, phone, XMLOtherProperties)

containing all common properties as columns and then a big XML property bag. The ORM layer was then responsible for reading/writing the respective properties from the XMLOtherProperties. A bit like:

 public string StrangeProperty
 {
     get { return XMLPropertyBag["StrangeProperty"]; }
     set { XMLPropertyBag["StrangeProperty"] = value; }
 }

(we ended up mapping the XML column as a Hashtable rather than an XML doc, but you can use whatever suits your DAL best)

It's not going to win any design awards, but it will work if you have a large (or unknown) number of possible classes. And in SQL Server 2005 you can still use XPath in your SQL queries to select rows based on some property that is stored as XML; it's just a small performance penalty to take.
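
For illustration only, here is roughly how the property-bag round trip could look in Python with sqlite3 and xml.etree.ElementTree. This is a stand-in, not the original C# / SQL Server code: the Person class, its method names, and the <props> root element are all assumptions. SQLite has no built-in XML querying, so the XPath-in-SQL part is not reproduced here.

```python
import sqlite3
import xml.etree.ElementTree as ET


class Person:
    """Common attributes are plain fields; the rest live in an XML bag."""

    def __init__(self, name, persontype, xml_other="<props/>"):
        self.name = name
        self.persontype = persontype
        self._bag = ET.fromstring(xml_other)

    def _get(self, key):
        node = self._bag.find(key)
        return None if node is None else node.text

    def _set(self, key, value):
        node = self._bag.find(key)
        if node is None:
            node = ET.SubElement(self._bag, key)
        node.text = value

    @property
    def strange_property(self):
        return self._get("StrangeProperty")

    @strange_property.setter
    def strange_property(self, value):
        self._set("StrangeProperty", value)

    def xml_other(self):
        """Serialise the bag for storage in the XMLOtherProperties column."""
        return ET.tostring(self._bag, encoding="unicode")


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person (
        personid           INTEGER PRIMARY KEY,
        persontype         TEXT NOT NULL,
        name               TEXT NOT NULL,
        xmlotherproperties TEXT NOT NULL
    )
""")

# Write: the subclass-specific value ends up inside the XML column.
ann = Person("Ann", "Oddball")
ann.strange_property = "42"
conn.execute(
    "INSERT INTO person (persontype, name, xmlotherproperties) VALUES (?, ?, ?)",
    (ann.persontype, ann.name, ann.xml_other()),
)

# Read: rebuild the object and pull the property back out of the bag.
persontype, name, xml_other = conn.execute(
    "SELECT persontype, name, xmlotherproperties FROM person"
).fetchone()
loaded = Person(name, persontype, xml_other)
```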