ansaurus

Question

How can you represent inheritance in a database?

Answer 1

A:

Check out the answer I gave here:

http://stackoverflow.com/questions/3571368/fluent-nhibernate-one-to-one-mapping-with-synthetic-keys/3571432#3571432

Zoidberg 2010-08-26 20:13:20

Answer 2

+3 A:

The 3rd option is to create a "Policy" table, then a "SectionsMain" table that stores all of the fields that are in common across the types of sections. Then create other tables for each type of section that only contain the fields that are not in common.

Deciding which is best depends mostly on how many fields you have and how you want to write your SQL. They would all work. If you have just a few fields then I would probably go with #1. With "lots" of fields I would lean towards #2 or #3.

David 2010-08-26 20:15:48

+1: 3rd option is the closest to the inheritance model, and most normalized IMO

RedFilter 2010-08-26 20:29:39

Your option #3 is really just what I meant by option #2. There are many fields, and some Sections would have child entities too.

Steve Jones 2010-08-26 20:58:15

Answer 3

A:

I lean towards method #1 (a unified Section table), for the sake of efficiently retrieving entire policies with all their sections (which I assume your system will be doing a lot).

Further, I don't know what version of SQL Server you're using, but in 2008+ sparse columns reduce the storage cost of columns in which most of the values will be NULL.

Ultimately, you'll have to decide just how "similar" the policy sections are. Unless they differ substantially, I think a more-normalized solution might be more trouble than it's worth... but only you can make that call. :)

djacobson 2010-08-26 20:22:05

There will be way too much information to present the whole Policy in one go, so it'd never be necessary to retrieve the whole record. I think it is 2005, although I have used 2008's sparse columns in other projects.

Steve Jones 2010-08-26 20:59:55

Answer 4

+2 A:

With the information provided, I'd model the database to have the following:

POLICIES

POLICY_ID (primary key)

LIABILITIES

LIABILITY_ID (primary key)
POLICY_ID (foreign key)

PROPERTIES

PROPERTY_ID (primary key)
POLICY_ID (foreign key)

...and so on, because I'd expect there to be different attributes associated with each section of the policy. Otherwise, there could be a single SECTIONS table and in addition to the policy_id, there'd be a section_type_code...

Either way, this would allow you to support optional sections per policy...
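As a rough illustration of the "optional sections" point, here is a minimal sketch of that layout. It uses Python's built-in sqlite3 module and an in-memory SQLite database purely so it can be run as-is; that is an assumption on my part, since the question is about SQL Server, and the sample rows are made up. It creates the three tables and counts which sections each policy actually has:

```python
import sqlite3

# In-memory SQLite database, used only so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE policies (
        policy_id INTEGER PRIMARY KEY
    );
    CREATE TABLE liabilities (
        liability_id INTEGER PRIMARY KEY,
        policy_id    INTEGER NOT NULL REFERENCES policies (policy_id)
    );
    CREATE TABLE properties (
        property_id INTEGER PRIMARY KEY,
        policy_id   INTEGER NOT NULL REFERENCES policies (policy_id)
    );

    -- Hypothetical sample data: policy 1 has both sections, policy 2 only one.
    INSERT INTO policies (policy_id) VALUES (1), (2);
    INSERT INTO liabilities (policy_id) VALUES (1);
    INSERT INTO properties (policy_id) VALUES (1), (2);
""")

# LEFT JOINs keep policies that lack a given section.
sections = conn.execute("""
    SELECT p.policy_id,
           COUNT(DISTINCT l.liability_id) AS liability_sections,
           COUNT(DISTINCT pr.property_id) AS property_sections
    FROM policies AS p
    LEFT JOIN liabilities AS l  ON l.policy_id  = p.policy_id
    LEFT JOIN properties  AS pr ON pr.policy_id = p.policy_id
    GROUP BY p.policy_id
    ORDER BY p.policy_id
""").fetchall()

for row in sections:
    print(row)
```

A policy without a liability section simply has no row in liabilities, so nothing has to be stored as NULL to mark the section as absent.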

I don't understand what you find unsatisfactory about this approach - this is how you store data while maintaining referential integrity and not duplicating data. The term is "normalized"...

Because SQL is set-based, it's rather alien to procedural/OO programming concepts and requires code to transition from one realm to the other. ORMs are often considered, but they don't work well in high volume, complex systems.

OMG Ponies 2010-08-26 20:22:50

Yeah, I get the normalisation thing ;-) For such a complex structure, with some sections being simple and some having their own complex sub-structure, it seems unlikely that an ORM would work, although it would be nice.

Steve Jones 2010-08-26 21:02:59

Answer 5

+11 A:

@Bill Karwin describes three inheritance models in his SQL Antipatterns book, when proposing solutions to the SQL Entity-Attribute-Value antipattern. This is a brief overview:

Single Table Inheritance:

Using a single table as in your first option is probably the simplest design. As you mentioned, many attributes that are subtype-specific will have to be given a NULL value on rows where these attributes do not apply. With this model, you would have one policies table, which would look something like this:

+------+---------------------+----------+----------------+------------------+
| id   | date_issued         | type     | vehicle_reg_no | property_address |
+------+---------------------+----------+----------------+------------------+
|    1 | 2010-08-20 12:00:00 | MOTOR    | 01-A-04004     | NULL             |
|    2 | 2010-08-20 13:00:00 | MOTOR    | 02-B-01010     | NULL             |
|    3 | 2010-08-20 14:00:00 | PROPERTY | NULL           | Oxford Street    |
|    4 | 2010-08-20 15:00:00 | MOTOR    | 03-C-02020     | NULL             |
+------+---------------------+----------+----------------+------------------+

\------ COMMON FIELDS -------/          \----- SUBTYPE SPECIFIC FIELDS -----/

Keeping the design simple is a plus, but the main problems with this approach are the following:

When it comes to adding new subtypes, you would have to alter the table to accommodate the attributes that describe these new objects. This can quickly become problematic when you have many subtypes, or if you plan to add subtypes on a regular basis.
The database will not be able to enforce which attributes apply and which don't, since there is no metadata to define which attributes belong to which subtypes.
You also cannot enforce NOT NULL on attributes of a subtype that should be mandatory. You would have to handle this in your application, which in general is not ideal.
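To make the last two points concrete, here is a small sketch of the single table design. It runs against an in-memory SQLite database through Python's sqlite3 module, which is my assumption for the sake of a runnable example (the question itself concerns SQL Server), and the column types are simplified. The second insert is deliberately wrong, yet the schema accepts it:

```python
import sqlite3

# In-memory SQLite database, used only so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE policies (
        id               INTEGER PRIMARY KEY,
        date_issued      TEXT NOT NULL,
        type             TEXT NOT NULL,   -- 'MOTOR' or 'PROPERTY'
        vehicle_reg_no   TEXT,            -- only meaningful for MOTOR
        property_address TEXT             -- only meaningful for PROPERTY
    )
""")

# A well-formed motor policy.
conn.execute(
    "INSERT INTO policies (date_issued, type, vehicle_reg_no) VALUES (?, ?, ?)",
    ("2010-08-20 12:00:00", "MOTOR", "01-A-04004"),
)

# A malformed property policy: no address, plus a vehicle registration
# that should not apply. Nothing in the table definition rejects it.
conn.execute(
    "INSERT INTO policies (date_issued, type, vehicle_reg_no) VALUES (?, ?, ?)",
    ("2010-08-20 14:00:00", "PROPERTY", "02-B-01010"),
)

# Finding such rows is left to the application (or to a cleanup query).
bad_rows = conn.execute(
    "SELECT id FROM policies "
    "WHERE type = 'PROPERTY' AND property_address IS NULL"
).fetchall()
print(bad_rows)
```

The subtype columns have to stay nullable because every other subtype leaves them empty, which is exactly why a plain NOT NULL cannot be used on them.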

Concrete Table Inheritance:

Another approach to tackle inheritance is to create a new table for each subtype, repeating all the common attributes in each table. For example:

--// Table: policies_motor
+------+---------------------+----------------+
| id   | date_issued         | vehicle_reg_no |
+------+---------------------+----------------+
|    1 | 2010-08-20 12:00:00 | 01-A-04004     |
|    2 | 2010-08-20 13:00:00 | 02-B-01010     |
|    3 | 2010-08-20 15:00:00 | 03-C-02020     |
+------+---------------------+----------------+

--// Table: policies_property    
+------+---------------------+------------------+
| id   | date_issued         | property_address |
+------+---------------------+------------------+
|    1 | 2010-08-20 14:00:00 | Oxford Street    |   
+------+---------------------+------------------+

This design will basically solve the problems identified for the single table method:

Mandatory attributes can now be enforced with NOT NULL.
Adding a new subtype requires adding a new table instead of adding columns to an existing one.
There is also no risk that an inappropriate attribute is set for a particular subtype, such as the vehicle_reg_no field for a property policy.
There is no need for the type attribute as in the single table method. The type is now defined by the metadata: the table name.

However this model also comes with a few disadvantages:

The common attributes are mixed with the subtype specific attributes, and there is no easy way to identify them. The database will not know either.
When defining the tables, you would have to repeat the common attributes for each subtype table. That's definitely not DRY.
Searching for all the policies regardless of the subtype becomes difficult, and would require a bunch of UNIONs.

This is how you would have to query all the policies regardless of the type:

SELECT     date_issued, other_common_fields, 'MOTOR' AS type
FROM       policies_motor
UNION ALL
SELECT     date_issued, other_common_fields, 'PROPERTY' AS type
FROM       policies_property;

Note how adding new subtypes would require the above query to be modified with an additional UNION ALL for each subtype. This can easily lead to bugs in your application if this operation is forgotten.
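For anyone who wants to try this, below is a runnable sketch of the concrete table design. It assumes an in-memory SQLite database driven from Python's sqlite3 module (not the SQL Server setup from the question), and it leaves out the other_common_fields placeholder. It shows both the NOT NULL enforcement and the UNION ALL query:

```python
import sqlite3

# In-memory SQLite database, used only so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE policies_motor (
        id             INTEGER PRIMARY KEY,
        date_issued    TEXT NOT NULL,
        vehicle_reg_no TEXT NOT NULL
    );
    CREATE TABLE policies_property (
        id               INTEGER PRIMARY KEY,
        date_issued      TEXT NOT NULL,
        property_address TEXT NOT NULL
    );
    INSERT INTO policies_motor (date_issued, vehicle_reg_no) VALUES
        ('2010-08-20 12:00:00', '01-A-04004'),
        ('2010-08-20 13:00:00', '02-B-01010');
    INSERT INTO policies_property (date_issued, property_address) VALUES
        ('2010-08-20 14:00:00', 'Oxford Street');
""")

# Mandatory subtype attributes are now enforced by the database itself.
try:
    conn.execute(
        "INSERT INTO policies_property (date_issued) VALUES (?)",
        ("2010-08-21 09:00:00",),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Listing every policy needs one SELECT per subtype table.
all_policies = conn.execute("""
    SELECT date_issued, 'MOTOR' AS type FROM policies_motor
    UNION ALL
    SELECT date_issued, 'PROPERTY' AS type FROM policies_property
    ORDER BY date_issued
""").fetchall()

for row in all_policies:
    print(row)
```

Note that each table numbers its own rows, so the first motor policy and the first property policy both get id 1.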

Class Table Inheritance:

This is the solution that @David mentions in the other answer. You create a single table for your base class, which includes all the common attributes. Then you would create specific tables for each subtype, whose primary key also serves as a foreign key to the base table. Example:

CREATE TABLE policies (
    policy_id         int NOT NULL,
    date_issued       datetime,

    -- other common attributes ...

    PRIMARY KEY (policy_id)
);

CREATE TABLE policy_motor (
    policy_id         int NOT NULL,
    vehicle_reg_no    varchar(20),

    -- other attributes specific to motor insurance ...

    PRIMARY KEY (policy_id),
    FOREIGN KEY (policy_id) REFERENCES policies (policy_id)
);

CREATE TABLE policy_property (
    policy_id         int NOT NULL,
    property_address  varchar(20),

    -- other attributes specific to property insurance ...

    PRIMARY KEY (policy_id),
    FOREIGN KEY (policy_id) REFERENCES policies (policy_id)
);

This solution solves the problems identified in the other two designs:

Mandatory attributes can be enforced with NOT NULL.
Adding a new subtype requires adding a new table instead of adding columns to an existing one.
No risk that an inappropriate attribute is set for a particular subtype.
No need for the type attribute.
Now the common attributes are not mixed with the subtype specific attributes anymore.
We can stay DRY, finally. There is no need to repeat the common attributes for each subtype table when creating the tables.
Managing an auto incrementing id for the policies becomes easier, because this can be handled by the base table, instead of each subtype table generating them independently.
Searching for all the policies regardless of the subtype now becomes very easy: No UNIONs needed - just a SELECT * FROM policies.
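Here is a minimal, runnable sketch of how inserts and queries could look with class table inheritance. It is only an illustration under my own assumptions: it uses an in-memory SQLite database via Python's sqlite3 module rather than SQL Server, and the helper names add_motor_policy and add_property_policy are hypothetical:

```python
import sqlite3

# In-memory SQLite database, used only so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE policies (
        policy_id   INTEGER PRIMARY KEY,
        date_issued TEXT NOT NULL
    );
    CREATE TABLE policy_motor (
        policy_id      INTEGER PRIMARY KEY REFERENCES policies (policy_id),
        vehicle_reg_no TEXT NOT NULL
    );
    CREATE TABLE policy_property (
        policy_id        INTEGER PRIMARY KEY REFERENCES policies (policy_id),
        property_address TEXT NOT NULL
    );
""")


def add_motor_policy(date_issued, vehicle_reg_no):
    """Insert the base row, then the subtype row that shares its key."""
    with conn:  # one transaction: both rows are committed or neither is
        cur = conn.execute(
            "INSERT INTO policies (date_issued) VALUES (?)", (date_issued,)
        )
        conn.execute(
            "INSERT INTO policy_motor (policy_id, vehicle_reg_no) VALUES (?, ?)",
            (cur.lastrowid, vehicle_reg_no),
        )
    return cur.lastrowid


def add_property_policy(date_issued, property_address):
    """Same pattern for the property subtype."""
    with conn:
        cur = conn.execute(
            "INSERT INTO policies (date_issued) VALUES (?)", (date_issued,)
        )
        conn.execute(
            "INSERT INTO policy_property (policy_id, property_address) "
            "VALUES (?, ?)",
            (cur.lastrowid, property_address),
        )
    return cur.lastrowid


motor_id = add_motor_policy("2010-08-20 12:00:00", "01-A-04004")
property_id = add_property_policy("2010-08-20 14:00:00", "Oxford Street")

# All policies, regardless of subtype: no UNION needed.
all_policies = conn.execute(
    "SELECT policy_id, date_issued FROM policies ORDER BY policy_id"
).fetchall()

# Subtype details come from a join on the shared key.
motor_details = conn.execute("""
    SELECT p.policy_id, p.date_issued, m.vehicle_reg_no
    FROM policies AS p
    JOIN policy_motor AS m ON m.policy_id = p.policy_id
""").fetchall()

print(all_policies)
print(motor_details)
```

Wrapping the two inserts in one transaction is a design choice of this sketch: it avoids leaving a base row behind with no matching subtype row if the second insert fails.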

I consider the class table approach the most suitable in most situations.

The names of these three models come from Martin Fowler's book Patterns of Enterprise Application Architecture.

Daniel Vassallo 2010-08-26 20:59:23

+1 Very thorough answer.

Conrad Frix 2010-08-26 21:11:03

Yes, your third option, "Class Table Inheritance", is what I mentioned as my second option, and it is likely to be best in this case, imho. It is the only option that has a sensible chance of modelling a non-trivial structure (e.g. some sections have a huge child-entity structure, while some do not).

Steve Jones 2010-08-29 18:09:01

@Steve: Yes, I agree...

Daniel Vassallo 2010-08-29 18:16:43
