views:

75

answers:

3

What is the best way to emulate Tagged union in databases? I'm talking about something like this:

create table t1 {
  vehicle_id INTEGER NOT NULL REFERENCES car(id) OR motor(id) -- not valid
  ...
}

where vehicle_id would be id in car table OR motor table, and it would know which.

(assume that motor and car tables have nothing in common0

+3  A: 
CREATE TABLE vehicle (type INT NOT NULL, id INT NOT NULL,
             PRIMARY KEY (type, id)
)

CREATE TABLE car (type INT NOT NULL DEFAULT 1, id INT NOT NULL PRIMARY KEY,
             CHECK(type = 1),
             FOREIGN KEY (type, id) REFERENCES vehicle
)

CREATE TABLE motorcycle (type INT NOT NULL DEFAULT 2, id INT NOT NULL PRIMARY KEY,
             CHECK(type = 2),
             FOREIGN KEY (type, id) REFERENCES vehicle
)

CREATE TABLE t1 (
  ...
  vehicle_type INT NOT NULL,
  vehicle_id INT NOT NULL,
  FOREIGN KEY (vehicle_type, vehicle_id) REFERENCES vehicle
  ...
)
Quassnoi
OMG Ponies
`@OMG Ponies:` With this layout, you don't need to reference `vehicle` at all. You can just join with `cars` or `motorcycles`, depending on the `type`. `vehicle` here serves only to police the relationships.
Quassnoi
+3  A: 

Some people use a design called Polymorphic Associations to do this, allowing vehicle_id to contain a value that exists either in car or motor tables. Then add a vehicle_type that names the table which the given row in t1 references.

The trouble is that you can't declare a real SQL foreign key constraint if you do this. There's no support in SQL for a foreign key that has multiple reference targets. There are other problems, too, but the lack of referential integrity is already a deal-breaker.

A better design is to borrow a concept from OO design of a common supertype of both car and motor:

CREATE TABLE Identifiable (
 id SERIAL PRIMARY KEY
);

Then make t1 reference this super-type table:

CREATE TABLE t1 (
  vehicle_id INTEGER NOT NULL,
  FOREIGN KEY (vehicle_id) REFERENCES identifiable(id)
  ...
);

And also make the sub-types reference their parent supertype. Note that the primary key of the sub-types is not auto-incrementing. The parent supertype takes care of allocating a new id value, and the children only reference that value.

CREATE TABLE car (
  id INTEGER NOT NULL,
  FOREIGN KEY (id) REFERENCES identifiable(id)
  ...
);

CREATE TABLE motor (
  id INTEGER NOT NULL,
  FOREIGN KEY (id) REFERENCES identifiable(id)
  ...
);

Now you can have true referential integrity, but also support multiple subtype tables with their own attributes.


The answer by @Quassnoi also shows a method to enforce disjoint subtypes. That is, you want to prevent both car and motor from referencing the same row in their parent supertype table. When I do this, I use a single-column primary key for Identifiable.id but also declare a UNIQUE key over Identifiable.(id, type). The foreign keys in car and motor can reference the two-column unique key instead of the primary key.

Bill Karwin
The surrogate key for `identifiable` is only good when there are attributes in `identifiable` that the query needs to select. If `identifiable` only serves to enforce the constraints, using a composite key will allow to get rid of it in the queries competely.
Quassnoi
+2  A: 

I think you could model such a reference by using table inheritance in PostgreSQL.

If you really need to know where a row comes from in a Query, you could use a simple UNION ALL statment like (this possibility has nothing to do with table inheritance):

SELECT car.*, 'car' table_name
UNION ALL
SELECT motor.*, 'motor' table_name
Karl Bartel