views:

1123

answers:

6

Let's say I have a table that represents a super class, students. And then I have N tables that represent subclasses of that object (athletes, musicians, etc). How can I express a constraint such that a student must be modeled in one (not more, not less) subclass?

Clarifications regarding comments:

  • This is being maintained manually, not through an ORM package.
  • The project this relates to sits atop SQL Server (but it would be nice to see a generic solution)
  • This may not have been the best example. There are a couple scenarios we can consider regarding subclassing, and I just happened to invent this student/athlete example.

A) In true object-oriented fashion, it's possible that the superclass can exist by itself and need not be modeled in any subclasses.

B) In real life, any object or student can have multiple roles.

C) The particular scenario I was trying to illustrate was requiring that every object be implemented in exactly one subclass. Think of the superclass as an abstract implementation, or just commonalities factored out of otherwise disparate object classes/instances.

Thanks to all for your input, especially Bill.

A: 

Here are a couple of possibilities. One is a CHECK in each table that the student_id does not appear in any of the other sister subtype tables. This is probably expensive and every time you need a new subtype, you need to modify the constraint in all the existing tables.

CREATE TABLE athletes (
  student_id INT NOT NULL PRIMARY KEY,
  FOREIGN KEY (student_id) REFERENCES students(student_id),
  CHECK (student_id NOT IN (SELECT student_id FROM musicians 
                      UNION SELECT student_id FROM slackers 
                      UNION ...)) 
);

Probably a better way is to make the primary key in the students table a compound primary key, the second column denotes a subtype. Then restrict that column in each child table to a single value corresponding to the subtype represented by the table. edit: no need to make the PK a compound key in child tables.

CREATE TABLE athletes (
  student_id INT NOT NULL PRIMARY KEY,
  student_type CHAR(4) NOT NULL CHECK (student_type = 'ATHL'),
  FOREIGN KEY (student_id, student_type) REFERENCES students(student_id, student_type)
);

Of course student_type could just as easily be an integer, I'm just showing it as a char for illustration purposes.

If you don't have support for CHECK constraints (e.g. MySQL), then you can do something similar in a trigger.

edit: I read your followup about making sure a row exists in some subclass table for every row in the superclass table. I don't think there's a practical way to do this with SQL metadata and constraints.

The only option I can suggest to meet this requirement is to use Single-Table Inheritance. Otherwise you need to rely on application code to enforce it.

Bill Karwin
A: 

interesting problem. Of course the FK constraints are there for the subtables so there has to be a student for those.

The main problem is trying to check as it is inserted. The student has to be inserted first so that you don't violate a FK constraint in a subtable so a trigger that does a check wouldn't work.

You could write an app that checks now and then if you are really concerned about this. I think the biggest fear though would be deletions. Someone could delete a subtable entry but not the student. You could have triggers to check when items are deleted from the subtables since that is probably the biggest problem.

I have a db with a table per subclass hierarchy like this as well. I use Hibernate and its mapped properly so it deletes everything automatically. If doing this by 'hand' then I would make sure to always delete the parent with proper cascades hehe :)

Arthur Thomas
A: 

Thanks, Bill. You got me thinking...

The superclass table has a subclass code column. Each of the subclass tables has a foreign key constraint, as well as one that dictates that the id exist with a subset of the superclass table (where code = athlete).

The only missing part here is that it's possible to model a superclass without a subclass. Even if you make the code column mandatory, it could just be an empty join. That can be fixed by adding a constraint that the superclass's ids exist in a union of the ids in the subclass tables. Insertion gets a little hairy with these two constraints if constraints are enforced in the middle of transactions. That or just don't worry about unsubclassed objects.

Edit: Bleh, such a good sounding idea... But impeded by the fact that subqueries that refer to other tables aren't supported. At least not in SQL Server.

Mark Canlas
A: 

That can be fixed by adding a constraint that the superclass's ids exist in a union of the ids in the subclass tables.

Depending on how much intelligence you want to put into your schema (and how much MS SQL Server lets you put there), you wouldn't actually need to do a union of the subclass tables, since you know that, if the id exists in any subclass table, it must exist in the same subclass as the one identified by the subclass code column.

Dave Sherohman
(a) you can't query another table in a constraint in MS SQL Server, and (b) you can't query a table whose name depends on a data value.
Bill Karwin
+2  A: 

Each Student record will have a SubClass column (assume for the sake of argument it's a CHAR(1)). {A = Athlete, M=musician...}

Now create your Athlete and Musician tables. They should also have a SubClass column, but there should be a check constraint hard-coding the value for the type of table they represent. For example, you should put a default of 'A' and a CHECK constraint of 'A' for the SubClass column on the Athlete table.

Link your Musician and Athlete tables to the Student table using a COMPOSITE foreign key of StudentID AND Subclass. And you're done! Go enjoy a nice cup of coffee.

CREATE TABLE Student (
    StudentID INT NOT NULL IDENTITY PRIMARY KEY,
    SubClass CHAR(1) NOT NULL,
    Name VARCHAR(200) NOT NULL,
    CONSTRAINT UQ_Student UNIQUE (StudentID, SubClass)
);

CREATE TABLE Athlete (
    StudentID INT NOT NULL PRIMARY KEY,
    SubClass CHAR(1) NOT NULL,
    Sport VARCHAR(200) NOT NULL,
    CONSTRAINT CHK_Jock CHECK (SubClass = 'A'),
    CONSTRAINT FK_Student_Athlete FOREIGN KEY (StudentID, Subclass) REFERENCES Student(StudentID, Subclass)
);

CREATE TABLE Musician (
    StudentID INT NOT NULL PRIMARY KEY,
    SubClass CHAR(1) NOT NULL,
    Instrument VARCHAR(200) NOT NULL,
    CONSTRAINT CHK_Band_Nerd CHECK (SubClass = 'M'),
    CONSTRAINT FK_Student_Musician FOREIGN KEY (StudentID, Subclass) REFERENCES Student(StudentID, Subclass)
);
Dave Markle
+1. This does actually work.
BobbyShaftoe
+1  A: 

If you are interested in data modeling, in addition to object modeling, I suggest you look up "relational modeling generalization specialization" on the web. There used to be some good resources out there that explains this kind of pattern quite well. I hope those resources are still there.

Here's a simplified view of what I hope you'll find.

Before you begin designing a database, it's useful to come up with a conceptual data model that connects the values stored in the database back to the subject matter. Making a conceptual data model is really data analysis, not database design. Sometimes it's difficult to keep analysis and design separate.

One way of modeling data at the conceptual level is the Entity-Relationship (ER) model. There are well known patterns for modeling the specialization-generalization situation. Converting those ER patterns to SQL tables (called logical design) is pretty straightforward, although you do have to make some design choices.

The case you gave of a student having possibly several roles like musician probably doesn't illustrate the case you are interested in, if I read you right. You seem to be interested in the case where the subclasses are mutually exclusive. Perhaps the case where a vehicle might be an auto, a truck, or a motorcycle might be easier to discuss.

One difference you are likely to encounter is that the general table for the superclass doesn't really need the type code coulumn. The type of a single superclass instance can be derived by the presence or absence of foreign keys in the various subclass tables. Whether it's smarter to include or omit the type code depends on how you intend to use the data.

Walter Mitty