views:

107

answers:

2

To describe my dilemma, let me first start with an example problem (stolen from here). Let's say you have a GradStudent table in your database that looks like this:

GradStudent:
firstName
lastName
birthDate
courseAssignment
researchGrant

But only teaching assistants will have a course assignment and only research assistants will have a researchGrant, so one of those two will always be null. Obviously this is not optimal and it would be better to do this:

GradStudent:
firstName
lastName
birthDate

TeachAsst:
courseAssignment

ResearchAsst:
researchGrant

Where TeachAsst and ResearchAsst have a foreign key (probably a "studentID" surrogate) from the GradStudent table.

I also understand why it wouldn't be best to make two completely separate tables like:

TeachAsst:
firstName
lastName
birthDate
courseAssignment

ResearchAsst:
firstName
lastName
birthDate
researchGrant

Because you're repeating a lot of attributes that have the same meaning.

However, two distinct classes would make sense (I think) if they had hardly any fields in common, like:

TeachAsst:
name
courseAssignment
payRate
numStudents

ResearchAsst:
name
researchGrant
facultyAdvisor
researchTopic

Here, they only have "name" in common, so would it be silly to have a GradStudent superclass with only a single attribute of "name?" Where is the tipping point? How do you decide when to have a superclass of common information, or when to leave two classes completely separate? Having a superclass makes most of CRUD a bit harder because to create or update a TeachAsst you need to change two tables instead of just one.

As another example, let's say the DB you're working on involves measuring information on different electronic devices. And while a camera and a mobile phone have length/width/height in common, most of the other measurements will not coincide (e.g. the camera won't have any audio information, and the mobile phone won't have any lens or viewport measurements). So it seems almost simpler to have a cameraData table and a mobileData that are completely separate, rather than put their little amount of common information into a superclass table. What do you think? Is there a general rule that says you should always put common data together in a superclass, even if it's a small percentage of the subclass's descriptive data?

Edit: Let's assume that in the grad student example, a grad student is either a teaching assistant or a research assistant, will never switch roles, and also is never both or neither.

A: 

In the GradStudent scenario you have the following property:

A GradStudent can be TeachAsst first and become ResearchAsst later. Or she can be both at the same time.

In this situation, denormalization might not be a good idea.

Yet in your case, you measure cameas and mobile phones. They will never become something else. I think you could risk the denormalization for the sake of less complexity.

Or, you could even think about using a documend db like CouchDB, in which you do not have to follow any schema.

Pierre Spring
+1  A: 

I consider myself relatively new to database design, so take this for what it's worth. In the first example, my first thought would be to indeed maintain a separate "GradStudent" table which would include name and other personal information. In my opinion, it leaves you flexible for potential changes in the future. For example, what if another GradStudent role is created which can be held by an individual in addition to either TeachAsst or ResearchAsst? You could create a "GradStudent_Relationship" table that could accommodate additional roles in the future such that:

GradStudent_Relationship:
GradStudent_ID (fk)
ResearchAsst_ID (fk)
TeachAsst_ID (fk)
NewGradStudentRole_ID (fk)

As for making your CRUD operations tougher, in my opinion the added flexibility outweighs that concern. Perhaps you could set up triggers within your database to help with that?

Regarding the second example, why can't a camera have audio? Don't some digital cameras record video that includes audio? Also, why can't a mobile phone have a lens or viewport measurement? Don't many mobile phones now include cameras?

For what it's worth, I sometimes find it helpful to abstract the "classes" as best I can in order to maintain the most flexibility down the line. There probably is some trade off there in terms of CRUD operations as you mention, but personally, I like knowing the database schema can handle potential changes in the future.

I hope this was at least somewhat helpful.

Colin
+1 Definitely helpful! I'm going to think about the issue a little more myself, but I'll take what you said into consideration.
JoeCool