views:

92

answers:

3

I have a table of electronic devices that serves as a generic superclass and then separate subclass tables for phones, laptops, cameras, etc. that extend the superclass and give specific information on the device (using the foreign key deviceID from the superclass table), in other words, Class Table Inheritance. My question is, should I have a "deviceType" column in the superclass? It would be redundant, because you could discover the device type by seeing which subclass table that particular deviceID showed up in, but that seems like it would be an unnecessary amount of joins to find out something so simple.

In general, it will probably be unusual for someone to be looking at a random generic device and wondering what type it is, but it also seems strange to leave the design where it will be difficult for that person to easily find that out.

Another issue is constraints... how can I constrain it so that the device is only subtyped once? As in, what's to stop a row being added to the Laptop table with the same deviceID as a row already existing in the Camera table? Again, this is unlikely to ever even be possible with a good UI, but database constraints are always preferable.

Thanks!

P.S. if we do use the deviceType column, to aid data integrity it will be linked to an enumeration table of all device types we support as opposed to be a free text field.

Added for clarification


First is the superclass table:

GenericDevice

deviceID    brand     shipDate
   1        HP        05/06/09
   2        Canon     11/16/08
   3        Ikon      02/27/09

And here are two example subclasses:

Laptop

laptopID    deviceID    screenSize
   1           1           17

Camera

cameraID      deviceID     megapixels
   1             2            6.5
   2             3            8

Notice the migrated key of "deviceID" from the GenericDevice table. This allows a one-to-one relationship from the generic table to any of the subclass tables depending on what type of device it actually is. So you can join these tables to see that the HP device is a laptop with a 17" screen and the Canon and Ikon devices are both cameras with 6.5 and 8 megapixels respectively.

The question is how to figure out what type a device is if you don't know in the first place, and all you have is a row in the GenericDevice table. Let's take the first row as an example. You know the device is made by HP and was shipped 05/06/09, but you don't initially know what type of device it is. But you can figure it out by searching through the child tables until you find a row with deviceID = 1. The Laptop table contains such a row, so the HP device must in fact be that laptop. But this seems like unnecessary trouble to simply discover the device type (especially if you have 20+ subclasses that you have to search through to see where deviceID matches). Instead you could just do this:

GenericDevice

deviceID    brand     shipDate   deviceType
   1        HP        05/06/09      laptop
   2        Canon     11/16/08      camera
   3        Ikon      02/27/09      camera

Now you can immediately see what type the HP device is. But now you have duplicate data (implicitly). To see why, consider if a user made this change:

GenericDevice

deviceID    brand     shipDate   deviceType
   1        HP        05/06/09      camera
   2        Canon     11/16/08      camera
   3        Ikon      02/27/09      camera

(the deviceType for the first row was switched to camera)
Now you have conflicting information - this above table says the HP device is a camera, and yet there's a row for the device in the laptop table. In other words, you are effectively storing the type of the HP device twice. Once using the deviceType field in the GenericDevice table, and once simply by the fact that there is a row in the Laptop table with deviceID = 1.

A: 

I would maintain a deviceType column in the parent table because you might have several device types mapped to the same child table.

I would maintain a duplicate of deviceType in any child tables that actually need it to speed up queries. I wouldn't assume a little join to the parent just to get deviceType will really slow things down. Also, you may wind up using many columns from the parent table, in which case one more column won't hurt anything.

The duplicate would be maintained by a trigger on insert and update.


You mention the problem of updating the deviceType causing a data integrity problem. The answer is, "don't do that". If device type can't change, then the update trigger should simply not permit a change.

John Saunders
+2  A: 

only store the type in one table, parent or child, just pick one. If you keep it in two places you'll have to keep them in sync, and why not just keep all the child info in the parent? that makes it even easier ;-) (that a joke, don't put everything in the parent table!)

EDIT

I'd put a Type column into the GenericDevice table, but I'd make it a code or numeric ID and a FK to a DeviceType table, and not a word like "laptop" or "camera". Also make sure that the Brand uses a code or ID and a FK to a brand table.

GenericDevice
deviceID    brand     shipDate   deviceType 
1           1         05/06/09   L
2           2         11/16/08   C
3           3         02/27/09   C

DeviceTypes
deviceType   deviceDescription
L            Laptop
C            camera

Brands
BrandID   BrandName
1         HP
2         Canon
3         Ikon

pick your codes/ID datatype based on expected number of different values.

For what it is worth, you can query like this (without using the deviceType column):

SELECT
    g.deviceID,g.shipDate
        ,l.screenSize
        ,c.megapixels
    FROM GenericDevice          g
        LEFT OUTER JOIN Laptop  l ON g.deviceID=l.deviceID
        LEFT OUTER JOIN Camera  c ON g.deviceID=c.deviceID
    WHERE g.deviceID=1234
KM
@JoeCool, can a parent have children of two different types? if not type goes in the parent, if so it goes in the child. I've seen many examples of "good intentions" like this (store duplicate values in other tables) cause more problems down the road...
KM
@JoeCool, without a table schema and some data examples, I'm flying blind and only guessing, based on your description of your tables. I don't have my head "around" the details of what you are working on...
KM
This is what I ended up doing. And yes, the deviceType and Brand both use ID fields and are linked to enumeration tables like you described. I simply omitted that for clarity of the question. The issue still remains of integrity between the deviceType value and which subclass a row is entered into for a given device, but I guess that will just have to be managed with the application layer or a trigger. I'm still wondering though, since there's data duplication in this design, is it not technically normalized? Should the deviceType theoretically be a computed column by searching the subclasses?
JoeCool
BTW, thank you for sticking around while I clarified the question so you could give me a fuller answer!
JoeCool
+1  A: 

"but that seems like it would be an unnecessary amount of joins to find out something so simple."

You got it.

I have been in your shoes, and I luckily still have this site bookmarked in case I needed to reference it again. It even includes information on how to set up the constraints you are looking for.

Assuming you're using SQL Server, here you go: Implementing Table Inheritance in SQL Server

James Jones
Great link! Directly addresses my problem exactly.
JoeCool