views:

34

answers:

1

My items are listed as follows; This is just a summary of course. But I'm using a method shown for the "Detail" table to represent a type of 'inheritence', so to speak - since "Item" and "Downloadable" are going to be identical except that each will have a few additional fields relevant only to them.

My question is in this design pattern. This sort of thing appears many, many times in our projects - is there a more intelligent way to handle it? I basically need to normalize the tables as much as possible. I'm extremely new to databases and so this is all very confusing to me.

There are 5 items. Awards, Items, Purchases, Tokens, and Downloads. They are all very, very similar, except each has a few pieces of data relevant only to itself. I've tried to use a declaration field (like an enumerator 'Type' field) in conjunction with nullable columns, but I was told that is a bad approach. What I have done is take everything similar and place it in a single table, and then each type has its own table that references a column in the 'base' table.

The problem occurs with relationships, or junctions. Linking all of these back to a customer. Each type takes around 2 additional tables to properly junction all of the data together- and as such, my database is growing very, very large. Is there a smarter practice for this kind of behavior?

Item
ID      | GUID
Name      | varchar(64)

Product
ID      | GUID
Name      | varchar(64)
Store     | GUID [ FK ]
Details  | GUID [FK]

Downloadable
ID      | GUID
Name      | varchar(64)
Url    | nvarchar(2048)
Details | GUID [FK]

Details
ID           | GUID
Price         | decimal
Description | text

Peripherals [ JUNCTION ]
ID      | GUID
Detail      | GUID [FK]

Store

ID      | GUID
Addresses   | GUID

Addresses
ID      | GUID
Name        | nvarchar(64)
State    | int [FK]
ZipCode | int
Address | nvarchar(64)


State
ID      | int
Name        | varchar(32)
+1  A: 

This sort of inheritance is always a bit trick with relational databases. What you have is one approach and it is the most traditional approach to the problem. You end up doing a lot of crossing of tables, but that might be just fine.

Another approach is to employ some denormalization and collapse the tables into one table. Include a type field which represents the type of the item and then have the union of the fields all in that table. So you would have a table like

ID  | GUID
Type | GUID [FK]
Name        | nvarchar(64)
State    | int [FK]
ZipCode | int
Address | nvarchar(64)
Name      | varchar(64)
Url    | nvarchar(2048)
Store     | GUID [ FK ]
Details  | GUID [FK]
...

This means you have a bunch of empty fields in your table.

You could also take a more fragmented approach and construct your tables like

Item:
ID | GUID

ItemPropertyType:
ID | GUID
Name | nvarchar(50)

ItemProperty:
ID | GUID
ItemID | GUID [FK]
ItemPropertyTypeID | GUID
charValue | varchar(64)

Each item property references an item. To construct an item you simply gather up the ItemProperties which it owns. If you wanted to find all items where Name was 'bill' then you could do

select ItemID from ItemProperties ip, ItemPropertyTypes ipn where ipn.ID = ip.ItemPropertyTypeID and ipt.Name='Name' and ip.charValue='bill'

Jeff actually blogged a bit about this topic http://www.codinghorror.com/blog/2008/07/maybe-normalizing-isnt-normal.html

stimms
Your first approach ties business meaning to a row (type #3 = "Download"). This is generally not a good practice as the model does not fully reflect the design. Your second approach is metadata-based, which is fine if that's what needed, but it tends to perform poorly with a lot of data in the system, and requires some data maintenance (orphan rows, etc.). This approach is also called Entity-Attribute-Value or EAV.
Phil Sandler
Sorry, reading that back sounded more preachy and pedantic than I intended. :)
Phil Sandler