views:

398

answers:

4

I am redeveloping the front end and database for a medium size products database so that it can support categories/subcategories, product lines, manufacturers, supported software and product attributes. Right now there is only a products table. There will be pages for products by line, by category/subcategory, by manufacturer, by supported software (optional). Each page will have additional filtering based on the other classifications.

Categories/Subcategories (multi level) Products and product lines can be assigned to multiple category trees. Up to 5 levels deep should be supported.

Product lines (single level) Groups of products. Product can only be in single product line.

Manufacturers (single level) Products and product lines can be assigned to single manufacturer.

Supported software (single level) Certain products only work with one or more softwares, so a product/line can be assigned to none, one or more softwares.

Attribues (type / options - could be treated so each type is a category and items are children) Products and product lines can be assigned attributes (eg - color > red / blue / green). Attributes should be able to be assigned to one or more categories.

Since all these items are basically types of subcategories, do I put them all together in a master table OR split them into separate tables for each one?

Master table idea:

ClassificationTypes (product line, category/sub, manufacturer, software, attribute would all be types)
-TypeID
-Name

Classifications
-ClassID
-TypeID
-ParentClassID
-Name

ClassificationsProductsAssociations
-ProductID
-ClassID

I would still need at least one more table to link types together (eg - to link attributes to a category) and a way to link product lines to various types.

If I go with a table for each type it can get messy quick and I will still need a way to link everything together.

Multiple table setup:

Categories
-CategoryID
-Name
-ParentCategoryID

CategoriesAssociations
-CategoryID
-ProductID
-ProductLineID ?

Attributes
-AttributeID
-Name
-ParentAttributeID (use this as the parent would be "color" and child would be "red")

AttributesAssociations
-AttributeID
-ProductID
-CategoryID (do I also need to link the category to the parent attribute?)

CompatibleSoftware
-SoftwareID
-Name

CompatibleSoftwareAssociations
-SoftwareID
-ProductID
-ProductLineID ?

Manufacturers
-ManufacturerID
-Name

ProductLines
-ProductLineID
-ManufacturerID
-Name

Products
-ProductID
-ProductLineID
-ManufacturerID
-Name

Other option for associations is to have a single associations table to link the tables above:

Master Associations
-ProductID
-ProductLineID
-ManufacturerID
-CategoryID
-SoftwareID
-AttributeID

What is the best solution?

+2  A: 

Go for multiple tables, it makes the design more obvious and more extensible, in my opinion. While it may fit your solution now, further changes may be more difficult.

Paddy
To play the devils advocate: Its not more extensible if new type require new tables -- its a hellish nightmare in the long run.
Hassan Syed
How to easily link everything together if I use all separate tables? Separate tables for the associations of each item or one big association table that has column for each "type"?
Developr
@Hassan I would argue the opposite - the 'one big table' approach has led to some very hard to debug databases (in my experience). A table with many columns, some of which only apply to certain 'types' of rows is not good DB design.
Paddy
@devlopr I'd keep it as separate as possible - hard to know without really getting into your requirements. Running low on caffiene here and having difficultly parsing your requirements from above.
Paddy
@paddy well even if you use a single/general serialization structure, you still need the concept of higher-level schemas; ofcourse if the model you serialize into it is structured in an ad-hoc way, you will run into trouble. But hey, this is advanced data modeling -- so unless developr is trying to compete with Google Products I agree and wouldn't worry too much about it :D
Hassan Syed
A: 

I agree to Paddy. It makes your life easier in the future and you are much more flexible. You might want to put in stock control and other stuff. To link everything together use the id's (integer) parent/child of the tables.

Chris-NTA
A: 

I think multiple tables is the way to go, but to really know, do this: Flesh out the design for both ways and then take a sample of 5-10 products.

Populate the tables in both designs for the 5-10 products.

Now start writing the queries for both ways. You will start to see which is easier to write (the single table I bet), and you might find cases that only work in one design (the multi-table I bet.)

When you are done you have not lost the work -- you can use the table schema to move forward and some of your queries will already be written.

If you get to a query that does not make sense, seems to complicated, or such you can post it here and get feed back -- having real code always gets better comments.

Hogan
A: 

Just wanted to post my decision and since I was not satisfied with any of the answers provided, I have elected to answer my own question.

I ended up setting up a a single set of tables:

Classification Types (eg - product lines, categories, manufacturers, etc)

Classifications (supporting parent/child adjacency list, nested sets, and materialized path all at once in order to take advantage of strengths of each. I have a SQL CTE that can populate all the fields in one go when the data changes)

Classifications Relations (with ability to relate products to classifications, relate classifications to other classifications and also relate classifications to other types)

I will admit that the solution is not 100% normalized, but this setup gives me ultimate flexibility to expand by creating new types and is very powerful and easy to query.

Developr