views:

222

answers:

7

Say you have an entity like a vehicle that you are capturing detailed information about. The car you want to capture is painted red, black and white. The front tires are Bridgestone 275/35-18 and the rear tires are 325/30-19. And sometimes you can have just two tires (yes this would be considered a motorcycle which is a type of vehicle) and sometimes 18 tires that could all be different. Then there are some fields that are always single valued like engine size (if we let our imaginations run wild we can think of multi-engined vehicles but I am trying to keep this simple).

Our current strategy for dealing with this is to have a table for each of the fields that can have multiple values. This will spawn a large number of tables (we have a bunch of different entities with this requirement) and smells a little bad. Is this the best strategy and if not, what would be better?

+1  A: 

If it's a possibility for your app, you might want to look into couchdb.

Jerph
A: 

You're describing a Star Schema. I think its fairly standard practice in your kind of case

Edit: Actually your schema is slightly modified from the Star Schema, you use the primary key of the fact table in each of the dimension tables to join on so you can have multiple paint colors etc. Either way I think it's a fine way to deal with your entity. You may go one step further and normalize the dimension tables and then you'd have a Snowflake Schema

Jason Punyon
A: 

It seems like you may be looking at something called Hierarchical Model.

Or maybe a simple list of (attr, value) pairs will do?

squadette
A: 

If you're using SQL Server, don't be afraid to store the XML Data Type. I have found that it makes things like this much, much easier.

Jarrett Meyer
A: 

It really depends on whether the variables themselves only have one variable (example: you can have a variable number of tires that are all the same type, or a set number of tires that are of variable type).

Since you seem to need to have multiple variables (eg. specific type for each tire, with a variable number of tires), I am afraid the best solution is to have specific tables for each specific area of the car you wish to customize.

If you have some fields that simply have a set of values to chose between (say, 2, 4 or 6 windows), you can simply use an enum or define a new field-type using User-Defined Domains (depending on which DBMS you're using).

Christian P.
+1  A: 

If you're using a relational database, your suggestion is pretty much the only way to do it. The theory of normal forms will give you more information about it - the Wikipedia articles about it are quite good, though slightly heavy going simply because it is a tricky theoretical subject when you get into the higher normalisation levels. The examples are mostly common sense though.

Assuming you have a Vehicle table, a Colour table and a TyreType table (sorry for the British spelling), you are presumably defining a VehicleTyre and VehicleColour table which acts as a join between the relevant pairs of tables. This structure is actually quite healthy. It not only encapsulates the information you want directly, but also lets you capture in a natural way things like which tyre is which (e.g. front left is Bridgestone 275/35-18) or how much of the car is painted red (e.g. with a percentage field on the VehicleColour table).

You may want to model a vehicle type entity which could govern how many tyres there are. While this is not necessary in order to get working SELECT queries out of the system, it is probably going to be useful both in your user interface and figuring out how many tyres to insert into your tables.

My company has lots of schemas which operate on exactly this basis - indeed our object-relational framework creates them automatically to manage many-to-many relationships (and sometimes even for one-to-many relationships depending on how we model them). Several of our apps have over 150 entities and over 100 of these join tables. There are no performance problems and no meaningful impact on manageability of the data, except that a few of the table names are annoyingly long.

Leigh Caldwell
A: 

Your current strategy is the correct one. You're tracking so many kinds of data, so you'll need lots of tables. That's just how it is. Is the DBMS complaining?

Seun Osewa