I want to store a wide range of categorical data in MySQL database tables. For instance, say I want to store information on "widgets" and categorize their attributes in certain ways, e.g. by shape category.

For instance, the widgets could be classified as round, square, triangular, spherical, etc. Should these categories be stored in their own table so the application can reference them? Another possibility, I imagine, would be to add a shape column to the widgets table that holds a TINYINT. My application could then search shapes by that value and use a corresponding enum type to map each integer to its meaning.
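To make the second option concrete, here is a minimal sketch of the application-side mapping, assuming a Python application (the question does not name a language). The `Shape` enum, its integer values, and the two helper functions are hypothetical names chosen for illustration.

```python
from enum import IntEnum


class Shape(IntEnum):
    """Hypothetical meanings for the tiny-int shape column."""
    ROUND = 1
    SQUARE = 2
    TRIANGULAR = 3
    SPHERICAL = 4


def shape_from_db(value: int) -> Shape:
    """Convert a stored integer into a Shape; raises ValueError if unknown."""
    return Shape(value)


def shape_to_db(shape: Shape) -> int:
    """Convert a Shape member into the integer written to the column."""
    return int(shape)
```

With this approach the meaning of each integer lives only in application code, so every program that reads the table needs its own copy of the mapping.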

Which would be best? Or is there another solution that I'm not thinking of yet?

A: 

I think the best way is to use ENUM. MySQL has a built-in ENUM type: http://dev.mysql.com/doc/refman/5.0/en/enum.html

aauser
+2  A: 

Define a category table for each attribute grouping, e.g.:

WIDGET_SHAPE_TYPE_CODES

  • WIDGET_SHAPE_TYPE_CODE (primary key)
  • DESCRIPTION

Then use a foreign key reference in the WIDGETS table:

WIDGETS

  • WIDGET_ID (primary key)
  • ...
  • WIDGET_SHAPE_TYPE_CODE (foreign key)

This has the benefit of being portable to other databases, and the relationships are more obvious, which means simpler maintenance.
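A rough sketch of this layout, using Python's built-in sqlite3 module as a stand-in for MySQL so it can run anywhere. The table and key names follow the outline above; the `name` column, the sample rows, and the helper functions are assumptions added for illustration.

```python
import sqlite3


def open_widget_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the two tables outlined above."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE widget_shape_type_codes (
            widget_shape_type_code INTEGER PRIMARY KEY,
            description            TEXT NOT NULL
        );
        CREATE TABLE widgets (
            widget_id              INTEGER PRIMARY KEY,
            name                   TEXT NOT NULL,  -- hypothetical column
            widget_shape_type_code INTEGER NOT NULL
                REFERENCES widget_shape_type_codes (widget_shape_type_code)
        );
        INSERT INTO widget_shape_type_codes VALUES (1, 'round'), (2, 'square');
        """
    )
    return conn


def add_widget(conn: sqlite3.Connection, widget_id: int, name: str,
               shape_code: int) -> bool:
    """Insert a widget; return False if the shape code is not in the code table."""
    try:
        conn.execute(
            "INSERT INTO widgets VALUES (?, ?, ?)",
            (widget_id, name, shape_code),
        )
        return True
    except sqlite3.IntegrityError:
        # The foreign key rejected a shape code with no matching row.
        return False
```

In MySQL itself the tables would need a storage engine that enforces foreign keys, such as InnoDB, for the same rejection to happen.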

OMG Ponies
+1... @jlafay, a foreign key reference, like this answer, is the way to go. Do not use enums; that will just obfuscate the data and make future changes or debugging that much harder. For example (one of many): think about having to rewrite the check constraint, in the enum scheme, every time a category gets added, deleted, or merged. That is much more problematic than updating a category table.
Brock Adams
A: 

What I would do is start with a Widgets table that has a category field of a numeric type. If you also use a category table, the numeric category is a foreign key that relates to a row in that table. A numeric type is nice and small, which helps performance.

Optionally you can add a category table containing a numeric primary key and a text description. This matches each numeric value to a human-friendly text value. The table can be used to convert the numbers to text if you just want to run reports directly from the database. The nice thing about having this table is that you don't need to update an executable when you add a new category. I would add such a table to my design.
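The reporting benefit can be sketched as follows, again with Python's sqlite3 module standing in for MySQL. The table names, column names, and sample rows are hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create hypothetical categories and widgets tables with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE categories (
            category_id INTEGER PRIMARY KEY,
            description TEXT NOT NULL
        );
        CREATE TABLE widgets (
            widget_id   INTEGER PRIMARY KEY,
            category_id INTEGER NOT NULL REFERENCES categories (category_id)
        );
        INSERT INTO categories VALUES (1, 'round'), (2, 'square');
        INSERT INTO widgets VALUES (10, 1), (11, 2);
        """
    )
    return conn


def widget_report(conn: sqlite3.Connection) -> list:
    """Return (widget_id, description) pairs by joining to the category table."""
    rows = conn.execute(
        """
        SELECT w.widget_id, c.description
        FROM widgets AS w
        JOIN categories AS c ON c.category_id = w.category_id
        ORDER BY w.widget_id
        """
    )
    return rows.fetchall()
```

Adding a row such as `(3, 'triangular')` to `categories` makes the new category show up in `widget_report` without touching the function.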

MySQL's ENUM is handy, but it is stored in the table as a string, so it uses more space than is really needed. It does, however, have the advantage of preventing unrecognized values from being stored. Preventing the storage of invalid numeric values is possible, but not as elegantly as with ENUM. The other problem with ENUM is that, because it is regarded as a string, the database must do more work when you select by value: instead of comparing a single number, it has to compare multiple characters.
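One way to reject invalid numeric values is a CHECK constraint on the column. Below is a minimal sketch using Python's sqlite3 module as a stand-in. The column names and the allowed values 1 to 4 are assumptions. Note that MySQL versions before 8.0.16 parse CHECK constraints but do not enforce them, so a foreign key to a category table may be the more dependable route there.

```python
import sqlite3


def make_widgets_table() -> sqlite3.Connection:
    """Create a hypothetical widgets table whose shape column allows only 1..4."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE widgets (
            widget_id INTEGER PRIMARY KEY,
            shape     INTEGER NOT NULL CHECK (shape IN (1, 2, 3, 4))
        )
        """
    )
    return conn


def store_widget(conn: sqlite3.Connection, widget_id: int, shape: int) -> bool:
    """Insert a widget; return False if the CHECK constraint rejects the shape."""
    try:
        conn.execute(
            "INSERT INTO widgets (widget_id, shape) VALUES (?, ?)",
            (widget_id, shape),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

The list of allowed values is part of the table definition here, so adding a category means altering the table.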

If you really want to, you can have an enumeration in your code that converts the numeric category back into something more friendly to application code, but doing so makes your code more difficult to maintain. It can have a performance advantage, because fewer bytes have to be returned when you run a query. I would try to avoid it, though, because it requires updating the application code every time a category is added to the database. If you really need to squeeze performance out of the database, you could select the whole category table, select the widgets table, and merge them in application code. That is rarely necessary, since the DB client almost always has a fast connection to the DB server and a few more bytes over the network are insignificant.
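The merge-in-application approach might look roughly like this, with Python's sqlite3 module standing in for MySQL. The table layout, sample rows, and function names are hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create hypothetical categories and widgets tables with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE categories (
            category_id INTEGER PRIMARY KEY,
            description TEXT NOT NULL
        );
        CREATE TABLE widgets (
            widget_id   INTEGER PRIMARY KEY,
            category_id INTEGER NOT NULL
        );
        INSERT INTO categories VALUES (1, 'round'), (2, 'square');
        INSERT INTO widgets VALUES (10, 2), (11, 1), (12, 2);
        """
    )
    return conn


def load_widgets_merged(conn: sqlite3.Connection) -> list:
    """Fetch each table once, then attach descriptions in application code."""
    # The category table is small, so load it whole into a dict.
    descriptions = dict(
        conn.execute("SELECT category_id, description FROM categories")
    )
    widgets = conn.execute(
        "SELECT widget_id, category_id FROM widgets ORDER BY widget_id"
    )
    # Only the numeric category travels with each widget row.
    return [(wid, descriptions.get(cid, "unknown")) for wid, cid in widgets]
```

Each widget row carries only the small integer, and the description text is transferred once per category instead of once per widget.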

William Leader