views:

70

answers:

2

I have a fairly common scenario where a user can select from a set of attributes. The attributes in the UI are represented by checkboxes.

For example:

Components: Harddrive(y/n), CPU(y/n), Monitor(y/n), Keyboard(y/n), etc....

In the past I have generally modeled this scenario like this:

"PCs" 1:M "PC Components" M:1 "Components"

Another option is to make the "attributes" as y/n fields in the "PCs" table.

e.g.

PCs (table)
-----------
PCId(PK)
Harddrive(y/n)
CPU(y/n)
etc...

In the past my rationale for going with one vs the other is based on whether the user can enter new attributes. If the answer is yes, then I go with the first option, if the answer is no, then I usually go with y/n attributes.

However now I have a scenario where there are about 20 attributes divided into multiple categories. After creating the ERD, it just looks "wrong" and the table has an absurd number of columns.

My question is, is there a standard/correct way to model this? If so, does it have a name?

+1  A: 

I follow this simple rule: One piece of data per column. Let the database optimize the storage structure.

In the SQL Server world, therefore, I go with the BIT datatype for this, and yes, separate columns. Other databases, I'm certain, have a corresponding type.

John Rudy
+2  A: 

One is tempted to design a more "compact" data model (something other than a column per attribute) because the attributes have compatible data types (they're all y/n).

If the attributes each had different data types, for example if they were restricted to a set of values using distinct lookup tables, it would be a no-brainer that you must use a column per attribute.

See Domain-Key Normal Form. Modeling your y/n attributes as rows means there's no way to represent mandatory attributes (for which you must have either Y or N value). So you'd have some constraint that there must be N rows for N attributes. A constraint on the minimum number of rows is neither a domain constraint nor a key constraint, therefore it fails the DKNF test.

It's not necessary that every table must conform to DKNF, but if you're asking what term describes the design of a column-per-attribute, I suggest that "Domain/Key Normal Form" would fit.

Bill Karwin
Thx for response. What would you suggest if I have about 50 of these attributes? Keep them on same table? The table will have about 100 attributes total then. Or move to 1-1?
B Z
I recommend modeling the database according to rules of normalization, unless and until it becomes a measurable performance bottleneck.
Bill Karwin
It is one of those "design smells" when I see these many columns, but the data is normalized. Thank you EPA for all these attributes..
B Z