views:

612

answers:

3

I've been trying to design a database schema for a side project but I havent been able to produce anything that I'm comfortable with. I'm using ASP.Net with LINQ for my data access:

I'm going to allow users to specify up to 10 "items" each with 2 numeric properties, and 1 referential property, the item name.

If I were to put this entry into 1 row, it would easily equal out to some 30+ columns (minimum), e.g. item_1_name (ref) item_1_weight item_1_volume item_2_name... etc...

And I can't simply turn these columns into referential tables as each property can essentially range from 1 to 400+.

I also figured that if a user only decides to put 1 item into their entry, the method of which I create the object for that data will be static as with LINQ I'd have to check whether the properties and whatnot are NULL and work accordingly. Also, if I ever wanted to increase the number of items allowed in an entry, it'd be a headache to work with.

The other option I've thought of is simply creating a row for each item and tying it with an entry id. So I'd essentially never have null entries, but my table would grow astronomically deep but not very wide, as there would only be some 5 odd columns.

Is there something I'm overlooking in my design/is there a much better and efficient way of doing this?

EDIT: When I say that it will grow astronomically, I mean it in this sense: A user can create an entry, and each entry will most likely have a group of items. So say they make 1 entry a day to the site, they could have 3 groups of items, with the max number of items (10), which would equate to 30 items for that sole entry. Make an entry everyday for a week at that rate and you could have 210 rows for that single user.

A: 

use a single item table:

userId, itemIndex, isReference, numericValue, referenceValue

this way the value for item_3_name for user 999 translates to

999,3,true,null,value

You will have to enforce certain constraints yourself, s.a. the maximal number of items per user, etc.

Manu
You're describing the EAV design, which has a lot of problems with scalability and referential integrity. For example, how do you declare a foreign key on the referential property?
Bill Karwin
+2  A: 

I'd recommend the latter design you mention, create one dependent table with five columns:

CREATE TABLE Items (
  user_id               INTEGER NOT NULL,
  item_id               INTEGER NOT NULL DEFAULT 1,
  numeric_property1     INTEGER,
  numeric_property2     INTEGER,
  referential_property  INTEGER,
  PRIMARY KEY (user_id, item_id),
  FOREIGN KEY (user_id) REFERENCES Users(user_id)
                        ON DELETE CASCADE,
  FOREIGN KEY (item_id) REFERENCES num_items(item_id),
  FOREIGN KEY (referential_property) REFERENCES some_other_table(some_column)
);

I show a table num_items above, which contains the numbers 1 through 10 if you want to restrict users to 10 items at most:

CREATE TABLE num_items (item_id INTEGER NOT NULL );
INSERT INTO num_items (item_id) 
  VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);

Advantages of this design is that it's easy to COUNT() how many items a given user has, it's easy to compute things like MIN() and MAX() for a given property, you can enforce a foreign key for the referential property, etc.

Some databases have a feature to declare the second part of a compound primary key (item_id in this case) as auto-incrementing, so if you specify the value for entity_id but omit item_id it automatically gets the next unused value (but does not fill gaps if you delete one). You don't state which brand of database you're using so I'll leave it to you to figure out this feature.

edit: As Tony Andrews says in his answer, the number of rows is not a problem. You don't state which brand of database you're intending to use, but unless you choose an especially feeble product like MS Access, you can rely on the database to process millions of rows easily. If you choose indexes well, and write queries that use those indexes, efficiency shouldn't be a problem.

Bill Karwin
Wouldn't it be better to use a CHECK constraint for item_id instead of a foreign key?
Dave
Check my edit. I agree, but I'm still concerned with the efficiency of what I'll be doing as the number of rows is going to be huge.
MunkiPhD
Yes, you could use a CHECK constraint but when you need to expand to 12 items, modifying a CHECK constraint is a metadata change (ALTER TABLE), and that may be expensive. Inserting values (11), (12) is an easy data change.
Bill Karwin
A: 

Proper database design would be to store each user/item in a separate row. This will be much easier to work with, and removes the arbitrary restriction of 10 items. I wouldn't say it will grow "astronomically deep", there will be around 10 x (no. of users) rows.

Tony Andrews