views:

48

answers:

2

Trying to figure out the best way to set up collection "lists" for users given the following data (pseudo code):

user table = id, name, email

cars table = id, make, model

user_cars table = user_id, car_id, rating

collections table = id, user_id, name

Facts:

  • Users can have many cars

  • Users can have many collections

  • Individual cars can be in many collections

The 2 options I see for allowing the user to have a list of collections are to either add a field to user_cars called collection_list and make the contents of that field a comma-sep list of collections that the user owns, like: 1,30,400

or to add an additional table called collection_entries which contains collection_id, and car_id, each pointing to their respective collections.id and cars.id. The possible problem I see with adding another table is that the number of rows will get huge in that table. eg: 10,000 users x 10 collections each x 100 cars per collection = 1 million rows.

Ideas?

+1  A: 

If I understand correctly a collection has precisely 1 owner?

So:

users: user_id, name

cars: car_id, name

collection_cars: collectioncar_id, collection_id, car_id

collections: collection_id, user_id, name

So for each collection you store which cars are in them in a seperate table. I don't think 1 million rows will be such a big problem if you use indices correctly. I don't think it's a good idea in general to give up on a good database design because you're scared of having many rows. Databases are designed to handle large datasets, so that's what you can use them for. Secondly; if you store for each user which car they own, you will also have to store to which collection that car belongs, so you won't gain anything.

Lex
Thanks. I have created the other table.
k00k
+1  A: 

You second option is called an intersection table:

Intersection Table, a third, intermediate table created to implement a many-to-many relationship between two entities. Also called a junction, intermediate or resolution table. The primary keys from the two entities are placed in the intersection table as foreign keys. Commonly these two form a composite primary key.

You are going to want to use your second option for sure. This is better for query design and for using the resultSets on the client. Most databases are efficient enough to handle this design as long as you have your indexing done correctly.

northpole