views:

108

answers:

5

I have an online shop where users can have little shops with their own products. Each of this products can have questions associated to it and the owner of the shop has the ability to answer those questions. This information is stored in 3 tables a "Questions"(QuestionID,ProductID,...) table, a "Products"(ProductID,ShopID,...) table and a "Shop"(ShopID,OwnerID,...) table.

Is it better to have a ShopID in the 'Questions' table (to allow a shop owner to view all his questions) or to join those three tables to get Questions matching a certain Shop?

+2  A: 

Generally it is better to avoid redundant information. This seems like it should be quite a cheap join to do given appropriate indexes and I wouldn't denormalise in that manner unless I saw in the query plans that the JOIN was causing problems (perhaps because of the number of records in the tables)

You would also need to consider the ratio of reads to writes. Denormalisation will help the reads but add overhead to writes.

Martin Smith
the join will be quite cheap only for small databases. If you are considering cardinality for a index on shopID in products table, then time required for join might be significant.
narcisradu
@narcisradu - Yes I have had to resort to this before but the point I was making is that it should be done only when the execution plans show a case for it.
Martin Smith
+1  A: 

You should have a many to many relationship between the questions and the products:

questions_ref(question_id, question_code, question)

product_questions(pquestion_id, question_id_fk, product_id_fk)

products(product_id, product_name, etc)

If it is possible for the product to be in more than one shop (which im certain it is) you should also have a many to many relationship between the shops and products.

shop_products(sproduct_id, product_id_fk, shop_id_fk, sproduct_price, other_shop_specific_param)

shops(shop_id, owner_id_fk, shop_name, etc)

DRL
I don't think a many to many relationship is required here. Also, the tables are one-to-many, so it might be subject of denormalization.
narcisradu
Just a note; in case you are confused, the 'question answer' would be a column in the product_questions table
DRL
@narcisradu m2m is clearly required in this case; shop can have many products - a product can be at many shops: a question is on many products - a product can have many questions.
DRL
@DRL as I said in a previous comment, if the shop owners are different, then we are not talking about the same product. Please think outside the box, not just from a database normalization perspective.
narcisradu
@narcisradu I have replied to your previous comment
DRL
@DSL - this design would be better for a resellers network as you have products separated from the stores and this way you can modify a product for all the stores but in OP's case, the products (and questions) are not related between the shops (even if it's the same product). Every store will want to have it's own description, price or code and mainly answers for the product and won't want this to be modified by another store so there is no need for such "global" products so I think it's better to keep the design simple like the OP's design.
laurent-rpnet
narcisradu is right. In this particular case neither products nor questions should have M:M relations because even though everything is on the same page virtually shops are completely separated.
Omar Kohl
+1  A: 

Hi Omar,

i think your design is okay. I wont add ShopID to table Questions. You should use a join, where necessary.

BTW: You shoud use a m:n relation between products and shops and remove ShopID for Products. So you can have the same product in differnent shops and also the same questions for a product.

Regards, Lars

Lars
he definitely should avoid use a many-to-many relationship between products and shops if the shop owners are different. Imagine there is the same product but the price differs or any other attribute differs.
narcisradu
@narcisradu so you would have a products table for each shop? its very simple to add shop specific parameters to the shop_products() table in my example shop_products(..., sProduct_price, sProduct_stock)
DRL
@DRL: While technically ok your M2M between shops and products is probably undesirable. As a shop owner I would want my data to be completely separate from another shop owner's data even though both sets of data live in the same database. And no, separate product tables per shop are nonsense, but yes, you do want a 1-to-M relation between shop and product. This prevents data entanglement between shops and will greatly simplify import and export of product data for a single shop. That is important, because as a shop owner I want to set up quickly and be able to leave quickly.
Marjan Venema
@DRL: no, he has already `shopid` in the product table. From what I understand here, there is no "global" product. Each shop has it's own products (or maybe a few shops will have the same real product but they will be considered different because not related). As the OP said it's an online shop portal with different and probably not related online shops inside so there is no relation between the product even if the real product is the same.
laurent-rpnet
+2  A: 

From a design point of view, storing redundant data is not necessary. In your case it might be. Try to make some tests and if the query time is improved due this redundancy then you should proceed with the denormalization.

narcisradu
+8  A: 

It is almost always better to join and avoid redundant information. You should only denormalize when you must do so in order to meet a performance goal - and you can't know if you need to do this until you try with normalized tables first.

Note that denormalization helps in read performance at the expense of slowing down writes and making it easier for a coding mistake to cause data to be out of sync (since you're storing the same thing in more than one place you now have to be sure to update it all).

Donnie