Hello,

I would like to know what is the best approach to storing product ratings in a database. I have in mind the following two (simplified, and assuming a MySQL db) scenarios:

Scenario 1: Create two columns in the products table to store the number of votes and the sum of all votes. Use these columns to compute the average on the product display page:

products(productID, productName, voteCount, voteSum)

Pros: I will only need to access one table, and thus execute one query to display product data and ratings. Cons: Write operations will be executed in a table whose original purpose is only to furnish product data.
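
A minimal sketch of Scenario 1, assuming Python's built-in `sqlite3` module as a stand-in for MySQL. The helper names `add_vote` and `get_product` are hypothetical; the column names come from the schema above. It shows that a vote can be recorded with a single `UPDATE` and that the product page needs only one `SELECT`.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); the table follows Scenario 1.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE products (
        productID   INTEGER PRIMARY KEY,
        productName TEXT NOT NULL,
        voteCount   INTEGER NOT NULL DEFAULT 0,
        voteSum     INTEGER NOT NULL DEFAULT 0
    )
    """
)
conn.execute("INSERT INTO products (productID, productName) VALUES (1, 'Widget')")


def add_vote(conn, product_id, rating):
    """Record one vote by bumping both counters in a single UPDATE."""
    conn.execute(
        "UPDATE products SET voteCount = voteCount + 1, voteSum = voteSum + ? "
        "WHERE productID = ?",
        (rating, product_id),
    )


def get_product(conn, product_id):
    """Fetch name, vote count and average (None if unrated) in one query."""
    return conn.execute(
        "SELECT productName, voteCount, "
        "CASE WHEN voteCount > 0 THEN 1.0 * voteSum / voteCount END "
        "FROM products WHERE productID = ?",
        (product_id,),
    ).fetchone()


add_vote(conn, 1, 4)
add_vote(conn, 1, 5)
name, vote_count, avg_rating = get_product(conn, 1)
```

Doing the increment inside the `UPDATE` itself, rather than reading the counters and writing them back, avoids losing votes when two users rate at the same time.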

Scenario 2: Create an additional table to store ratings.

products(productID, productName)
ratings(productID, voteCount, voteSum)

Pros: Isolate ratings into a separate table, leaving the products table to furnish data on available products. Cons: I will have to execute two separate queries on product page requests (one for data and another for ratings).
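
A minimal sketch of Scenario 2, again assuming `sqlite3` in place of MySQL and a hypothetical helper `get_product_with_rating`. It suggests that the second query may not be needed: a `LEFT JOIN` can return the product data and its rating counters in one round trip, including products that have no ratings row yet.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); tables follow Scenario 2.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (
        productID   INTEGER PRIMARY KEY,
        productName TEXT NOT NULL
    );
    CREATE TABLE ratings (
        productID INTEGER PRIMARY KEY REFERENCES products(productID),
        voteCount INTEGER NOT NULL DEFAULT 0,
        voteSum   INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO products VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO ratings  VALUES (1, 2, 9);  -- two votes totalling 9
    """
)


def get_product_with_rating(conn, product_id):
    """Return (name, vote count, average or None) using a single joined query."""
    return conn.execute(
        """
        SELECT p.productName,
               COALESCE(r.voteCount, 0),
               CASE WHEN r.voteCount > 0
                    THEN 1.0 * r.voteSum / r.voteCount END
        FROM products p
        LEFT JOIN ratings r ON r.productID = p.productID
        WHERE p.productID = ?
        """,
        (product_id,),
    ).fetchone()


rated = get_product_with_rating(conn, 1)
unrated = get_product_with_rating(conn, 2)
```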

In terms of performance, which of the following two approaches is better?

  1. Allow users to execute an occasional write query to a table that will handle hundreds of read requests?

  2. Execute two queries at every product page, but isolate the write query into a separate table.

I'm a novice to database development, and often find myself struggling with simple questions such as these.

Many thanks,

A: 

I know this isn't quite what you asked, but you may want to consider that, with this system, new products can almost never beat established ones. Say an existing product has a 99% rating. If you sort by highest rating, it would be very difficult for a new product to climb to the top.

David
David, I circumvent that problem by taking the average rating (voteSum/voteCount). If I decide to emphasize the newest products, I can sort by release date first, and then by rating. But generally speaking, I'm not concerned about how old a product is.
Mel
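
To make the two orderings in this exchange concrete, here is a small sketch assuming `sqlite3` in place of MySQL. The `releaseDate` column and the sample rows are hypothetical; the thread does not define them. Note that when the release date is the first sort key, the rating only breaks ties between products released on the same day.

```python
import sqlite3

# SQLite stands in for MySQL; releaseDate is a hypothetical column.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (
        productID   INTEGER PRIMARY KEY,
        productName TEXT NOT NULL,
        releaseDate TEXT NOT NULL,  -- ISO dates sort correctly as text
        voteCount   INTEGER NOT NULL DEFAULT 0,
        voteSum     INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO products VALUES
        (1, 'Old favourite',  '2009-01-15', 200, 980),
        (2, 'Newcomer',       '2010-06-01',   3,  12),
        (3, 'Same-day rival', '2010-06-01',  10,  45);
    """
)

# Average rating expression shared by both queries.
AVG_EXPR = "1.0 * voteSum / voteCount"

# Highest average first, regardless of age.
by_rating = [
    row[0]
    for row in conn.execute(
        f"SELECT productName FROM products WHERE voteCount > 0 "
        f"ORDER BY {AVG_EXPR} DESC"
    )
]

# Newest first; the average only decides between same-day releases.
by_date_then_rating = [
    row[0]
    for row in conn.execute(
        f"SELECT productName FROM products WHERE voteCount > 0 "
        f"ORDER BY releaseDate DESC, {AVG_EXPR} DESC"
    )
]
```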
+1  A: 

A different table for ratings is highly recommended to keep things dynamic. Don't worry about hundreds (or thousands or tens of thousands) of entries, that's all peanuts for databases.

Suggestion:

table products
- id
- name
- etc

table products_ratings
- id
- productId
- rating
- date (if needed)
- ip (if needed, e.g. to prevent double rating)
- etc
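
The suggested layout could be sketched as follows, assuming `sqlite3` in place of MySQL. The column types, the 1 to 5 rating range, the `UNIQUE (productId, ip)` constraint and the helper `rate` are assumptions added for illustration; the constraint is one possible way to use the `ip` column to prevent double rating.

```python
import sqlite3

# SQLite stands in for MySQL; types and constraints are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE products_ratings (
        id        INTEGER PRIMARY KEY,
        productId INTEGER NOT NULL REFERENCES products(id),
        rating    INTEGER NOT NULL CHECK (rating BETWEEN 1 AND 5),
        date      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        ip        TEXT NOT NULL,
        UNIQUE (productId, ip)  -- one rating per product per IP
    );
    INSERT INTO products (id, name) VALUES (1234, 'Widget');
    """
)


def rate(conn, product_id, rating, ip):
    """Insert one rating; return False if a constraint rejects it
    (for example, this IP has already rated the product)."""
    try:
        conn.execute(
            "INSERT INTO products_ratings (productId, rating, ip) "
            "VALUES (?, ?, ?)",
            (product_id, rating, ip),
        )
        return True
    except sqlite3.IntegrityError:
        return False


first = rate(conn, 1234, 5, "203.0.113.7")
duplicate = rate(conn, 1234, 1, "203.0.113.7")  # same IP, rejected
other = rate(conn, 1234, 4, "203.0.113.8")
average = conn.execute(
    "SELECT AVG(rating) FROM products_ratings WHERE productId = ?", (1234,)
).fetchone()[0]
```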

Retrieve all ratings for product 1234:

SELECT pr.rating
FROM products_ratings pr
INNER JOIN products p
  ON pr.productId = p.id
  AND p.id = 1234

Average rating for product 1234:

SELECT AVG(pr.rating) AS rating_average -- or ROUND(AVG(pr.rating))
FROM products_ratings pr
INNER JOIN products p
  ON pr.productId = p.id
  AND p.id = 1234

And it's just as easy to get a list of products along with their average rating:

SELECT
  p.id, p.name, p.etc,
  AVG(pr.rating) AS rating_average
FROM products p
INNER JOIN products_ratings pr
  ON pr.productId = p.id
WHERE p.id > 10 AND p.id < 20 -- or whatever
GROUP BY p.id, p.name, p.etc -- one row per product
Alec
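
A runnable sketch of the product-list query, assuming `sqlite3` in place of MySQL and made-up sample rows. It differs from the answer in two ways that are my own choices: it uses a `LEFT JOIN` so products without any ratings still appear (with a `NULL` average), and it adds a `rating_count` column. The `GROUP BY` is what produces one averaged row per product.

```python
import sqlite3

# SQLite stands in for MySQL; the sample data is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE products_ratings (
        id        INTEGER PRIMARY KEY,
        productId INTEGER NOT NULL REFERENCES products(id),
        rating    INTEGER NOT NULL
    );
    INSERT INTO products (id, name) VALUES
        (11, 'Widget'), (12, 'Gadget'), (13, 'Gizmo');
    INSERT INTO products_ratings (productId, rating) VALUES
        (11, 5), (11, 4), (12, 3);
    """
)

rows = conn.execute(
    """
    SELECT p.id, p.name,
           AVG(pr.rating) AS rating_average,
           COUNT(pr.id)   AS rating_count
    FROM products p
    LEFT JOIN products_ratings pr
      ON pr.productId = p.id
    WHERE p.id > 10 AND p.id < 20
    GROUP BY p.id, p.name
    ORDER BY p.id
    """
).fetchall()
```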
Thanks Alec, your approach makes more sense. But I also have to think about how best to integrate it with user reviews: not only can a user rate a product, they can also leave a comment. I guess I can convert the ratings table into a reviews table, essentially just extending its functionality. Thanks
Mel
A problem just occurred to me with this approach: If I turn this into a 'reviews' table, then chances are most users will only vote, and not necessarily add a review. This will leave a lot of empty cells in a table where a review title and review text should go. Is this an issue?
Mel
@Mel: With the above approach, you should also create a separate reviews table and use a join, the same way as with ratings. So your typical query would fetch the product, its ratings and its reviews.
Tom
@Mel: You _could_ create another table like Tom suggested. However, adding a rating and adding a rating with some text are very similar things. In this case I would combine them in a single table to prevent redundancy down the road, and because there's no real advantage to splitting them up. Empty (NULL) columns take up very little space and have no noticeable effect on speed; it's perfectly fine (as long as they have a proper function, which is the case here). It's the same as adding a 'notes' column to the product table when not every product has or needs a note.
Alec
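
A sketch of the combined-table idea from the last comment, assuming `sqlite3` in place of MySQL. The column names `reviewTitle` and `reviewText` are hypothetical. Vote-only rows leave the review columns `NULL`; the average uses every row, while the review list filters on `IS NOT NULL`.

```python
import sqlite3

# SQLite stands in for MySQL; reviewTitle/reviewText are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products_ratings (
        id          INTEGER PRIMARY KEY,
        productId   INTEGER NOT NULL,
        rating      INTEGER NOT NULL,
        reviewTitle TEXT,  -- NULL when the user only voted
        reviewText  TEXT   -- NULL when the user only voted
    );
    INSERT INTO products_ratings (productId, rating, reviewTitle, reviewText)
    VALUES
        (1234, 5, 'Great', 'Works as advertised.'),
        (1234, 4, NULL, NULL),
        (1234, 2, NULL, NULL);
    """
)

# Every row counts towards the average, with or without review text.
average = conn.execute(
    "SELECT AVG(rating) FROM products_ratings WHERE productId = ?", (1234,)
).fetchone()[0]

# Only rows that carry text are shown as reviews.
reviews = conn.execute(
    "SELECT rating, reviewTitle, reviewText FROM products_ratings "
    "WHERE productId = ? AND reviewText IS NOT NULL ORDER BY id",
    (1234,),
).fetchall()
```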