views:

75

answers:

3

I have a table in which I store ratings from users for each product.. The table consists of the following fields productid, userid, rating(out of 5)...This table might contain a million rows in the future....
So in order to select the top 5 products I am using the following query::.

SELECT productid, avg( rating ) as avg_rating
from product_ratingstblx245v
GROUP BY productid
ORDER BY avg( rating ) DESC
LIMIT 5

My Question is since I will be showing this result on few pages of my site, would it be better to maintain a seperate table for the average ratings with fields productid,avgrating,totalvotes???

+3  A: 

You don't need the answer to that question yet. You can start with a VIEW, that is the result of executing the above query. If, after performing load tests (e.g. with JMeter), you see that your site runs slow indeed, you can replace the VIEW with a TEMPORARY TABLE (stored in memory). Since the view and the temporary table will look the same from the outside, you will not have to change your business logic.

lbp
A: 

Personally, I do not like the idea of running totals like this but if it does become necessary then I would not store the average, I would store the TOTAL VOTES and TOTAL RATING. That way it's a very simple UPDATE query (add 1 to TOTAL VOTES and add rating to TOTAL RATING). You can then calculate the average on the fly in minimal time.

As for how you might handle this, I would use a trigger as someone already suggested. But only after trying the VIEW thing that someone else suggested.

Joe Philllips
Oh yeah I was thinking of that only because recalculating the average after every addition would be foolish.... I was in a hurry so I just wrote that up...my bad
halocursed
+1  A: 

Tbh, if MySql wasn't able to handle queries on a simple table schema such as yours above for over a million records in (sub)millisecond speeds, I really would wonder why companies use it for LOB applications.

As it is, I'm a MS Sql developer, so I don;t really know that much about MySql's ability. However assuming that its Database Engine is as good as Sql Servers (I've heard good things about My Sql), you don;t need to worry about your performance issue. If you do want to tweak it, then why not cache the results for 10 minutes (or longer) at your application layer? Triggers are generally pure (albeit sometimes neccessary) evil. Sql Servers are designed specfically for the type of query you with to execute, trust in the Sql.

Jaimal Chohan
Thanks I'll take your word for it since u're a MS Sql dev. and go ahead with the above mentioned query....Thanks a Lot
halocursed
+1 for suggestion of caching the results! For any brand of RDBMS you use, the best query performance improvement is to avoid running the query when you don't have to.
Bill Karwin