I'm trying to wrap my head around the proper design to calculate an average for multiple items, in my case beers. Users of the website can review various beers and all beers are given a rating (avg of all reviews for that beer) based on those reviews. Each beer review has 5 criteria that it's rated on, and those criteria are weighted and then calculated into an overall rating for that particular review (by that user).
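As a rough sketch of what I mean by the weighted per-review rating, in plain Python: the weights and the name `overall_rating` below are placeholders I made up for illustration, not settled values.

```python
from decimal import Decimal

# Hypothetical weights for the five criteria; chosen so they sum to 1.
WEIGHTS = [Decimal("0.30"), Decimal("0.25"), Decimal("0.20"),
           Decimal("0.15"), Decimal("0.10")]


def overall_rating(scores, weights=WEIGHTS):
    """Return the weighted average of one review's criteria scores."""
    if len(scores) != len(weights):
        raise ValueError("expected one score per weight")
    total = sum(Decimal(score) * weight for score, weight in zip(scores, weights))
    # Round to two decimal places so the value fits a fixed-precision column.
    return total.quantize(Decimal("0.01"))
```

The result of this function is what would be stored in `overallrating` for each review.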

Here are some of the relevant models as they currently stand. My current thinking is that all beer reviews will be in their own table like you see below.

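from django.contrib.auth.models import User
from django.db import models

class Beer(models.Model):
    name = models.CharField(max_length=200)
    brewer = models.ForeignKey(Brewery)
    style = models.ForeignKey(Style)
    .....

class Beerrating(models.Model):
    thebeer = models.ForeignKey(Beer)
    theuser = models.ForeignKey(User)
    beerstyle = models.ForeignKey(Style)
    criteria1 = models.IntegerField()
    ...
    criteria5 = models.IntegerField()
    # DecimalField requires max_digits and decimal_places; these values are placeholders
    overallrating = models.DecimalField(max_digits=4, decimal_places=2)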

My real question is how do I calculate the overall beer average based on all the reviews for that beer? Do I keep a running tally in the Beer model (e.g. # reviews and total points; which gets updated after every review) or do I just calculate the avg on the fly? Is my current db design way off the mark?
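For the running-tally option, I picture something like the sketch below. `RunningAverage`, `review_count`, and `total_points` are hypothetical names; in the real app they would presumably be two extra fields on Beer, and editing or deleting a review would have to adjust the totals as well.

```python
from decimal import Decimal


class RunningAverage:
    """Keeps a review count and a points total so the mean is cheap to read."""

    def __init__(self):
        self.review_count = 0
        self.total_points = Decimal("0")

    def add_review(self, overall_rating):
        # Convert via str so float inputs do not carry binary rounding noise.
        self.review_count += 1
        self.total_points += Decimal(str(overall_rating))

    @property
    def average(self):
        # No reviews yet: there is no meaningful average.
        if self.review_count == 0:
            return None
        return self.total_points / self.review_count
```

The trade-off is that reads are cheap but every write has to keep the tally in sync, whereas calculating on the fly always reflects the current reviews.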

I'll also be calculating a top beer list (100 highest rated beers), so that's another calculation I'll be doing with the ratings.

Any help is much appreciated. This is my first web app, so please forgive my noob-ness. I haven't chosen a DB yet, so if MySQL or PostgreSQL is better in some way than the other, please share your preference and, if you have time, why. I'll be choosing between those two DBs. I'm also using Django. Thank you.

+1  A: 

As long as you're using Django 1.1 or later, you can use the aggregation features to calculate the average whenever you need it.

Something like:

from django.db.models import Avg
# The reverse lookup from Beer to Beerrating defaults to the lowercased model name.
beers_with_ratings = Beer.objects.all().annotate(avg_rating=Avg('beerrating__overallrating'))

Now each Beer object will have an avg_rating attribute, which is the average of the overallrating fields of its associated Beerrating rows.

Then to get the top 100:

# The leading '-' sorts descending, so the highest-rated beers come first.
beers_with_ratings.order_by('-avg_rating')[:100]
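If it helps to see what that kind of query is meant to compute, here is the same logic in plain Python over a list of (beer, rating) pairs. This is only an illustration: the `top_beers` helper and the sample data are made up, and in the real app the database does this work.

```python
from collections import defaultdict


def top_beers(ratings, limit=100):
    """Average the ratings per beer and return the highest averages first.

    ratings: iterable of (beer_name, overall_rating) pairs.
    """
    by_beer = defaultdict(list)
    for beer, rating in ratings:
        by_beer[beer].append(rating)
    averages = {beer: sum(vals) / len(vals) for beer, vals in by_beer.items()}
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Made-up sample data for illustration only.
sample = [("Stout", 4.0), ("Stout", 5.0), ("Lager", 3.0), ("IPA", 4.0)]
```

Calling `top_beers(sample, limit=2)` groups the two Stout reviews, averages them, and ranks Stout ahead of IPA.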

As for the database choice, either is perfectly fine for this sort of thing. Aggregation is a basic feature of relational databases, and both PostgreSQL and MySQL can do it with no problem.

Daniel Roseman
Thanks for the information! I knew 1.1 recently came out but I hadn't played with it yet. Looks like I have a good reason now.
kfordham281
A: 

You might want to have a look at the django-ratings module. It's very nicely structured and provides a powerful ratings system without being overly complicated (although if this is your first web app it might look slightly intimidating). You won't have to deal with calculating averages etc. directly.

Edit: To be a bit more helpful

If you use django-ratings, your models.py would probably look something like this:

from djangoratings.fields import RatingField

class Beer(models.Model):
    name = models.CharField(max_length=200)
    brewer = models.ForeignKey(Brewery)
    style = models.ForeignKey(Style)
    .....
    criteria1 = RatingField(range=5) # possible rating values, 1-5
    ...
    criteria5 = RatingField(range=5)

There is no need for the Beerrating model. Instead, all the rating information will be stored in the Vote and Score models of django-ratings.

Béres Botond
Are there any advantages to using this pluggable app vs. building it out "manually"?
kfordham281