views:

468

answers:

2

So here's my challenge. I have a spreadsheet that looks like this:

prod_id | pack  | value | durable | feat | ease | grade  | # of ratings
1         75      85      99        90     90     88       1
2         90      95      81        86     87     88       9
3         87      86      80        85     82     84       37
4         92      80      68        67     45     70       5
5         93      81      94        93     90     90       4
6         93      70      60        60     70     70       1

Each product has individual grade criteria (packaging - ease of use), an overall average grade, and number of ratings the product received.

The entire data set I have places 68% of the products within the 80-89 grade range. I need to have the grades recalculated to take into account the number of ratings for the product, so products that fall far below the total average number of ratings are ranked lower (and receive a lower grade). Basically a product with a grade of 84 and 100 ratings should rank higher than a product with a grade of 95 with only 5 ratings.

I hope this makes sense, thanks for any help in advance!

+1  A: 

I can't tell exactly without a calculator, but it looks like

Grade = AVG(pack, value, durable, feat, ease)

If that's the case, then you just have to define "fall far below the total average number of ratings". I'll weight against the standard deviation from the mean - which may or may not be a decent algorithm (I'm not statistician). But, this means any rating that's exactly the mean = 1, and you get +/- from there.

WeightedGrade = Grade * ABS((Rating - AVG(H:H)) / STDEV(H:H))
Mark Brackett
A: 

What you need is a meaningful algorithm for weighting. You can choose anything that makes sense to you, but the first thing to try, based on your requirements, is to multiply the raw grade by a weighting factor. Calculate that as the ratio of the # of ratings divided by the total # of ratings gives this for an answer:

prod id raw grade # ratings weight         weighted grade
1         88       1           0.01754386   1.543859649 
2         88       9           0.157894737   13.89473684
3         84       37       0.649122807     54.52631579
4         70       5           0.087719298   6.140350877
5         90       4           0.070175439   6.315789474
6         70       1           0.01754386   1.228070175
                    57

Not sure if this makes sense for your problem, but it does meet your requirements. Maybe you can normalize the weighted grades so prod id # 3 is 100 and scale the rest from that.

Have a look at "Collective Intelligence" for some other ideas.

duffymo