ansaurus

Question

How should I order these "helpful" scores?

Answer 1

+4 A:

This question is probably better asked on http://stats.stackexchange.com .

I guess you still want to order by increasing 'helpfulness'.

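If you want to know how precise a given number is, the simplest way is to use the standard deviation (the square root of the variance) of the binomial distribution, with n equal to the total number of responses and p the fraction of responses which were 'helpful': sqrt(n p (1 - p)) for the number of 'helpful' responses, or sqrt(p (1 - p) / n) for the fraction itself.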

Andre Holzner 2010-09-20 06:39:52

+1 for stats.stackexchange.com

Thilo 2010-09-20 06:43:41

Answer 2

+1 A:

A very simple solution would be to ignore everything with fewer than a cut-off number of votes, and then sort by percentage.

For example (require at least five votes)

   1.  99.9% (1000 votes)
   2.  74.8%  (400 votes)
   3-5.  waiting for five votes
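One way this could look in C, as a sketch only: a qsort comparator that puts posts below the cut-off last and sorts the rest by descending percentage. The Post struct, MIN_VOTES and by_percentage are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record: k 'helpful' votes out of n total votes. */
typedef struct { int n; int k; } Post;

#define MIN_VOTES 5 /* cut-off; five matches the example above */

/* qsort comparator: posts below the cut-off go last, the rest sort by
   descending fraction k/n (compared by cross-multiplication so that no
   floating-point rounding is involved). */
int by_percentage(const void *a, const void *b) {
  assert(a != NULL && b != NULL);
  const Post *x = a, *y = b;
  int x_ok = x->n >= MIN_VOTES, y_ok = y->n >= MIN_VOTES;
  if (x_ok != y_ok) return y_ok - x_ok;   /* qualifying posts first */
  if (!x_ok) return 0;                    /* both still waiting for votes */
  long long lhs = (long long)x->k * y->n; /* x->k / x->n versus ... */
  long long rhs = (long long)y->k * x->n; /* ... y->k / y->n */
  return (lhs < rhs) - (lhs > rhs);       /* larger fraction first */
}
```

Calling qsort(posts, count, sizeof posts[0], by_percentage) on an array of Post then yields the ordering shown in the example list.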

Thilo 2010-09-20 06:42:43

Answer 3

+3 A:

For each post, generate bounds on how helpful you expect it to be. I prefer to use the Agresti-Coull interval. In C:

#include <math.h>

double AgrestiCoullLower(int n, int k) {
  // kappa is the standard normal quantile for the interval. In general,
  // kappa = ierfc(alpha)*sqrt(2) for a two-sided tail probability alpha,
  // where ierfc is the inverse complementary error function.
  // 2.24140273 = ierfc(0.025)*sqrt(2), a 97.5% interval; use 1.95996398 for 95%.
  double kappa = 2.24140273;
  double kest = k + kappa * kappa / 2;
  double nest = n + kappa * kappa;
  double pest = kest / nest;
  double radius = kappa * sqrt(pest * (1 - pest) / nest);
  return fmax(0, pest - radius); // Lower bound
  // Upper bound is fmin(1, pest + radius)
}

Then take the lower end of the estimate and sort on this. With this kappa, 2/2 gets (by Agresti-Coull) the 'helpfulness' range 23.7% to 100%, so it sorts below 999/1000, which has range 99.2% to 100% (since .237 < .992).

Charles 2010-09-20 15:28:47

For ties (especially those at 0), I suggest breaking in favor of largest number of upvotes, then smallest number of downvotes.
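A possible comparator for this tie-breaking rule, sketched in C under the assumption that each post already carries its precomputed lower bound. The Scored struct and by_score_then_votes are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record: score is the precomputed lower bound of the interval. */
typedef struct { double score; int up; int down; } Scored;

/* qsort comparator: higher score first; on a tie, more upvotes first;
   on a further tie, fewer downvotes first. */
int by_score_then_votes(const void *a, const void *b) {
  assert(a != NULL && b != NULL);
  const Scored *x = a, *y = b;
  if (x->score != y->score)
    return (x->score < y->score) - (x->score > y->score);
  if (x->up != y->up)
    return (x->up < y->up) - (x->up > y->up);
  return (x->down > y->down) - (x->down < y->down);
}
```

Ties at 0 arise because the lower bound is clipped at zero, so posts with few or no upvotes can share the same score.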

Charles 2010-09-20 23:51:40

wow, Charles, this is hard core. very impressive. i'll run it on my examples and see how they sort (after i spend a few minutes educating myself on Agresti-Coull at wikipedia!)

mitchf 2010-09-21 00:25:24

Let me know how it goes. I can give more information and/or references as needed.

Charles 2010-09-21 03:16:34

+1 for this elegant solution (ordering by the lower end of the confidence interval). Just out of curiosity: how does the size of the interval behave when the number of upvotes is 0 or equals the total number of votes? (The plain binomial variance goes to zero in these cases.)

Andre Holzner 2010-09-21 05:54:15

@Andre: Asymptotically, it decreases like 1/n, or rather C/n where C depends on the chosen confidence.

Charles 2010-09-21 16:09:04
