Suppose I conduct a survey of 10 people, asking each to rate a movie from 0 to 4 stars. Allowable answers are 0, 1, 2, 3, and 4.

The mean is 2.0 stars.

How do I calculate the certainty (or uncertainty) about this 2.0 star rating? Ideally, I would like a number between 0 and 1, where 0 represents complete uncertainty and 1 represents complete certainty.

It seems clear that the case where the 10 people choose ( 2, 2, 2, 2, 2, 2, 2, 2, 2, 2 ) would be the most certain, while the case where the 10 people choose ( 0, 0, 0, 0, 0, 4, 4, 4, 4, 4 ) would be the least certain. ( 0, 1, 1, 2, 2, 2, 2, 3, 3, 4 ) would be somewhere in the middle.

+3  A: 

The function you're after here is the standard deviation.

The standard deviations of your three examples are 0 (meaning no deviation), 2.1 (large deviation) and 1.15 (in between).
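As a quick check, the figures above can be reproduced with Python's standard library. This is only a sketch: the variable names are made up for illustration, and it assumes the sample standard deviation (dividing by n - 1), which is what `statistics.stdev` computes.

```python
from statistics import stdev

# The three example surveys from the question (names are illustrative).
unanimous = [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
polarized = [0, 0, 0, 0, 0, 4, 4, 4, 4, 4]
mixed = [0, 1, 1, 2, 2, 2, 2, 3, 3, 4]

for name, ratings in [("unanimous", unanimous),
                      ("polarized", polarized),
                      ("mixed", mixed)]:
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    print(f"{name}: {stdev(ratings):.2f}")
```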

Andrew Shepherd
Jinx, you owe me a coke. :P
Russell Newquist
A: 

What you want is called the standard deviation.

Russell Newquist
+6  A: 

The standard deviation does not have the properties requested. It is zero when everyone chooses the same answer, and can be as great as sqrt(40/9) = 2.11 when there are five 0s and five 4s.

I suggest you use 1-stdev(x)/sqrt(40/9) which will take value 1 when everyone agrees, and value 0 when there are five 0s and five 4s.
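A minimal Python sketch of this rescaling might look like the following. The function name `certainty` and its `low`/`high` parameters are hypothetical, and the maximum is computed for a general sample size by assuming the worst case is an (as near as possible) even split between the two extreme ratings; for 10 responses on a 0 to 4 scale it reduces to sqrt(40/9).

```python
from math import sqrt
from statistics import stdev


def certainty(ratings, low=0, high=4):
    """Return 1 - stdev / max_stdev, so 1 means full agreement.

    Hypothetical helper: assumes every rating lies in [low, high] and
    that the largest sample standard deviation occurs when the answers
    are split as evenly as possible between the two extremes.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings")
    k = n // 2  # respondents at one extreme in the worst case
    # Sum of squared deviations for k answers at low and n - k at high.
    max_ss = (high - low) ** 2 * k * (n - k) / n
    max_sd = sqrt(max_ss / (n - 1))
    return 1 - stdev(ratings) / max_sd


print(certainty([2] * 10))                         # everyone agrees
print(certainty([0] * 5 + [4] * 5))                # evenly polarized
print(certainty([0, 1, 1, 2, 2, 2, 2, 3, 3, 4]))   # somewhere in between
```

Note that the score is linear in the standard deviation, so the "in between" example lands a little below 0.5 rather than exactly in the middle.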

Rob Hyndman
I was thinking something like this as well, but I didn't know if there was a statistical calculation that specifically addresses this type of question. I was expecting to google this and find something obvious relating to calculating the "degree of agreement" in responses to subjective survey questions, but have had no luck. Thanks for your response. I'll try your suggestion and see how it works.
Doug Knesek
A: 

You should consider whether or not the mean value is an appropriate statistic for this kind of information, i.e. is a movie rated 4 stars twice as good as one rated 2 stars?

You may be better served by using a percentile measure (such as the median) to represent the central tendency, and a percentile range (such as the IQR) to measure 'certainty'. As with the standard deviation in the answers above, certainty would be greatest when the range is 0, as you are really making a measurement of deviation from the central tendency.
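For illustration, the median and IQR can be computed with Python's standard library as sketched below. The helper name `median_and_iqr` is made up, and the quartiles use the default ("exclusive") method of `statistics.quantiles` (Python 3.8+); other quartile conventions can give slightly different values on a sample this small.

```python
from statistics import median, quantiles


def median_and_iqr(ratings):
    """Return (median, interquartile range) for a list of ratings.

    Hypothetical helper; quartiles follow the default method of
    statistics.quantiles, which is one of several common conventions.
    """
    q1, _, q3 = quantiles(ratings, n=4)
    return median(ratings), q3 - q1


for ratings in ([2] * 10,
                [0] * 5 + [4] * 5,
                [0, 1, 1, 2, 2, 2, 2, 3, 3, 4]):
    mid, iqr = median_and_iqr(ratings)
    print(f"median={mid}, IQR={iqr}")
```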

Incidentally, a survey of 10 people is too small to perform much in the way of meaningful statistical analysis.

James