How can I calculate the average of a set of data while smoothing over any points that are outside the "norm"? It's been a while since I had to do any real math, but I'm sure I learned this somewhere...

Let's say I have 12 days of sales data on one item: 2,2,2,50,10,15,9,6,2,0,2,1

I would like to calculate the average sales per day without allowing the 4th day (50) to screw up the average too much. Log, Percentile, something like that I think...

+5  A: 

It sounds to me that you're looking for a moving average.
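
As a rough illustration, here is a minimal sketch of a simple moving average in Python, applied to the sales figures from the question. The function name `moving_average` and the window size of 3 are assumptions chosen for the example, not something the question specifies.

```python
def moving_average(values, window):
    """Return the simple moving average over each full window of `window` points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    # One average per position where a full window fits.
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


sales = [2, 2, 2, 50, 10, 15, 9, 6, 2, 0, 2, 1]
smoothed = moving_average(sales, window=3)  # window=3 is an arbitrary choice
print(smoothed)
```

Note that a moving average spreads the spike on day 4 across the neighbouring windows rather than discarding it, so a larger window gives a flatter series.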

Matt Kellogg
Ah! Perfect, I knew it had a name.
Charlie Brown
A: 

You can also filter by thresholding at some multiple of the standard deviation. This would filter out results that were much farther than expected from the mean (average).

Standard deviation is simply sqrt(sum((your_values - average_value)^2) / number_of_values).
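
A minimal sketch of this filtering in Python, using the population standard deviation (the square root of the mean squared deviation). The helper names and the cutoff of 2 standard deviations (`k=2.0`) are assumptions for illustration; the right multiple depends on your data.

```python
import math


def mean(values):
    return sum(values) / len(values)


def std_dev(values):
    """Population standard deviation: sqrt of the mean squared deviation."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def mean_without_outliers(values, k=2.0):
    """Average only the values within k standard deviations of the mean."""
    m = mean(values)
    s = std_dev(values)
    kept = [v for v in values if abs(v - m) <= k * s]
    return mean(kept)


sales = [2, 2, 2, 50, 10, 15, 9, 6, 2, 0, 2, 1]
print(mean(sales))                   # plain average, pulled up by the 50
print(mean_without_outliers(sales))  # average with the 50 filtered out
```

With the question's data, the only value more than 2 standard deviations from the mean is the 50, so the filtered average is taken over the remaining 11 days.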

edit: You can also look at weighting each value by its deviation from the mean. Values that are far from the mean can be weighted as 1 / exp(deviation), using the absolute deviation, so they contribute much less the farther from the mean they are.
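
One possible way to sketch this weighting in Python is shown below. Measuring the deviation in units of the standard deviation is an assumption added here so the weights do not depend on the scale of the data; the function name `deviation_weighted_mean` is likewise made up for the example.

```python
import math


def deviation_weighted_mean(values):
    """Weighted mean where each weight is 1 / exp(|deviation| / std_dev)."""
    m = sum(values) / len(values)
    s = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    if s == 0:
        # All values are identical, so there is nothing to down-weight.
        return m
    # Assumption: deviation is scaled by the standard deviation.
    weights = [math.exp(-abs(v - m) / s) for v in values]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


sales = [2, 2, 2, 50, 10, 15, 9, 6, 2, 0, 2, 1]
print(deviation_weighted_mean(sales))
```

Unlike a hard threshold, no value is dropped outright; the 50 still counts, but with a much smaller weight than the days close to the mean.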

Ron Warholic
+1  A: 

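You'll want to use something like the IQR (interquartile range). Basically you break the data into quartiles and take the range between the first and third quartiles (Q3 - Q1), which covers the middle half of the data. You can then get the central tendency from the values that fall inside or near that range, ignoring points that lie far outside it.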
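
A minimal sketch of an IQR-based filter in Python, using `statistics.quantiles` from the standard library (Python 3.8+). The 1.5 x IQR fences are a common convention assumed here, not something stated in the answer, and the function name `iqr_filtered_mean` is made up for the example.

```python
import statistics


def iqr_filtered_mean(values, k=1.5):
    """Average the values lying within k * IQR of the first and third quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr  # assumed fence multiplier k=1.5
    kept = [v for v in values if lower <= v <= upper]
    return sum(kept) / len(kept)


sales = [2, 2, 2, 50, 10, 15, 9, 6, 2, 0, 2, 1]
print(iqr_filtered_mean(sales))
```

For the question's data the 50 falls above the upper fence and is excluded, while the 15 stays in.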

nlucaroni