I need to know whether a number, compared to a set of numbers, is more than 1 stddev away from the mean, etc.
Here is code to calculate the standard deviation of a set of numbers. Once you have it, mean - stddev and mean + stddev give you a lower and an upper bound to compare the number against.
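To make the comparison concrete, here is a minimal sketch of the bounds check. The class name `DeviationCheck`, the method name `IsOutside`, and the `sigmas` parameter are made up for illustration; they are not from any library.

```csharp
using System;

public static class DeviationCheck
{
    // Returns true when value lies strictly more than `sigmas` standard
    // deviations away from the mean. All names here are hypothetical.
    public static bool IsOutside(double value, double mean, double stdDev, double sigmas = 1.0)
    {
        double lower = mean - sigmas * stdDev;
        double upper = mean + sigmas * stdDev;
        return value < lower || value > upper;
    }
}
```

With a mean of 10 and a stddev of 3, the 1-stddev band is 7 to 13, so `IsOutside(15, 10, 3)` is true and `IsOutside(12, 10, 3)` is false. A value exactly on a bound counts as inside in this sketch; flip the comparisons if you want the opposite.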
You can avoid making two passes over the data by accumulating the mean and the mean-square:
    cnt = 0
    mean = 0
    meansqr = 0
    loop over array
        cnt++
        mean += value
        meansqr += value*value
    mean /= cnt
    meansqr /= cnt
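and forming
    sigma = sqrt(meansqr - mean^2)
A factor of cnt/(cnt-1) applied to the variance (the quantity under the square root) is often appropriate as well, namely when the data is a sample rather than the whole population.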
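The pseudocode above could be written in C# roughly as follows. This is only a sketch: the class name `SinglePassStats`, the `sample` flag, returning 0.0 when there are too few values, and clamping a slightly negative variance to zero are all my assumptions, not part of the original recipe.

```csharp
using System;
using System.Collections.Generic;

public static class SinglePassStats
{
    // Single pass: accumulate the count, the sum, and the sum of squares,
    // then form sqrt(meansqr - mean^2). Names are illustrative only.
    public static double StandardDeviation(IEnumerable<double> values, bool sample = false)
    {
        long cnt = 0;
        double sum = 0.0;
        double sumSqr = 0.0;
        foreach (double value in values)
        {
            cnt++;
            sum += value;
            sumSqr += value * value;
        }

        // Assumption: report 0.0 when there is not enough data.
        if (cnt == 0 || (sample && cnt < 2)) return 0.0;

        double mean = sum / cnt;
        double variance = sumSqr / cnt - mean * mean;

        // Optional cnt/(cnt-1) factor for a sample estimate.
        if (sample) variance *= (double)cnt / (cnt - 1);

        // Rounding can push a tiny variance just below zero; clamp it.
        return Math.Sqrt(Math.Max(variance, 0.0));
    }
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the mean-square is 29, so the population result is sqrt(29 - 25) = 2.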
BTW, the first pass over the data in Demi's and McWafflestix's answers is hidden in the calls to Average. That kind of thing is certainly trivial on a small list, but if the list exceeds the size of the cache, or even the working set, this gets to be a big deal.
Code snippet:
    public static double StandardDeviation(List<double> valueList)
    {
        // Fewer than two values: no spread to measure.
        if (valueList.Count < 2) return 0.0;
        double sumOfSquares = 0.0;
        double average = valueList.Average(); // LINQ: needs "using System.Linq;" (.NET 3.5+)
        foreach (double value in valueList)
        {
            sumOfSquares += Math.Pow((value - average), 2);
        }
        // Dividing by Count - 1 gives the sample standard deviation.
        return Math.Sqrt(sumOfSquares / (valueList.Count - 1));
    }
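While the sum-of-squares algorithm works fine most of the time, it can cause big trouble if you are dealing with numbers that are very large compared to their spread: the mean-square and the squared mean are then two huge, nearly equal quantities, and subtracting them loses most of the precision. You may even end up with a negative variance...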
Plus, never, ever, ever compute a^2 as pow(a,2); a * a is almost certainly faster.
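To see the large-number problem for yourself, you could compare the sum-of-squares formula against a plain two-pass computation on the same data. The sketch below is only illustrative; the class and method names are made up, and both methods return the population variance.

```csharp
using System;

public static class CancellationDemo
{
    // Sum-of-squares formula: mean of squares minus square of the mean.
    // Prone to cancellation when the values are large and close together.
    public static double NaiveVariance(double[] values)
    {
        double sum = 0.0;
        double sumSqr = 0.0;
        foreach (double value in values)
        {
            sum += value;
            sumSqr += value * value;
        }
        double mean = sum / values.Length;
        return sumSqr / values.Length - mean * mean;
    }

    // Two passes: compute the mean first, then average the squared deviations.
    public static double TwoPassVariance(double[] values)
    {
        double sum = 0.0;
        foreach (double value in values) sum += value;
        double mean = sum / values.Length;

        double sumOfSquares = 0.0;
        foreach (double value in values)
        {
            double d = value - mean;
            sumOfSquares += d * d;
        }
        return sumOfSquares / values.Length;
    }
}
```

For 4, 7, 13, 16 both methods give 22.5. Shift every value by 1e9 and the true variance is still 22.5, which the two-pass version reproduces, but the squares are now around 1e18, beyond what a double can represent exactly, so the naive result can no longer be trusted and may come out badly off or negative.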
By far the best way of computing a standard deviation is Welford's method, which makes a single pass and avoids the subtraction problem above. My C# is very rusty, but it could look something like:
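    public static double StandardDeviation(List<double> valueList)
    {
        double M = 0.0;    // running mean
        double S = 0.0;    // running sum of squared deviations from the mean
        double tmpM = 0.0; // mean before the current value is folded in
        int k = 1;
        foreach (double value in valueList)
        {
            tmpM = M;
            M += (value - tmpM) / k;
            S += (value - tmpM) * (value - M);
            k++;
        }
        // k - 1 is the number of values, so this is the population standard
        // deviation; divide by (k - 2) instead for the sample estimate.
        return Math.Sqrt(S / (k - 1));
    }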
EDIT: I've updated the code according to Jason's remarks...