Hi Folks,

numpy.average() has a weights option, but numpy.std() does not. Do folks have suggestions for a workaround?

Thanks! /YGA

+3  A: 

There doesn't appear to be such a function in numpy/scipy yet, but there is a ticket proposing this functionality. Included there you will find Statistics.py, which implements weighted standard deviations.

unutbu
+2  A: 

Using the formula from Wikipedia [formula image not preserved], the following code should do the trick:

def wstd(x, w):
    # x: data, w: weights (numpy arrays of the same shape)
    t = w.sum()  # total weight
    return (((w*x**2).sum()*t - (w*x).sum()**2) / (t**2 - (w**2).sum()))**.5

if I didn't make a mistake in implementing it. w is the weights; x is the data. However, you might want to add a check to make sure that the denominator is not 0.
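The denominator check mentioned above could look roughly like the sketch below. The name wstd_checked and the choice to raise ValueError are assumptions for illustration; returning nan would be another reasonable choice.

```python
import numpy as np

def wstd_checked(x, w):
    """Weighted standard deviation as in wstd, with a guard on the denominator.

    wstd_checked is a hypothetical name; x is the data, w the weights.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    t = w.sum()
    denominator = t**2 - (w**2).sum()
    if denominator == 0:
        # Happens, for example, when only one weight is non-zero.
        raise ValueError("denominator is zero; the weighted std is undefined")
    numerator = (w * x**2).sum() * t - (w * x).sum()**2
    return (numerator / denominator) ** 0.5
```

With equal weights this reduces to the usual sample standard deviation, so wstd_checked([1, 2, 3], [1, 1, 1]) gives 1.0, while weights such as [1, 0, 0] trigger the error.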

Justin Peel
A warning: the quoted formula is not numerically precise. The two terms of each subtraction can be large and nearly equal, so their difference loses significant digits, and the large intermediate values can also overflow.
EOL
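To illustrate the warning, here is a sketch of the same estimator rearranged so that the weighted mean is subtracted before squaring. It relies on the identity t*sum(w*x**2) - sum(w*x)**2 = t*sum(w*(x - mean)**2), where mean = sum(w*x)/t. The name wstd_two_pass is made up for this example.

```python
import numpy as np

def wstd_two_pass(x, w):
    """Same estimator as wstd, computed from deviations about the weighted mean.

    wstd_two_pass is a hypothetical name used only for this sketch.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    t = w.sum()
    mean = (w * x).sum() / t
    # Subtracting the mean first keeps the squared terms small.
    sum_sq_dev = (w * (x - mean) ** 2).sum()
    return (t * sum_sq_dev / (t**2 - (w**2).sum())) ** 0.5
```

For data such as 1e9, 1e9 + 1, 1e9 + 2 with equal weights, this version returns 1.0. The one-pass formula cannot get there in float64: it subtracts two terms near 9e18, where adjacent doubles are 1024 apart, while the true difference is 6.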
+1  A: 

How about the following short "manual calculation"?

import math

import numpy

def weighted_avg_and_std(values, weights):
    """
    Returns the weighted average and standard deviation.

    values, weights -- Numpy ndarrays with the same shape.
    """
    average = numpy.average(values, weights=weights)
    # Weighted mean of the squared deviations; fast and numerically precise
    variance = numpy.average((values-average)**2, weights=weights)
    return (average, math.sqrt(variance))
EOL
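A quick sanity check of this approach, as a sketch: with integer weights, the result should match the plain statistics of the data with each value repeated that many times. The function is restated here (using numpy.average for the variance too) so the snippet runs on its own; the sample values and weights are made up.

```python
import math
import numpy as np

def weighted_avg_and_std(values, weights):
    """Weighted average and (population) standard deviation."""
    average = np.average(values, weights=weights)
    variance = np.average((values - average) ** 2, weights=weights)
    return average, math.sqrt(variance)

values = np.array([1.0, 2.0, 3.0])
weights = np.array([1.0, 2.0, 1.0])
avg, std = weighted_avg_and_std(values, weights)

# Integer weights act like repeat counts, so the results should agree with
# the ordinary mean and ddof=0 standard deviation of the repeated data.
repeated = np.repeat(values, [1, 2, 1])
print(avg, repeated.mean())
print(std, repeated.std())
```

Note that this divides by the total weight, so with equal weights it matches numpy.std with ddof=0, whereas the wstd formula earlier in the thread matches ddof=1.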