Hi Folks,
numpy.average() has a weights option, but numpy.std() does not. Do folks have suggestions for a workaround?
Thanks! /YGA
There doesn't appear to be such a function in numpy/scipy yet, but there is a ticket proposing this functionality. Attached to it you will find Statistics.py, which implements weighted standard deviations.
Using the unbiased weighted sample variance formula from Wikipedia, the following code should do the trick:
def wstd(x, w):
    t = w.sum()
    return (((w*x**2).sum()*t - (w*x).sum()**2) / (t**2 - (w**2).sum()))**.5
if I didn't make a mistake in implementing it. w is the weights; x is the data. However, you might want to add a check to make sure that the denominator is not 0.
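In case it helps, here is a rough sketch of what that denominator check could look like. The name wstd_checked is made up for illustration, and I am assuming non-negative weights; under that assumption the denominator is zero exactly when at most one weight is nonzero. The clamp before the square root is my own addition to absorb rounding error, not part of the formula.

```python
import numpy as np

def wstd_checked(x, w):
    """Weighted standard deviation (same formula as wstd) with a denominator guard.

    Hypothetical helper: assumes non-negative weights.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    t = w.sum()
    denom = t**2 - (w**2).sum()
    # With non-negative weights, denom is 0 when at most one weight is nonzero.
    if denom <= 0:
        raise ValueError("need at least two nonzero weights")
    var = ((w * x**2).sum() * t - (w * x).sum()**2) / denom
    # Clamp tiny negative values caused by floating-point rounding.
    return np.sqrt(max(var, 0.0))
```

With equal weights this reduces to the usual sample standard deviation (the ddof=1 form), which makes for an easy sanity check.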
How about the following short "manual calculation"?
import math
import numpy

def weighted_avg_and_std(values, weights):
    """
    Returns the weighted average and standard deviation.
    values, weights -- 1-D Numpy ndarrays with the same shape.
    """
    average = numpy.average(values, weights=weights)
    # Weighted mean of the squared deviations; fast and numerically precise
    variance = numpy.dot(weights, (values-average)**2)/weights.sum()
    return (average, math.sqrt(variance))
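A quick way to convince yourself of what this computes: with integer weights, it should agree with simply repeating each value that many times and calling numpy.std() on the result. The sketch below repeats the function so it runs on its own; the sample values and weights are made up for illustration.

```python
import math
import numpy

def weighted_avg_and_std(values, weights):
    average = numpy.average(values, weights=weights)
    variance = numpy.dot(weights, (values - average)**2) / weights.sum()
    return (average, math.sqrt(variance))

# Made-up sample data: the value 2.0 counts twice.
values = numpy.array([1.0, 2.0, 3.0])
weights = numpy.array([1.0, 2.0, 1.0])
avg, std = weighted_avg_and_std(values, weights)

# Integer weights behave like repeat counts.
expanded = numpy.repeat(values, [1, 2, 1])  # array([1., 2., 2., 3.])
print(avg, expanded.mean())
print(std, expanded.std())  # numpy.std() defaults to ddof=0
```

Note that this is the "population" form (dividing by the sum of the weights), so it will generally give a smaller number than wstd() above, which applies a bias correction in the denominator.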