What initially seemed like a problem with an easy solution has proven to be quite an interesting challenge.
I have a class that internally maintains a fixed-size, thread-safe collection (by using lock on all insertion and removal operations) and provides various statistical values via its properties.
One example:
public double StandardDeviation {
    get {
        return Math.Sqrt((Sum2 - ((Sum * Sum) / Count)) / Count);
    }
}
Now, I've tested this computation thoroughly, running 10,000 values through the collection and checking the standard deviation on each update. It works fine... in a single-threaded scenario.
A problem arises in the multi-threaded context of our development and production environments, however. This number occasionally comes back as NaN before quickly changing back to a real number. Naturally this must be due to a negative value passed to Math.Sqrt. I can only imagine this happens when, mid-calculation, one of the values used in the calculation is updated by a separate thread.
I could cache the values first:
int C = this.Count;
double S = this.Sum;
double S2 = this.Sum2;
return Math.Sqrt((S2 - (S * S) / C) / C);
But then Sum2 might still be updated, for example, after S = this.Sum has been set, compromising the calculation once again.
I could put a lock around all points in the code where these values are updated:
protected void ItemAdded(double item) {
    // ...
    lock (this.CalculationLock) {
        this.Sum += item;
        this.Sum2 += (item * item);
    }
}
I thought that locking on this same object when calculating StandardDeviation would, finally, fix the problem. It didn't. The value is still coming in as NaN on a fleeting, infrequent basis.
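For concreteness, here is a minimal sketch of what that fully locked version looks like. The class name RunningStats is hypothetical, and I am assuming Count is updated under the same lock as Sum and Sum2; the real class also handles removal from the fixed-size collection, which is omitted here.

```csharp
using System;

// Hypothetical stand-in for the real class; only the members named above are shown.
public class RunningStats
{
    private readonly object CalculationLock = new object();
    private int Count;
    private double Sum;
    private double Sum2;

    public void ItemAdded(double item)
    {
        // All three shared values change together under one lock (assumption).
        lock (CalculationLock)
        {
            Count++;
            Sum += item;
            Sum2 += item * item;
        }
    }

    public double StandardDeviation
    {
        get
        {
            int c;
            double s, s2;
            // Take a consistent snapshot under the same lock the writers use.
            lock (CalculationLock)
            {
                c = Count;
                s = Sum;
                s2 = Sum2;
            }
            // Compute outside the lock from the local copies only.
            return Math.Sqrt((s2 - (s * s) / c) / c);
        }
    }
}
```

With this shape, no reader can observe Sum updated without Sum2, so a torn read should not be possible.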
Frankly, even if the above solution had worked, it was very messy and did not seem very manageable to me. Is there a standard and/or more straightforward way of achieving thread-safety in calculated values such as this?
EDIT: Turns out here we have an example of a problem where it seemed at first like there was only one possible explanation when, after all, the issue was with something else entirely.
I had been meticulous about implementing thread safety in every way I could without making a huge performance sacrifice if at all possible -- locking on reads and writes to shared values (e.g., Sum and Count), caching values locally, and using the same lock object for modifying the collection and updating the shared values... honestly, it all seemed like overkill.
Nothing worked; that nefarious NaN kept popping up. So I decided to print all the values in the collection to the console whenever StandardDeviation returned NaN...
Immediately I noticed that it always seemed to be happening when all the values in the collection were the same.
It's official: I got burned by floating point arithmetic. (All of the values were the same, so the radicand in StandardDeviation -- i.e., the number whose square root is being taken -- should have been exactly zero, but rounding error made it evaluate to some extremely small negative number.)
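One simple way to guard against this, sketched below, is to clamp the radicand at zero before taking the square root. The StdDevSketch class and FromSums method are hypothetical names used only for illustration; the formula itself is the one from StandardDeviation above.

```csharp
using System;

public static class StdDevSketch
{
    // Population standard deviation from running sums (hypothetical helper).
    // sum is the total of the values, sum2 the total of their squares.
    public static double FromSums(double sum, double sum2, int count)
    {
        double radicand = (sum2 - (sum * sum) / count) / count;

        // Rounding error can leave a tiny negative number where the exact
        // result would be zero; clamp so Math.Sqrt never sees a negative.
        return Math.Sqrt(Math.Max(0.0, radicand));
    }
}
```

This only hides the symptom: the sum-of-squares formula still subtracts two nearly equal quantities when the values are close together. An online method such as Welford's algorithm avoids that subtraction, though I have not tried adapting it to a fixed-size collection with removals.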