ansaurus

Question

Floating point addition: loss-of-precision issues

Answer 1

+6 A:

IEEE 754 provides four rounding modes (toward -inf, toward +inf, toward 0, to nearest). Toward +inf is what you seem to want. There is no standard way to control the rounding mode in C90 or C++. C99 added the header <fenv.h>, which is also available as an extension in some C90 and C++ implementations. To conform to the C99 standard, you'd have to write something like:

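#include <assert.h>
#include <fenv.h>
#pragma STDC FENV_ACCESS ON

/* Save the current rounding mode, then round toward +inf. */
int old_round_mode = fegetround();
int set_round_ok = fesetround(FE_UPWARD);
assert(set_round_ok == 0);
...
/* Restore the previous rounding mode (reuse the variable, don't redeclare it). */
set_round_ok = fesetround(old_round_mode);
assert(set_round_ok == 0);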

It is well known that the algorithm you use is numerically unstable and has precision problems. For better precision, do two passes over the data.
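
As a rough illustration of the two-pass idea (a minimal sketch, not the answerer's code; the function name two_pass_variance is made up for this example), the first pass computes the mean and the second pass sums squared deviations from it, so no large, nearly equal quantities are subtracted:

```python
def two_pass_variance(data):
    """Population variance computed in two passes over the data.

    Hypothetical helper for illustration only.
    """
    n = len(data)
    if n == 0:
        raise ValueError("variance of empty data is undefined")
    # Pass 1: compute the mean.
    mean = sum(data) / n
    # Pass 2: accumulate squared deviations from that mean.
    return sum((x - mean) ** 2 for x in data) / n


# Values with a large common offset, where E(X^2) - E(X)^2 tends to lose digits.
sample = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print(two_pass_variance(sample))
```

The cost is the one raised in the comments below: the data has to be available for a second traversal.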

AProgrammer 2009-08-10 08:33:45

Using two passes is really unfortunate due to performance issues (it also makes the API uglier). As far as I can tell, the algorithm should be stable if you just round up - right?

Eamon Nerbonne 2009-08-10 08:47:44

I wonder, would something like "sumOfSquares += sqrVal + sumOfSquares/(1L << 52)" be likely to be stable?

Eamon Nerbonne 2009-08-10 09:00:23

@Eamon, about your first question: I have no time to do a real stability analysis, especially since I don't do that often enough to be confident in the result. The code in your second comment doesn't seem equivalent at all (did you intend to divide sqrVal instead? In that case, scaling changes neither the stability nor the precision).

AProgrammer 2009-08-10 09:12:48

No, I intended sumOfSquares. Motivation: doubles have 52 bits of precision, so the 53rd bit is a potential source of error. To ensure that the estimate is only ever too high and not too low, I can simply add that 53rd bit as well. Presumably sqrVal is small enough to include the bit, and then I'm sure that any rounding error is safely below the 1/2^52 threshold.

Eamon Nerbonne 2009-08-11 07:47:33

Answer 2

+6 A:

There's another single-pass algorithm which rearranges the calculation a bit. In pseudocode:

n = 0
mean = 0
M2 = 0

for x in data:
    n = n + 1
    delta = x - mean
    mean = mean + delta/n
    M2 = M2 + delta*(x - mean)  # This expression uses the new value of mean

variance_n = M2/n         # Population variance (divides by n)
variance = M2/(n - 1)     # Sample variance: unbiased estimate of population variance

(Source: http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance )

This seems better behaved with respect to the issues you pointed out with the usual algorithm.
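
The pseudocode above is close to runnable already. A minimal Python sketch follows (the function name online_variance and its return shape are choices made for this example, not part of the original answer):

```python
def online_variance(data):
    """Single-pass running mean and M2 (sum of squared deviations).

    Returns (n, mean, m2). Hypothetical helper for illustration only.
    """
    n = 0
    mean = 0.0
    m2 = 0.0
    for x in data:
        n += 1
        delta = x - mean
        mean += delta / n
        # Uses the updated mean, as in the pseudocode above.
        m2 += delta * (x - mean)
    return n, mean, m2


n, mean, m2 = online_variance([1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16])
print(m2 / n)        # population variance
print(m2 / (n - 1))  # sample variance (requires n >= 2)
```

Because M2 only ever accumulates products of two deviations that share a sign in exact arithmetic, it is far less likely to come out negative than the E(X^2) - E(X)^2 form.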

Jim Lewis 2009-08-10 08:38:23

i.e., I should have checked wikipedia ;-). Thanks, that looks promising!

Eamon Nerbonne 2009-08-10 09:11:02

...and Wikipedia even has a weighted version, which is what I'm _really_ after, but I thought I'd not muddy the waters unnecessarily.
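
For reference, a sketch of what a weighted single-pass update can look like, following the weighted incremental form commonly attributed to West (1979). This is an assumption about which variant is meant; the function name and the (value, weight) input format are invented for the example:

```python
def weighted_online_variance(pairs):
    """Single-pass weighted mean and variance over (value, weight) pairs.

    Returns (mean, population_variance). Weights are assumed positive.
    Hypothetical helper for illustration only.
    """
    w_sum = 0.0
    mean = 0.0
    s = 0.0  # weighted sum of squared deviations
    for x, w in pairs:
        w_sum += w
        mean_old = mean
        mean = mean_old + w * (x - mean_old) / w_sum
        s += w * (x - mean_old) * (x - mean)
    if w_sum == 0.0:
        raise ValueError("no data")
    return mean, s / w_sum


# A value of 4.0 with weight 3 behaves like three copies of 4.0.
print(weighted_online_variance([(2.0, 1.0), (4.0, 3.0)]))
```

With all weights equal to 1 this reduces to the unweighted update in the answer above.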

Eamon Nerbonne 2009-08-10 09:13:35

Answer 3

+2 A:

If you aren't worried about the precision, only about a negative variance, why don't you simply compute V(X) = Max(0, E(X^2) - E(X)^2)?
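
A minimal sketch of that clamp in Python (the function name clamped_naive_variance is invented for this example). It keeps the single-pass sum and sum-of-squares accumulation and only guarantees a non-negative result; it does not recover any digits lost to cancellation:

```python
def clamped_naive_variance(data):
    """E(X^2) - E(X)^2 in one pass, clamped at zero.

    Hypothetical helper for illustration only.
    """
    n = 0
    total = 0.0
    total_sq = 0.0
    for x in data:
        n += 1
        total += x
        total_sq += x * x
    if n == 0:
        raise ValueError("variance of empty data is undefined")
    mean = total / n
    raw = total_sq / n - mean * mean  # may be slightly negative from rounding
    return max(0.0, raw)


print(clamped_naive_variance([1.0, 2.0, 3.0, 4.0]))
```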

erikkallen 2009-08-10 09:18:10

That was my initial workaround, but I hoped to tap stackoverflow's overflowing wisdom for a better one. It's a pragmatic solution - should probably have mentioned it ;-).

Eamon Nerbonne 2009-08-11 07:44:48
