views:

74

answers:

3

please ignore this post, I misread algorithm, so the question is not relevant. However, I cannot close post anymore. Please vote to close

I have been using certain algorithm from numerical recipes, which converges to zero via underflow:

// all types are the same floating type
sum = 0
for (i in 0,N)
   sum += abs(V[i]);

my question, how does it happen? how does sum of small positive floating-point numbers converge to underflow/zero?

is there some condition where 0 + f = 0 , f > 0?

algorithm in question is Jacoby, http://www.mpi-hd.mpg.de/astrophysik/HEA/internal/Numerical_Recipes/f11-1.pdf, page 460. It is quite possible I misunderstand how the convergence is achieved, if so, please correct me.

thank you

A: 

I would be very surprised if that was possible using IEEE 754 arithmetics. The point is that IEEE 754 states that intermediate results are infinitely precise and then rounded to the destination data type.

So if you have sum + V[i], that value will always be greater or equal to sum. Rounding down to the next representable number will either produce sum or a number greater than sum.

Of course, there is nothing in the original question that prevents sum from being negative in the first place. In that case the answer would be trivial.

In IEEE 754 arithmetics, there is no number f such that 0 + f = 0 and at the same time f > 0.

Roland Illig
thank you, I added more information to question, including source code link
aaa
+2  A: 

If V is an array of doubles and sum is a float (or single), you can certainly have values that are > 0 but when added to sum produces 0 if they are smaller than the smallest non-zero denormalized value representable in a float.

How do you know sum is actually zero and not just really really close? Are all bits set to zero?

EDIT: after reading the actual application, the underflow to zero remark is probably referring to repeated rotations around various axes to determine the eigenvalues and eigenvectors of a matrix. In that case, the algorithm only works if you can assume that repeated multiplications of very small numbers will clamp or underflow to zero. The actual sum won't underflow itself, however.

MSN
Assuming that `sum` already contains `FLT_MIN` and an arbitrarily small value is added to it, how can the correctly rounded result ever become smaller than `sum`? I'm asking because `sum` will always be nearer to the exact result than `0`.
Roland Illig
@Roland, he doesn't say what sum is initialized to, but I'm going to assume it's 0. And a double can be smaller than FLT_MIN. And converges to zero and being 0 bitwise are two different things. But yes, sum should be monotonically increasing.
MSN
here http://www.mpi-hd.mpg.de/astrophysik/HEA/internal/Numerical_Recipes/f11-1.pdf, page 460. all types are same
aaa
+1  A: 

What are the types you are using? If f is a float and d1 and d2 are doubles, you get this.

double d1 = std::numeric_limits<double>::min();
double d2 = std::numeric_limits<double>::min();
float f = d1 + d2;
if (f == 0.0) std::cout << "yes";
else std::cout << "no";

This produces "yes".

cape1232
Man you SO members are fast. How do newbies like me get any reputation when we can't beat anyone to the punch? ;)
cape1232
@cape1232, we all started out that way. You should see how fast litb responds to esoteric C++ questions. Or Jon Skeet in general.
MSN
@MSN litb is very helpful indeed. He answered many questions for me, many thanks to him
aaa