+1  Q: 

float overflow?

The following code always seems to produce the wrong result. I have tested it with GCC and with Visual Studio on Windows. Is it because of float overflow or something else? Thanks in advance :)

#include <stdio.h>
#define N 51200000
int main()
{
 float f = 0.0f;
 for(int i = 0; i < N; i++)
  f += 1.0f;
 fprintf(stdout, "%f\n", f);
 return 0;
}
+12  A: 

float has only 24 bits of significand precision (23 stored bits plus an implicit leading one). Counting up to 51200000 one step at a time requires 26. Simply put, you do not have the precision required for a correct answer.

Ignacio Vazquez-Abrams
Actually, you do not need 26 bits for 51200000. It's 3125 * 2^14. Therefore you need 12 bits of precision and 4 bits of exponent. The actual problem is that 51199999 needs 26 bits of precision.
MSalters
*Getting* to 51200000 then. Either way it's a problem.
Ignacio Vazquez-Abrams
More interesting, the two print statements output the same result:

#define N 51200000
f = float(N);
fprintf(stdout, "%.7e\n", f);
for(int i = 0; i < N; i++)
{
 f -= 1.0f;
}
fprintf(stdout, "%.7e\n", f);
GBY
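The behaviour discussed in this answer and its comments can be reproduced with a small sketch. The function names `count_up_float` and `count_down_float` are made up for illustration, and the sketch assumes `float` is an IEEE 754 single-precision type with the default round-to-nearest mode. The accumulator is `volatile` only to force each partial result to be stored as a `float`.

```c
/* Adds 1.0f to a float accumulator n times and returns the result. */
float count_up_float(long n)
{
    volatile float f = 0.0f;   /* volatile: round to float on every step */
    for (long i = 0; i < n; i++)
        f += 1.0f;
    return f;
}

/* Subtracts 1.0f from start n times and returns the result. */
float count_down_float(float start, long n)
{
    volatile float f = start;
    for (long i = 0; i < n; i++)
        f -= 1.0f;
    return f;
}
```

Under those assumptions, `count_up_float(51200000)` stops growing at 16777216 (2^24), because 16777217 is not representable and the sum rounds back down. `count_down_float(51200000.0f, 51200000)` never moves at all, because the nearest representable float to 51199999 is 51200000 itself.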
A: 

float has only about 7 significant decimal digits of precision. Adding 1 to a float that has reached 2^24 (16777216) gives a wrong result, because not every integer above that point is representable. Using double instead of float gives the correct result.

zoli2k
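A minimal sketch of the suggested fix, assuming `double` is an IEEE 754 double-precision type (53 bits of significand, so every integer up to 2^53 is exact). The function name `count_up_double` is hypothetical.

```c
/* Same loop as in the question, but with a double accumulator. */
double count_up_double(long n)
{
    double d = 0.0;
    for (long i = 0; i < n; i++)
        d += 1.0;
    return d;
}
```

With this change, `count_up_double(51200000)` returns exactly 51200000. The same stalling effect would reappear with double, only much later, once the sum reaches 2^53.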
+1  A: 

For more information on the precision of data types in C, please refer to this.

Your code can be expected to misbehave once you exceed the precision the type provides.

Praveen S
+1  A: 

Unreliable things to do with floating point arithmetic include adding two numbers that are very different in magnitude, and subtracting two numbers that are similar in magnitude. The first is what you are doing here: 1 is much smaller than 51200000. The CPU shifts the smaller operand so that both have the same exponent; when the other operand is large, that shifts the actual value (1) off the end of the available precision, so by the time you are part way through the calculation, the 1 has effectively become zero.

Brian Hooper
That is the very problem that I've encountered. It is a little tricky to sum up many small real numbers. Thanks:)
GBY
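One common way to sum many small values into a large float total is Kahan (compensated) summation. Nobody in this thread proposed it; it is shown here only as a sketch of how the lost low-order bits can be carried along. The function names are made up, and the code assumes IEEE 754 single precision and a compiler that does not reorder floating point operations (so no `-ffast-math` or similar).

```c
/* Plain accumulation: adds term to a float sum n times. */
float naive_repeated_add(float term, long n)
{
    volatile float sum = 0.0f;  /* volatile: round to float on every step */
    for (long i = 0; i < n; i++)
        sum += term;
    return sum;
}

/* Kahan summation: c remembers the part of each addition that was
   rounded away and feeds it back into the next addition. */
float kahan_repeated_add(float term, long n)
{
    float sum = 0.0f;
    float c = 0.0f;             /* running compensation */
    for (long i = 0; i < n; i++) {
        float y = term - c;     /* term corrected by the previous error */
        float t = sum + y;      /* low-order bits of y may be lost here */
        c = (t - sum) - y;      /* recover what was lost (negated) */
        sum = t;
    }
    return sum;
}
```

Under those assumptions, `naive_repeated_add(1.0f, 51200000)` stalls at 16777216, while `kahan_repeated_add(1.0f, 51200000)` returns 51200000, still using only float variables.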
+1  A: 

Your problem is the unit in the last place (ULP), also called the unit of least precision. In short: big float values cannot be incremented by small values, because the result is rounded to the nearest representable float. While 1.0 is enough to increment small values, the minimal increment for 16777216 is 2.0 (checked with Java's Math.ulp, but the same holds for C++).

Boost has some functions for this.

josefx
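For C, the gap to the next representable float can be measured with a sketch like the one below. The helper name `ulp_of` is hypothetical. It assumes `float` is a 32-bit IEEE 754 type and that `x` is positive, finite and below `FLT_MAX`, in which case adding 1 to the bit pattern yields the next larger float. C99's `nextafterf` from `<math.h>` is the standard way to get the same neighbour.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Returns the distance from x to the next larger representable float.
   Assumption: x is positive, finite and below FLT_MAX. */
float ulp_of(float x)
{
    uint32_t bits;
    float next;

    memcpy(&bits, &x, sizeof bits);     /* reinterpret float as integer */
    bits += 1;                          /* step to the next bit pattern */
    memcpy(&next, &bits, sizeof next);  /* and back to float */
    return next - x;
}
```

With these assumptions, `ulp_of(1.0f)` equals `FLT_EPSILON`, `ulp_of(16777216.0f)` is 2.0 and `ulp_of(51200000.0f)` is 4.0, which is why adding or subtracting 1.0f at those magnitudes cannot be represented exactly.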