What is the biggest "non-floating" integer that can be stored in a C double (IEEE) without losing precision?
1.7976931348623157 × 10^308
http://en.wikipedia.org/wiki/Double%5Fprecision%5Ffloating-point%5Fformat
Wikipedia has this to say in the same context with a link to IEEE 754:
On a typical computer system, a 'double precision' (64-bit) binary floating-point number has a coefficient of 53 bits (one of which is implied), an exponent of 11 bits, and one sign bit.
2^53 is just over 9 * 10^15.
DECIMAL_DIG from <float.h> should give at least a reasonable approximation of that. Since that deals with decimal digits, and it's really stored in binary, you can probably store something a little larger without losing precision, but exactly how much is hard to say. I suppose you should be able to figure it out from FLT_RADIX and DBL_MANT_DIG, but I'm not sure I'd completely trust the result.
The biggest integer that can be stored in a double without losing precision is the same as the largest possible value of a double. That is, DBL_MAX or approximately 1.8 x 10^308 (if your double is an IEEE 64 bit double). It's an integer. It's represented precisely. What more do you want?
Go on, ask me what the biggest integer is, such that it and all smaller integers can be stored in IEEE 64 bit doubles without losing precision. An IEEE 64 bit double has 52 bits of mantissa, so I think it's 2^53:
- 2^53 + 1 can't be stored, because the 1 at the start and the 1 at the end have too many zeros in between.
- Anything less than 2^53 can be stored, with 52 bits explicitly stored in the mantissa, and then the exponent in effect giving you another one.
- 2^53 obviously can be stored, since it's a small power of 2.
Or another way of looking at it: once the bias has been taken off the exponent, and ignoring the sign bit as irrelevant to the question, the value stored by a double is a power of 2, plus a 52-bit integer multiplied by 2^(exponent - 52). So with exponent 52 you can store all values from 2^52 through to 2^53-1. Then with exponent 53, the next number you can store after 2^53 is 2^53 + 1 * 2^(53 - 52). So loss of precision first occurs with 2^53+1.
You need to look at the size of the mantissa. An IEEE 754 64-bit floating-point number (which has 52 mantissa bits, plus 1 implied) can exactly represent every integer with an absolute value less than or equal to 2^53.
9007199254740992 (that's 9,007,199,254,740,992) with no guarantees :)
Program
#include <stdio.h>

int main(void) {
    /* Start a little below 2^53; counting up from 0 would take far too long. */
    double dbl = 9007199254000000.0;
    /* Stop at the first value where adding 1 no longer changes the result. */
    while (dbl + 1 != dbl) dbl++;
    printf("%.0f\n", dbl - 1);
    printf("%.0f\n", dbl);
    printf("%.0f\n", dbl + 1);
    return 0;
}
Result
9007199254740991
9007199254740992
9007199254740992