views:

510

answers:

8

When I convert an unsigned 8-bit int to string then I know the result will always be at most 3 chars (for 255) and for an signed 8-bit int we need 4 chars for e.g. "-128".

Now what I'm actually wondering is the same thing for floating-point values. What is the maximum number of chars required to represent any "double" or "float" value as a string?

Assume a regular C/C++ double (IEEE 754) and normal decimal expansion (i.e. no %e printf-formatting).

I'm not even sure if the really small number (i.e. 0.234234) will be longer than the really huge numbers (doubles representing integers)?

+1  A: 

That depends on what you mean by "represent". Do you just want round-tripping, or the decimal representation of the exact value? Does it have to be human readable, or could you just treat the 4/8 bytes as opaque binary data and base64 or hex encode them?

Jon Skeet
Yes I mean roundtripping. Imagine all of the real numbers that are _exactly_ representable as a "double". How long (as in base-10 decimal expansion string) is the longest one of those?
martin
+2  A: 

Depends on what you mean by "represent". Decimal fraction don't have exact floating-point representations. When you convert decimal fraction -> binary fraction -> decimal, you do not have exact decimal representations and will have noise bits at the end of the binary representation.

The question didn't involve starting from decimal, but all source code (and must user input) is decimal, and involves the possible truncation issue. What does "exact" mean under these circumstances?

Basically, it depends on your floating point representation.

If you have 48 bits of mantissa, this takes about 16 decimal digits. The exponent might be the remaining 14 bits (about 5 decimal digits).

The rule of thumb is that the number of bits is about 3x the number of decimal digits.

S.Lott
Actually they can't have infinite digits because any binary fraction can be expressed precisely in decimal notation.
Vilx-
I mean represent as an exact base 10 decimal expansion (e.g. no %e stuff). And I'm assuming we're dealing with standard C/C++ double using IEEE 754.
martin
Of course. I mean, in the memory the double is stored in a binary notation like 1101010.101000110101. The number of digits after the binary point is pretty much finite and thus can be represented precisely in decimal. 1/2 is represented as 0.5; 1/4 is represented as 0.25; 1/8 is represented as 0.125; etc. Am I missing something?
Vilx-
The problem is the other way around. Every decimal fraction does not have an exact binary representation. So, if your "original" number was a decimal number, the float->decimal will be wrong to start with. The number of decimal digits doesn't matter if what you're comparing with was decimal.
S.Lott
Of course decimal fractions cannot be represented precisely in binary fractions. I never said otherwise. But binary fractions can be represented precisely in decimal fractions, so IMHO "Floating-point values do not have exact decimal representations and can have an infinite number of repeating decimal digits." isn't correct.
Vilx-
+4  A: 

http://en.wikipedia.org/wiki/IEEE_754-1985

The longest representable double:

-2.2250738585072020E−308

has 24 chars.

Veton
Cool, can you explain why this is the longest one? Why can't for example a really small 0.033211233457645...234234 become longer?
martin
If you don't want the scientific notation that would give you 308 more chars then...
Vilx-
Becouse mantissa of double, following IEEE 754-1985, can represent numbers with maximum accuracy of 17 digits after point. Add to it both minuses for mantissa and period, point, e-char and 3 digits of period (8 bit), and you will get exact 24 chars.
Veton
`0.00000000000000000 ... approx 308 ... 00002225073858507202` ==> approx 326 chars :)
pmg
the op says explicitly not in scientific notation.
fortran
+1  A: 

You can control the number of digits in the string representation when you convert the float/double to a string by setting the precision. The maximum number of digits would then be equal to the string representation of std::numeric_limits<double>::max() at the precision you specify.

#include <iostream>
#include <limits>
#include <sstream>
#include <iomanip>

int main()
{
 double x = std::numeric_limits<double>::max();

 std::stringstream ss;
 ss << std::setprecision(10) << std::fixed << x;

 std::string double_as_string = ss.str();
 std::cout << double_as_string.length() << std::endl;
}

So, the largest number of digits in a double with a precision of 10 is 320 digits.

Charles Salvia
A: 

Perhaps you can approach the problem from another angle? Just allocate 1024 bytes for your string and feel secure? :)

Vilx-
This made me giggle. And then I shed a tear, because it's so true.
Frerich Raabe
As long as this is just one string, this is what I would do. If I get a million of them, well, then I'd start to think about shaving the memory footprint (on the other hand - if I need to work with a million of them I will have to have plenty of RAM anyway and a few megabytes here or there won't matter much).
Vilx-
+5  A: 

The standard header <float.h> in C, or <cfloat> in C++, contains several constants to do with the range and other metrics of the floating point types. One of these is DBL_MAX_10_EXP, the largest power-of-10 exponent needed to represent all double values. Since 1eN needs N+1 digits to represent, and there might be a negative sign as well, then the answer is

int max_digits = DBL_MAX_10_EXP + 2;

This assumes that the exponent is larger than the number of digits needed to represent the largest possible mantissa value; otherwise, there will also be a decimal point followed by more digits.

CORRECTION

The longest number is actually the smallest representable negative number: it needs enough digits to cover both the exponent and the mantissa. This value is -pow(2, DBL_MIN_EXP - DBL_MANT_DIG), where DBL_MIN_EXP is negative. It's fairly easy to see (and prove by induction) that -pow(2,-N) needs 3+N characters for a non-scientific decimal representation ("-0.", followed by N digits). So the answer is

int max_digits = 3 + DBL_MANT_DIG - DBL_MIN_EXP

For a 64-bit IEEE double, we have

DBL_MANT_DIG = 53
DBL_MIN_EXP = -1023
max_digits = 3 + 53 - (-1023) = 1079
Mike Seymour
Sorry, this is wrong - the longest numbers will be very small ones, not very large ones, and it's more complicated to work out their length. It should be possible to work it out from DBL_MIN_EXP and DBL_MANT_DIG; I'll update the answer if I can work it out.
Mike Seymour
I've corrected it now.
Mike Seymour
Thanks for the great answer Mike. In case someone needs clarification: Basically, floating-point has a special rule that says if all bits in the exponent are zero then the number is called "denormalized" and the implicit leading "1." (which is otherwise always prefixed onto the fractional part) is not included. This means that by setting the exponent bits to all zero you can use the fractional part to get another 15-16 digits onto the length of the base10 decimal expansion (i.e. like 0.000...0001*2^-1023 where -1023 is the exponent which is encoded as all zeros due to the exponent bias).
martin
The fractional part (for the denormalized case) has an implicit "0." so you can still only count 52 decimals digits for that and not 53. Thus the answer is 1023 + 52 + 3 = 1078 digits. I wonder why Java comes up with 1077 in Fred's example below though?
martin
+2  A: 

You can use snprintf() to check how many chars you need. snprintf() returns the number of chars needed to print whatever is passed to it.

/* NOT TESTED */
#include <stdio.h>
#include <stdlib.h>
int main(void) {
    char dummy[1];
    double value = 42.000042; /* or anything else */
    int siz;
    char *representation;
    siz = snprintf(dummy, sizeof dummy, "%f", value);
    printf("exact length needed to represent 'value' "
           "(without the '\\0' terminator) is %d.\n", siz);
    representation = malloc(siz + 1);
    if (representation) {
        sprintf(representation, "%f", value);
        /* use `representation` */
        free(representation);
    } else {
        /* no memory */
    }
    return 0;
}

Note: snprintf() is a C99 function. If a C89 compiler provides it as an extension, it may not do what the above program expects.

Edit: Changed the link to snprintf() to one that actually describes the functionality imposed by the C99 Standard; the description in the original link is wrong.

pmg
A: 

1024 is not enough, the smallest negative double value has 1077 decimal digits. Here is some Java code.

double x = Double.longBitsToDouble(0x8000000000000001L);
BigDecimal bd = new BigDecimal(x);
String s = bd.toPlainString();
System.out.println(s.length());
System.out.println(s);

Here is the output of the program.

1077
-0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000004940656458412465441765687928682213723650598026143247644255856825006755072702087518652998363616359923797965646954457177309266567103559397963987747960107818781263007131903114045278458171678489821036887186360569987307230500063874091535649843873124733972731696151400317153853980741262385655911710266585566867681870395603106249319452715914924553293054565444011274801297099995419319894090804165633245247571478690147267801593552386115501348035264934720193790268107107491703332226844753335720832431936092382893458368060106011506169809753078342277318329247904982524730776375927247874656084778203734469699533647017972677717585125660551199131504891101451037862738167250955837389733598993664809941164205702637090279242767544565229087538682506419718265533447265625
Fred
// prints one longer than 1077:import java.math.BigDecimal;public class Program { public static void print_bigdecimal(String name, BigDecimal bd) { String s = bd.toPlainString(); System.out.println ("NUM " + name + ": " + s + " (" + s.length() + " chars)"); } public static void main(String[] args) { print_bigdecimal("-2^-1075==", new BigDecimal(-1).divide (new BigDecimal(2).pow(1075))); print_bigdecimal("0x80...01L", new BigDecimal(Double.longBitsToDouble(0x8000000000000001L))); }}
martin
2^-1075 is valid because it's 2^-52 + 2^-1023
martin