Do the underlying bits just get "reinterpreted" as a floating point value? Or is there a run-time conversion to produce the nearest floating point value?

Is endianness a factor on any platforms (i.e., endianness of floats differs from ints)?

How do different width types behave (e.g., int to float vs. int to double)?

What does the language standard guarantee about the safety of such casts/conversions? By cast, I mean a static_cast or C-style cast.

What about the inverse float to int conversion (or double to int)? If a float holds a small magnitude value (e.g., 2), does the bit pattern have the same meaning when interpreted as an int?

+2  A: 

When you convert an integer to a float, you are not liable to lose any precision unless you are dealing with extremely large integers.

When you convert a float to an int you are essentially truncating: the fractional part is simply dropped. (That is rounding toward zero, which matches floor() only for non-negative values.)

For more information on floating point read: http://www.lahey.com/float.htm

The IEEE single-precision format has 24 bits of mantissa, 8 bits of exponent, and a sign bit. The internal floating-point registers in Intel microprocessors such as the Pentium have 64 bits of mantissa, 15 bits of exponent and a sign bit. This allows intermediate calculations to be performed with much less loss of precision than many other implementations. The down side of this is that, depending upon how intermediate values are kept in registers, calculations that look the same can give different results.

So if your integer needs more than 24 significant bits (including the hidden leading bit), i.e. its magnitude exceeds 2^24, then you are likely to lose some precision in the conversion.

zipcodeman
I think your first sentence was intended to be "...an integer to a float..."
Daniel Daranas
Thanks, I fixed it.
zipcodeman
@Stingray: Yes, a truncation: "The conversion truncates; that is, the fractional part is discarded."
Georg Fritzsche
The conversion from a float to an int is done in the CPU, but it behaves as if the fractional part were simply discarded (truncation toward zero; that matches floor() only for non-negative values).
zipcodeman
There is a "hidden leading one" in the encoding: 23 stored mantissa bits plus the hidden bit give 24 significant bits, so an IEEE float can store any integer in the (inclusive) range [-2^24, +2^24].
Andy Ross
Yes, but I thought I would leave that out for a beginner. It's a pretty complicated topic.
zipcodeman
@STingRaySC: Yes, it is a conversion, not a reinterpretation of bits. You should read What Every Computer Scientist Should Know About Floating-Point Arithmetic (http://docs.sun.com/source/806-3568/ncg_goldberg.html). If you understand how integers are represented, combined with that article you'll see why it's not possible to just reinterpret the bits; they actually have to be altered to end up with a bona fide integer.
Jason
+3  A: 

For reference, this is what ISO-IEC 14882-2003 says

4.9 Floating-integral conversions

An rvalue of a floating point type can be converted to an rvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type. [Note: if the destination type is bool, see 4.12.]

An rvalue of an integer type or of an enumeration type can be converted to an rvalue of a floating point type. The result is exact if possible. Otherwise, it is an implementation-defined choice of either the next lower or higher representable value. [Note: loss of precision occurs if the integral value cannot be represented exactly as a value of the floating type.] If the source type is bool, the value false is converted to zero and the value true is converted to one.

Reference: What Every Computer Scientist Should Know About Floating-Point Arithmetic

Other highly valuable references on the subject of fast float to int conversions:

Have a good read!

Gregory Pakosz
Yes, it implies a performance impact; I'm editing the question with more references of interest.
Gregory Pakosz
Check the Intel manuals, for example, for the floating-point assembly instructions. To find them more easily, create a small demo application in C++ and open it in a disassembler, then look up the instructions. There you'll find all the info you need (at least for Intel CPUs).
Tobias Langner
+1  A: 

Reinterpreted? The term "reinterpretation" usually refers to raw memory reinterpretation. It is, of course, impossible to meaningfully reinterpret an integer value as a floating-point value (and vice versa) since their physical representations are generally completely different.

When you cast the types, a run-time conversion is being performed (as opposed to reinterpretation). The conversion is normally not just conceptual, it requires an actual run-time effort, because of the difference in physical representation. There are no language-defined relationships between the bit patterns of source and target values. Endianness plays no role in it either.

When you convert an integer value to a floating-point type, the original value is converted exactly if it can be represented exactly by the target type. Otherwise, the value will be changed by the conversion process.

When you convert a floating-point value to integer type, the fractional part is simply discarded (i.e. not the nearest value is taken, but the number is rounded towards zero). If the result does not fit into the target integer type, the behavior is undefined.

Note also that floating-point to integer conversions (and the reverse) are standard conversions and formally require no explicit cast whatsoever. People might sometimes use an explicit cast to suppress compiler warnings.

AndreyT
+3  A: 

There are normally run-time conversions, as the bit representations are not generally compatible (with the exception that binary 0 is normally both 0 and 0.0). The C and C++ standards deal only with value, not representation, and specify generally reasonable behavior. Remember that a large int value will not normally be exactly representable in a float, and a large float value cannot be represented by an int.

Therefore:

All conversions are by value, not bit patterns. Don't worry about the bit patterns.

Don't worry about endianness, either, since that's a matter of bitwise representation.

Converting int to float can lose precision if the integer value is large in absolute value; it is less likely to with double, since double is more precise, and can represent many more exact numbers. (The details depend on what representations the system is actually using.)

The language definitions say nothing about bit patterns.

Converting from float to int is also a matter of values, not bit patterns. An exact floating-point 2.0 will convert to an integral 2 because that's how the implementation is set up, not because of bit patterns.

David Thornley
+1 for "generally reasonable behavior". The only thing you normally have to remember is that float to int truncates instead of rounding (unlike .NET) - otherwise it's straightforward. (Issues of lost precision can be pretty much ignored if you stick to 'double' and 'long'...)
AAT
A: 

If you cast the value itself, it will get converted (so in a float -> int conversion 3.14 becomes 3).

But if you cast the pointer, then you will actually 'reinterpret' the underlying bits. So if you do something like this:

double d = 3.14;
int x = *reinterpret_cast<int *>(&d);

x will have a 'random' value that is based on the representation of floating point.

R Samuel Klatchko
A: 

Converting FP to integral type is nontrivial and not even completely defined.

Typically your FPU implements a hardware instruction to convert from IEEE format to int. That instruction might take parameters (implemented in hardware) controlling rounding. Your ABI probably specifies round-to-nearest-even. If you're on x86 with SSE, the conversion is probably not too slow, but I can't find a reference with one Google search.

As with anything FP, there are corner cases. It would be nice if infinity were mapped to the corresponding TYPE_MAX, but that is alas not typically the case; the result of int x = INFINITY; is undefined.

Potatoswatter
+1  A: 

Do the underlying bits just get "reinterpreted" as a floating point value?

No, the value is converted according to the rules in the standard.

is there a run-time conversion to produce the nearest floating point value?

Yes there's a run-time conversion.

For floating point -> integer, the value is truncated, provided that the source value is in range of the integer type. If it is not, behaviour is undefined. At least I think that it's the source value, not the result, that matters. I'd have to look it up to be sure. The boundary case if the target type is char, say, would be CHAR_MAX + 0.5. I think it's undefined to cast that to char, but as I say I'm not certain.

For integer -> floating point, the result is the exact same value if possible, or else is one of the two floating point values either side of the integer value. Not necessarily the nearer of the two.

Is endianness a factor on any platforms (i.e., endianness of floats differs from ints)?

No, never. The conversions are defined in terms of values, not storage representations.

How do different width types behave (e.g., int to float vs. int to double)?

All that matters is the ranges and precisions of the types. Assuming 32 bit ints and IEEE 32 bit floats, it's possible for an int->float conversion to be imprecise. Assuming also 64 bit IEEE doubles, it is not possible for an int->double conversion to be imprecise, because all int values can be exactly represented as a double.

What does the language standard guarantee about the safety of such casts/conversions? By cast, I mean a static_cast or C-style cast.

As indicated above, it's safe except in the case where a floating point value is converted to an integer type, and the value is outside the range of the destination type.

If a float holds a small magnitude value (e.g., 2), does the bit pattern have the same meaning when interpreted as an int?

No, it does not. The IEEE 32 bit representation of 2 is 0x40000000.

Steve Jessop