views:

441

answers:

6

I am interested in learning about the binary format of the single and double types used by C++ on Intel-based systems.

I have avoided the use of floating point numbers in cases where the data needs to potentially be read or written by another system (i.e. files or networking). I do realise that I could use fixed point numbers instead, and that fixed point is more accurate, but I am interested to learn about the floating point format.

+7  A: 

Wikipedia has a reasonable summary - see http://en.wikipedia.org/wiki/IEEE_754.

But if you want to transfer numbers between systems you should avoid doing it in binary format. Either use middleware like CORBA (only joking, folks), Tibco etc. or fall back on that old favourite, textual representation.

anon
+4  A: 

This should get you started: http://docs.sun.com/source/806-3568/ncg_goldberg.html. (:

Bastien Léonard
Or perhaps finish him off?
anon
+2  A: 

Intel's representation is IEEE 754 compliant. You can find the details at http://download.intel.com/technology/itj/q41999/pdf/ia64fpbf.pdf .

Peter Stuer
A: 

Note that decimal floating-point constants may convert to different floating-point binary values on different systems (even with different compilers on the same system). The difference would be slight -- maybe only as large as 2^-54 for a double -- but is a difference nonetheless.

Use hexadecimal constants if you want to guarantee the same floating-point binary value on any platform.

Rick Regan
+1  A: 

Floating-point format is determined by the processor, not the language or compiler. These days almost all processors (including all Intel desktop machines) either have no floating-point unit or have one that complies with IEEE 754. You get two or three different sizes (Intel offers 32, 64, and 80 bits, the 80-bit extended format coming from the x87 unit) and each one has a sign bit, an exponent, and a significand. The number represented is usually given by this formula:

(-1)**sign * (2**(E - k)) * (1 + S / (2**k'))

where sign is the sign bit, k' is the number of bits in the significand, and k is a constant around the middle of the exponent range (the exponent bias). There are special representations for zero (plus and minus zero) as well as infinities and other "not a number" (NaN) values.

There are definite quirks; for example, the fraction 1/10 cannot be represented exactly as a binary IEEE standard floating-point number. For this reason the IEEE standard also provides for a decimal representation, but this is used primarily by handheld calculators and not by general-purpose computers.

Recommended reading: David Goldberg's What Every Computer Scientist Should Know About Floating-Point Arithmetic

Norman Ramsey
It's worth pointing out that the fraction 1/3 cannot be represented as either a binary or decimal floating-point value. Most people think decimal representations are somehow "magical", but that's just because we're all used to thinking in base-10 to begin with.
Tom
This answer really satisfied my curiosity, and the article found at the link was fascinating.
+2  A: 

As other posters have noted, there is plenty of information about the IEEE format used by every modern processor, but that is not where your problems will arise.

You can rely on any modern system using IEEE format, but you will need to watch for byte ordering. Look up "endianness" on Wikipedia (or somewhere else). Intel systems are little-endian, a lot of RISC processors are big-endian. Swapping between the two is trivial, but you need to know what type you have.

Traditionally, people use big-endian formats for transmission. Sometimes people include a header indicating the byte order they are using.

If you want absolute portability, the simplest thing is to use a text representation. However, that can get pretty verbose for floating-point numbers if you want to capture the full precision, e.g. 0.1234567890123456e+123.

Michael J