I was wondering how the bits are organized in floats (4 bytes), doubles (8 bytes), and half floats (2 bytes, used in OpenGL implementations).

Also, how could I convert from one to another?

A: 

Half, Single, Double

There are handy-dandy diagrams on those pages. The library should provide a means of converting between the various formats.

Anon.
+2  A: 

In essence, for each of these formats you have:

  • 1 sign bit
  • x exponent bits yielding a whole number E
  • y mantissa (or "significand") bits yielding a fractional number M

If the sign bit is 1, the number is negative, else it is positive.

To get the magnitude of a normal number, you take (1 + M) * 2^(E - k), where k (called the "exponent bias") depends on the format.
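As a rough illustration of that formula, here is a minimal Python sketch that splits a value into its sign, E, and M fields and rebuilds the magnitude. The table of (x, y, k) holds the IEEE 754 values for the three formats; the names FORMATS, decode, and magnitude are made up for this example, and the formula is only applied to normal numbers (E neither all zeros nor all ones).

```python
import struct

# Per format: exponent bits x, mantissa bits y, bias k,
# plus the struct codes for the float and a same-sized unsigned integer.
FORMATS = {
    "half":   (5, 10, 15, "<e", "<H"),
    "single": (8, 23, 127, "<f", "<I"),
    "double": (11, 52, 1023, "<d", "<Q"),
}

def decode(value, fmt):
    """Return (sign bit, whole-number exponent E, fractional mantissa M)."""
    x, y, k, float_code, int_code = FORMATS[fmt]
    # Reinterpret the float's bytes as an unsigned integer of the same width.
    (bits,) = struct.unpack(int_code, struct.pack(float_code, value))
    sign = bits >> (x + y)
    E = (bits >> y) & ((1 << x) - 1)
    M = (bits & ((1 << y) - 1)) / (1 << y)
    return sign, E, M

def magnitude(E, M, fmt):
    """Compute (1 + M) * 2^(E - k) for a normal number."""
    x, _, k, _, _ = FORMATS[fmt]
    if E == 0 or E == (1 << x) - 1:
        # All-zero and all-one exponents encode zero/subnormals and inf/NaN.
        raise ValueError("special value: the formula does not apply")
    return (1 + M) * 2.0 ** (E - k)
```

For instance, decode(-6.5, "single") gives a sign bit of 1, E = 129, and M = 0.625, and magnitude(129, 0.625, "single") gives 6.5 back.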

It's worth noting that certain combinations of sign, exponent, and mantissa are "special" values, like 0, -inf, +inf, and NaN.

For the specifics (values of x, y, and k) see Wikipedia for single precision (4 bytes), double precision (8 bytes), and half precision (2 bytes).

Note that these are all specified by IEEE 754, so googling that might give you helpful results. :)
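On the conversion part of the question: one possible approach, if Python happens to be available, is to let the standard struct module do the rounding, since its "e", "f", and "d" codes pack IEEE 754 half, single, and double values. This is only a sketch; the helper names below are invented for illustration, and values too large for the target format raise OverflowError rather than becoming infinity.

```python
import struct

def to_half_bits(value):
    """Round a Python float (a double) to half precision; return the 16 raw bits."""
    return struct.unpack("<H", struct.pack("<e", value))[0]

def half_bits_to_float(bits):
    """Widen 16 raw half-precision bits to a Python float; this step is exact."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def round_to_single(value):
    """Round a Python float to single precision and widen it back to a double."""
    return struct.unpack("<f", struct.pack("<f", value))[0]
```

Widening (half to single to double) never loses information, while narrowing rounds to the nearest representable value, so round_to_single(0.1) is not equal to 0.1 even though round_to_single(0.5) is exactly 0.5.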

Sapph
GPUs may not be fully conformant with IEEE 754 (for example, they frequently omit support for denormals).
Spudd86