We have 2's complement for integers, which allows us to perform operations without worrying about the sign. That is a big help at the implementation level.

Similarly, we perform so many floating-point operations, and yet we rely on sign and magnitude. What is the reason?

Why can't a 2's complement like system work for floats?

+4  A: 

For addition of floats there is a lot more to do than in the integer case - you need to shift one operand's mantissa to make the exponents match before the magnitudes can be added. Any additional cost for doing sign+magnitude addition is insignificant by comparison.
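To make the exponent-matching step concrete, here is a toy sketch of sign/magnitude addition on hypothetical `(sign, mantissa, exponent)` triples, where the value is `(-1)**sign * mantissa * 2**exponent`. The function name and representation are illustrative only, not IEEE 754:

```python
def fp_add(s1, m1, e1, s2, m2, e2):
    """Toy sign/magnitude float addition (hypothetical sketch, not IEEE 754).

    Value represented: (-1)**sign * mantissa * 2**exponent.
    """
    # Step 1: ensure the first operand has the larger exponent.
    if e1 < e2:
        s1, m1, e1, s2, m2, e2 = s2, m2, e2, s1, m1, e1
    # Step 2: align exponents by shifting the smaller operand's mantissa
    # right. Bits shifted out are simply lost here; real hardware keeps
    # guard/round/sticky bits and rounds.
    m2 >>= (e1 - e2)
    # Step 3: add or subtract magnitudes depending on the signs.
    if s1 == s2:
        return s1, m1 + m2, e1        # same sign: add magnitudes
    elif m1 >= m2:
        return s1, m1 - m2, e1        # different sign: subtract smaller
    else:
        return s2, m2 - m1, e1

print(fp_add(0, 8, 1, 0, 4, 0))  # 16 + 4 = 20, i.e. (0, 10, 1)
```

A real adder would also renormalize the result and round, but the shift-to-align step above is the extra work the answer refers to.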

Also note that the separate sign bit is much better for multiplication - you just need a single unsigned multiplier which handles all cases with the sign bits being taken care of separately. Compare this with two's complement multiplication, where you either have to normalise the signs or have support for signed/unsigned multiplies.
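The multiplication point can be sketched in a few lines using the same hypothetical `(sign, mantissa, exponent)` triples: the sign of the product is just the XOR of the sign bits, and a single unsigned multiply handles the magnitudes:

```python
def fp_mul(s1, m1, e1, s2, m2, e2):
    """Toy sign/magnitude float multiply (hypothetical sketch, not IEEE 754).

    The sign is computed separately with one XOR; the magnitudes need only
    one unsigned multiplier; the exponents simply add.
    """
    return s1 ^ s2, m1 * m2, e1 + e2

print(fp_mul(0, 3, 1, 1, 2, 0))  # 6 * -2 = -12, i.e. (1, 6, 1)
```

With a two's-complement mantissa you would instead need either a signed multiplier or a normalize-then-fix-the-sign step, which is exactly the extra complexity the answer describes.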

Paul R
@Paul R: so, floats are optimized for multiplications, while 2's complement integers are optimized for additions... am I correct?
Amoeba
@cambr: yes, that's a reasonable way to look at it, although there are many other issues in the design of floating point formats and associated hardware implementations
Paul R
A: 

If you dig into the standard representation of floating point numbers, it's actually an integer-like mantissa and an exponent. I say integer-like since, when normalized, the first bit is always a '1' - you know that the product of two such numbers will always start with 0 or 1 (and in the former case you need to left-shift the result by one and adjust the exponent accordingly, at the loss of a single bit of precision). Multiplication and division are well-behaved as long as you don't overflow the range of the exponent.
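You can see the mantissa/exponent split directly by unpacking the bits of an IEEE 754 double (Python's `float`). The field widths below are the standard 1/11/52 layout of a 64-bit double; the helper name is mine:

```python
import struct

def fields(x):
    """Split an IEEE 754 double into its sign, biased-exponent and
    mantissa fields. For normal numbers the mantissa's leading 1 is
    implicit (not stored), which is the 'first bit is always 1' point."""
    bits = struct.unpack('<Q', struct.pack('<d', x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF      # 11-bit exponent, bias 1023
    mantissa = bits & ((1 << 52) - 1)    # 52 stored fraction bits
    return sign, exponent, mantissa

print(fields(1.0))   # (0, 1023, 0): 1.0 = +1.0 * 2**(1023-1023)
print(fields(-2.0))  # (1, 1024, 0): -2.0 = -1.0 * 2**(1024-1023)
```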

Addition and subtraction, on the other hand, require changing the representation from the normalized form to one where the exponents match. This is why you can get seemingly bizarre results if you add two numbers that are wildly different in magnitude, or subtract two numbers that are nearly identical. It is also why intermediate results usually carry far more digits of precision than the standard 4- and 8-byte floats and reals.
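Both effects are easy to reproduce with ordinary 64-bit doubles:

```python
# Wildly different magnitudes: during exponent alignment, all of the
# smaller operand's bits are shifted out, so it contributes nothing.
big, small = 1e16, 1.0
print(big + small == big)   # True: the 1.0 vanished

# Nearly identical values: the leading digits cancel, leaving mostly
# rounding error in the low-order bits.
a, b = 1.0000001, 1.0
print(a - b)                # close to 1e-7, but not exactly 1e-7
```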

Could you use two's-complement notation here? Maybe... but you couldn't use the same rules for manipulating the representation.

I think it comes down to trusting the generation(s) of experts who have looked at the problem. If hundreds of PhDs and principal engineers think the current representation is the best approach then I have to believe them.

bgiles