I want to multiply two numbers, and detect if there was an overflow. What is the simplest way to do that?

+1  A: 

Check whether one operand is less than the maximum value divided by the other (all values taken as absolute values).

2's complement hardly has anything to do with it, since the multiplication overflows if `x*(2**n - x) > 2**M`, which is equivalent to `(x*2**n - x**2) > 2**M`, or `x**2 < (x*2**n - 2**M)`, so you'll have to compare overflowing numbers anyway (`x**2` may overflow while the result may not).
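For illustration, a minimal sketch of this check in Java, assuming both operands are non-negative (the method name is mine; negative operands need the extra care discussed in the comments):

// Sketch: division-based overflow test, assuming a >= 0 and b >= 0.
static boolean multiplicationOverflows(int a, int b) {
    // For b != 0, a * b exceeds Integer.MAX_VALUE exactly when a > Integer.MAX_VALUE / b.
    return b != 0 && a > Integer.MAX_VALUE / b;
}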

Pavel Shved
2's complement makes a difference if one factor is negative and the other positive, because the result may be MIN_VALUE, whose absolute value is one more than MAX_VALUE. So you need separate comparisons for each sign combination. And I don't see where your example of `x*(2**n-x)>2**M` comes from.
Christian Semrau
@Christian, my example comes from an attempt to group everything that can be calculated with bit-shifts on one side of the inequality, and everything else on the other.
Pavel Shved
+1  A: 

Alternatives to Pavel Shved's solution ...

If your language of choice is assembler, then you should be able to check the overflow flag. If not, you could write a custom assembler routine that sets a variable if the overflow flag was set.

If this is not acceptable, you can find the most significant set bit of both values (absolutes). If the sum of their positions exceeds the number of bits in the integer (or unsigned), then you will have an overflow if they are multiplied together.

Hope this helps.

Sparky
The last one doesn't seem right. Given unsigned numbers, 4 is 100 (base 2) or 3 bits, but 4 * 4 does not overflow a 5 bit register despite the sum of the bits being 6. The converse is true, though - if the sum of the bits is less than n, it can't overflow.
Phil
Adding bit indices can only give a clue, because, as @Phil noted, `100_2 * 100_2 = 10000_2` which does not overflow a 5 bit value, but `111_2 * 111_2 = 110001_2` which does overflow.
Christian Semrau
You're absolutely right--the best the bit indices can offer is a clue. I should have been more careful and attentive before offering it as a solution, which it is not.
Sparky
+1  A: 

If your numbers are not of the largest integral data type, then you might just cast them up, multiply, and compare with the maximum of the numbers' original type. E.g. in Java, when multiplying two int values, you can cast them to long and compare the result to Integer.MAX_VALUE or Integer.MIN_VALUE (depending on the sign combination) before casting the result down to int.
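For example, a sketch of that cast-up check for two int values (the method name is mine):

// Sketch: detect int overflow by widening to long before multiplying.
static int multiplyExactInt(int x, int y) {
    long product = (long) x * y;          // exact, since long is wide enough for any int product
    if (product > Integer.MAX_VALUE || product < Integer.MIN_VALUE) {
        throw new ArithmeticException("int overflow: " + x + " * " + y);
    }
    return (int) product;                 // now safe to narrow
}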

If the type is already the largest, then check whether one is less than the maximum value divided by the other. But do not take the absolute value! Instead you need separate comparison logic for each of the sign combinations neg*neg, pos*pos and pos*neg (neg*pos can obviously be reduced to pos*neg, and pos*pos might be reduced to neg*neg). First test for 0 arguments to allow safe divisions.

For actual code, see the Java source of the MathUtils class of the commons-math project. Look for `public static long mulAndCheck(long a, long b)`. The case for positive a and b is:

// check for positive overflow with positive a, positive b
if (a <= Long.MAX_VALUE / b) {
    ret = a * b;
} else {
    throw new ArithmeticException(msg);
}
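For completeness, here is a sketch of how the remaining sign combinations can be handled following the same pattern (an illustrative reconstruction, not the verbatim commons-math source):

// Sketch covering all sign combinations, assuming Java long arithmetic.
static long mulAndCheckSketch(long a, long b) {
    if (a == 0 || b == 0) {
        return 0;                                   // nothing to check
    }
    if (a > 0 && b > 0) {
        // positive * positive: compare against MAX_VALUE
        if (a > Long.MAX_VALUE / b) {
            throw new ArithmeticException("overflow: " + a + " * " + b);
        }
    } else if (a < 0 && b < 0) {
        // negative * negative: product is positive; note MAX_VALUE / b is negative here
        if (a < Long.MAX_VALUE / b) {
            throw new ArithmeticException("overflow: " + a + " * " + b);
        }
    } else {
        // mixed signs: product is negative; compare against MIN_VALUE
        long neg = (a < 0) ? a : b;
        long pos = (a < 0) ? b : a;
        if (neg < Long.MIN_VALUE / pos) {
            throw new ArithmeticException("overflow: " + a + " * " + b);
        }
    }
    return a * b;
}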
Christian Semrau
"you might just cast them up" - assuming each integer type is defined to be at least twice the width of the previous one. Not guaranteed in (for example) C or C++, but usually true.
Steve Jessop
Instead of dealing with the various sign combinations, would it not be sufficient to test that the operation is reversible (i.e., `(a * b) / b == a` after testing that `b != 0`)?
jamesdlin
@james In C/C++, you would not be able to do this, because the result of an overflow is not defined and depends on the hardware (it may wrap around or throw an exception), so you cannot portably(!) detect the overflow after the multiplication. In Java, the result is defined, and at first sight your suggestion should work. But, unfortunately, it does not work for `a=Long.MIN_VALUE, b=-1L`.
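A sketch of that after-the-fact division check in Java, with the corner case handled separately (it relies on Java's defined wrap-around behaviour; the method name is mine):

// Sketch: multiply, then test whether dividing the result reverses the operation.
static boolean multiplicationOverflows(long a, long b) {
    long r = a * b;                       // wraps silently on overflow in Java
    if (a == Long.MIN_VALUE && b == -1L) {
        return true;                      // the one case the division test misses
    }
    return b != 0 && r / b != a;
}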
Christian Semrau
+1  A: 

Multiplying two 32 bit numbers results in a 64 bit answer, two 8s give a 16, etc. Binary multiplication is simply shifting and adding. So if you had, say, two 32 bit operands, with bit 17 set in operand A and any of the bits above 15 or 16 set in operand B, you will overflow a 32 bit result: bit 17 shifted left by 16 is bit 33, which does not fit in a 32 bit result.

So the question, again, is: what is the size of your inputs and the size of your result? If the result is the same size as the inputs, then you have to find the most significant 1 of both operands and add those bit locations; if that sum is bigger than your result space, you will overflow.
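As the comments point out, adding the positions of the most significant bits only bounds the answer, so a sketch of this idea in Java is best phrased as a quick pre-check (helper names are mine; both operands are assumed to be unsigned 32 bit values held in longs):

// Position of the most significant set bit, counting from 1 (0 for a value of 0).
static int bitLength(long v) {
    return 64 - Long.numberOfLeadingZeros(v);
}

// Assuming 0 <= a, b < 2**32: if the bit lengths sum to at most 32, the product
// is guaranteed to fit in 32 bits; if they sum to 34 or more, it is guaranteed
// to overflow. A sum of exactly 33 is ambiguous and still needs a full check.
static boolean productSurelyFits32(long a, long b) {
    return bitLength(a) + bitLength(b) <= 32;
}

static boolean productSurelyOverflows32(long a, long b) {
    return bitLength(a) + bitLength(b) >= 34;
}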

EDIT

Yes, multiplying two 3 bit numbers will result in either a 5 bit number, or a 6 bit number if there is a carry in the add. Likewise a 2 bit and a 5 bit can result in 6 or 7 bits, etc. If the reason for the poster's question is to see whether you have space in your result variable for an answer, then this solution will work and is relatively fast for most languages on most processors. It can be significantly faster on some and significantly slower on others. It is generically fast (depending on how it is implemented, of course) to just look at the number of bits in the operands. Doubling the size of the largest operand is a safe bet if you can do it within your language or processor. Divides are downright expensive (slow), and most processors don't have one, much less at an arbitrary doubling of operand sizes. The fastest, of course, is to drop to assembler, do the multiply, and look at the overflow bit (or compare one of the result registers with zero). If your processor can't do the multiply in hardware then it is going to be slow no matter what you do. I am guessing that asm is not the right answer to this post, despite being by far the fastest and having the most accurate overflow status.

Binary makes multiplication trivial compared to decimal. For example, take the binary numbers

0b100 *
0b100 

Just like decimal math in school, you (can) start with the least significant bit of the lower operand and multiply it against all the positions in the upper operand, except that with binary there are only two choices: you multiply by zero, meaning you don't have to add to the result, or you multiply by one, which means you just shift and add; no actual multiplication is necessary like you would have in decimal.

  000 : 0 * 100
 000  : 0 * 100
100   : 1 * 100

Add up the columns and the answer is 0b10000

Same as decimal math, a 1 in the hundreds column means copy the top number and add two zeros; it works the same in any other base as well. So 0b100 times 0b110 is 0b1000 (a one in the second column over, so copy and add a zero) plus 0b10000 (a one in the third column over, so copy and add two zeros), which equals 0b11000.
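A sketch of that shift-and-add procedure in Java (the method name is mine; the result wraps just like the built-in `*` if it exceeds 64 bits):

// Sketch: binary long multiplication, one column of the lower operand at a time.
static long shiftAddMultiply(long top, long bottom) {
    long result = 0;
    for (int i = 0; i < 64; i++) {
        if (((bottom >> i) & 1L) == 1L) {
            result += top << i;   // a 1 in column i: copy the top number, shifted i places over
        }
        // a 0 in column i contributes nothing, so there is nothing to add
    }
    return result;
}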

This leads to looking at the most significant bits in both numbers. 0b1xx * 0b1xx guarantees a 1xxxx is added to the answer, and that is the largest bit location in the add; no other single input to the final add has that column, or a more significant column, populated. From there you need only one more bit in case the other bits being added up cause a carry.

Which happens in the worst case of all ones times all ones, 0b111 * 0b111:

 
0b00111 +
0b01110 +
0b11100 

This causes a carry bit in the addition, resulting in 0b110001: 6 bits. A 3 bit operand times a 3 bit operand, 3+3=6, 6 bits worst case.

So the size of the operands, measured by the most significant set bit (not the size of the registers holding the values), determines the worst case storage requirement.

Well, that is true assuming positive operands. If you consider some of these numbers to be negative it changes things but not by much.

Minus 4 times 5, 0b1111...111100 * 0b0000....000101 = -20 or 0b1111..11101100

It takes 4 bits to represent a minus 4 and 4 bits to represent a positive 5 (don't forget your sign bit). Our result required 6 bits if you strip off all the extra sign bits.

Let's look at the 4 bit corner cases:

-8 * 7 = -56
0b1000 * 0b0111 = 0b1001000 
-1 * 7 = -7 = 0b1001
-8 * -8 = 64 = 0b01000000
-1 * -1 = 1 = 0b01
-1 * -8 = 8 = 0b01000
7 * 7 = 49 = 0b0110001

Let's say we count positive numbers as the position of the most significant 1 plus one, and negative numbers as the position of the most significant 0 plus one.

-8 * 7 is 4+4=8 bits, actual 7 bits
-1 * 7 is 1+4=5 bits, actual 4 bits
-8 * -8 is 4+4=8 bits, actual 8 bits
-1 * -1 is 1+1=2 bits, actual 2 bits
-1 * -8 is 1+4=5 bits, actual 5 bits
7 * 7 is 4+4=8 bits, actual 7 bits.

So this rule works. (You can see that I called a minus one one bit; for the "plus one" part of the rule on negatives, find the most significant zero and add one.) Anyway, I argue that if this were a 4 bit * 4 bit machine as defined, you would have at least 4 bits of result, and I interpret the question as how many more than 4 bits do I need to safely store the answer. So this rule serves to answer that question for 2's complement math.
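A sketch of that counting rule in Java (the helper name is mine): for a non-negative number, take the position of the most significant 1 and add one for the sign bit; for a negative number, take the position of the most significant 0 and add one.

// Sketch: minimum 2's complement width of a value, per the rule above.
static int signedBitsNeeded(long v) {
    if (v >= 0) {
        return 64 - Long.numberOfLeadingZeros(v) + 1;    // most significant 1, plus sign bit
    } else {
        return 64 - Long.numberOfLeadingZeros(~v) + 1;   // most significant 0, plus sign bit
    }
}
// Examples: signedBitsNeeded(7) == 4, signedBitsNeeded(-8) == 4, signedBitsNeeded(-1) == 1.

Adding the two counts for the operands gives the worst case width of the product, as in the list above.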

If your question was to accurately determine overflow, and speed is secondary, then, well, it is going to be really, really slow for some systems, for every multiply you do. If this is the question you are asking, then to get some of the speed back you need to tune it a little better for the language and/or processor. Double up the biggest operand, if you can, and check for non-zero bits above the result size, or use a divide and compare. If you can't double the operand sizes, divide and compare. Check for zero before the divide.

Actually, your question doesn't specify what size of overflow you are talking about either. Good old 8086 16 bit times 16 bit gives a 32 bit result (in hardware); it can never overflow. What about some of the ARMs that have a multiply, 32 bit times 32 bit with a 32 bit result: easy to overflow. What is the size of your operands for this question, are they the same size as the result or is the result double the input size? Are you willing to perform multiplies that the hardware cannot do (without overflowing)? Are you writing a compiler library and trying to determine whether you can feed the operands to the hardware for speed, or whether you have to perform the math without a hardware multiply? That is the kind of thing you get into if you cast up the operands: the compiler library will try to cast the operands back down before doing the multiply, depending on the compiler and its library of course, and it will use the count-the-bits trick to determine whether to use the hardware multiply or a software one.

My goal here was to show how binary multiplication works in a digestible form, so you can see how much maximum storage you need by finding the location of a single bit in each operand. Now, how fast you can find that bit in each operand is the trick. If you were looking for minimum storage requirements rather than maximum, that is a different story, because it involves every one of the significant bits in both operands, not just one bit per operand; you have to do the multiply to determine minimum storage. If you don't care about maximum or minimum storage, you have to just do the multiply and look for non-zero bits above your defined overflow limit, or use a divide if you have the time or hardware.

Your tags imply you are not interested in floating point. Floating point is a completely different beast; you cannot apply any of these fixed point rules to floating point. They DO NOT work.

dwelch
You cannot decide an overflow just by looking at the most significant 1 bit of each number: `100_2 * 100_2 = 10000_2` does not overflow a 5 bit value, but `111_2 * 111_2 = 110001_2` does overflow.
Christian Semrau