This is a super naive question (I know), but I think that it will make for a good jumping off point into considering how the basic instruction set of a CPU actually gets carried out:

In a two's complement system, you cannot invert the sign of the most negative number that your implementation can represent. The theoretical reason for this is obvious: the negation of the most negative number would fall outside the implementation's range (which is always something like -128 to 127).

However, what actually happens when you try to carry out the negation operation on the most negative number is pretty strange. For example, in an 8 bit representation, the most negative number is -128, or 1000 0000 in binary. Normally, to negate a number you would flip all the bits and then add one. However, if you try to do this with -128 you end up with:

1000 0000 ->
0111 1111 ->
1000 0000

the same number that you started with. For this reason, Wikipedia calls it "the weird number".
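A quick way to watch this happen is a few lines of Python (a sketch of the flip-and-add-one rule, not of real hardware; the name `negate8` is mine):

```python
def negate8(x):
    """Two's-complement negate: invert the 8 bits, then add one."""
    flipped = x ^ 0xFF            # 1000 0000 -> 0111 1111
    return (flipped + 1) & 0xFF   # 0111 1111 + 1 -> 1000 0000 (mod 256)

print(format(negate8(0b10000000), "08b"))  # 10000000: -128 comes back unchanged
print(format(negate8(0b00000001), "08b"))  # 11111111: 1 becomes -1
```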

That same Wikipedia article says that the above negation

is detected as an overflow condition since there was a carry into but not out of the most-significant bit.

So my question is this:
A) What the heck does that mean? and
B) It seems like the CPU would need to perform an extra error checking step each and every time it carried out a basic arithmetic operation in order to avoid accidents relating to this negation, creating significant overhead. If that is the case, why not just truncate the range of numbers that can be represented to leave the weird number out (i.e. -127 to 127 for 8 bits)? If that isn't the case, how can you implement such error checking without creating extra overhead?

+3  A: 

It's not that the CPU does another check; it's that the transistors are arranged to notice when this happens. And they are built that way because the engineers picked two's complement before they started designing the thing.

The result is that the check happens during the same clock cycle in which the (possibly overflowing) result is returned.


How does it work?

The "add 1" stage is implemented as cascaded logic: starting with the LSB, each bit in turn is subjected to the truth table

old-bit  carry-in  new-bit  carry-out
-------------------------------------
   0        0        0         0
   0        1        1         0
   1        0        1         0
   1        1        0         1

(that is, new-bit = old-bit xor carry-in and carry-out = old-bit and carry-in). The "carry-in" for the LSB is the 1 that we're adding, and for each of the remaining bits it is the "carry-out" of the previous bit (which is why this has to be done as a cascade).

The last of these cells simply adds the logic for signed-overflow = (carry-in and not carry-out).
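Here is that cascade as a Python sketch (the name `increment` is mine; real hardware does this in gates, not a loop):

```python
def increment(bits):
    """Run the 'add 1' cascade over bits (LSB first), per the truth table:
    new-bit = old-bit xor carry-in, carry-out = old-bit and carry-in.
    Returns the new bits plus the signed-overflow flag from the last cell."""
    carry = 1                     # the 1 we're adding enters at the LSB
    new = []
    for b in bits:
        carry_in = carry
        new.append(b ^ carry_in)  # new-bit = old-bit xor carry-in
        carry = b & carry_in      # carry-out = old-bit and carry-in
    # after the loop, carry_in is the carry into the MSB, carry the carry out of it
    overflow = carry_in & (1 - carry)   # signed-overflow = carry-in and not carry-out
    return new, overflow

# Negating 1000 0000: the flipped bits are 0111 1111, given LSB first below
bits, v = increment([1, 1, 1, 1, 1, 1, 1, 0])
print(bits[::-1], v)   # [1, 0, 0, 0, 0, 0, 0, 0] 1 -> back to 1000 0000, overflow set
```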

dmckee
So (avoiding circuit diagrams, if possible) how does the actual logic of that check work?
tel
That truth table is awesome, in the old school sense of the word. You know way too much about this stuff. So does the additional circuit for the MSB slow the processor down at all? I'm trying to grasp at what exactly the relationship is between the circuitry in the ALU (or CPU, or whatever) and how long it takes to perform a task.
tel
Each step in the chain for a particular operation takes a finite time. Somewhere on the chip is a set of circuits for some operation that takes every last picosecond of a single clock. If the `and` in the evaluation of signed-overflow makes this operation the limiting one, then you could argue that it "slows" the chip. But I'll give dollars to donuts that it doesn't for whatever chip you care to name.
dmckee
@dmckee: although might it "slow" the chip in another sense? The extra path introduces an extra thermal overhead, which in turn affects the speed at which you can safely clock the chip. Of course, chips are clocked to round numbers, so I'd expect it to be (for example) a 1-per-million-chips chance of a 1MHz slowdown rather than a 1Hz slowdown on every chip. This assuming a manufacturing process in which chips are tested off the production line, and clocked as fast as that individual unit can handle. Chip fabrication is a game of probabilities, and not one I claim to really understand.
Steve Jessop
@Steve Maybe. The version of this described here is the introductory textbook version, and if the pavement gets any damper I'm going to be out of my depth. It looks like [dwelch is the expert answerer here](http://stackoverflow.com/questions/4046417/wrapping-my-head-around-hardware-representations-of-numbers-a-hypothetical-twos/4047453#4047453).
dmckee
"Does this slow the processor down?" No. The reason is simple to understand: the CPU needs to do something with all of those bits. In the add instruction, the sum is moved into a register, but the overflow condition lands in a separate condition register, which the add instruction itself does not use. If it turns out that the handful of gates needed to bring that condition flag fully up to date takes a little longer than the clock period, that's still OK, because it *will* be ready by the next instruction, possibly a move or conditional branch instruction.
TokenMacGuy
+4  A: 

The carry-out bit from the MSB is used as a flag to indicate that we need more bits. Without it, we would have a system of modular arithmetic¹ without any way of detecting when we wrap around.

In modular arithmetic, you don’t deal with numbers but with equivalence classes of numbers that have the same remainder. In such a system, after adding 1 to 127, you would get −128, and you would conclude that +128 and −128 belong to the same equivalence class.

If you restricted yourself to numbers in the range −127 to +127, you would have to redefine addition, since 127 + 1 = −127 is nonsense.

Two’s-complement arithmetic, when presented to you by a computer, is essentially modular arithmetic with the ability to detect an overflow.
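A small Python sketch of that equivalence (the helper `to_signed8` is just an illustration, not how hardware stores anything):

```python
def to_signed8(pattern):
    """Read an 8-bit pattern as a two's-complement value."""
    pattern &= 0xFF
    return pattern - 256 if pattern >= 128 else pattern

# Addition is plain mod-256 arithmetic; the interpretation supplies the sign:
print(to_signed8(127 + 1))   # -128: +128 and -128 are the same class mod 256
print(to_signed8(255 + 1))   # 0: wrapping all the way around
```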

This is what a 4-bit adder would look like when adding 0001 to 0111. You can see that in the MSB the carry-in and carry-out are different:

     0        0        0        1
     | 0      | 1      | 1      | 1
     | |      | |      | |      | |
     v v      v v      v v      v v
0 <- ADD <-1- ADD <-1- ADD <-1- ADD <- 0
^     |    ^   |        |        |
      v        v        v        v
      1        0        0        0

It is this flag that the ALU uses to signal that an overflow occurred, without any extra steps.
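The diagram can be mimicked in a few lines of Python (a ripple-carry sketch, not gate-level hardware; the name `ripple_add` is mine):

```python
def ripple_add(a_bits, b_bits):
    """Ripple-carry adder over bit lists (LSB first). Returns the sum bits
    and the carry into and out of the most significant cell."""
    carry = 0
    total = []
    for a, b in zip(a_bits, b_bits):
        carry_in = carry
        total.append(a ^ b ^ carry_in)                     # full-adder sum
        carry = (a & b) | (a & carry_in) | (b & carry_in)  # full-adder carry
    return total, carry_in, carry

# 0111 + 0001, the diagram's case, given LSB first:
s, cin_msb, cout_msb = ripple_add([1, 1, 1, 0], [1, 0, 0, 0])
print(s[::-1], cin_msb, cout_msb)  # [1, 0, 0, 0] 1 0 -> carry in != carry out: overflow
```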

¹ Modular arithmetic goes from 0 to 255 instead of −128 to 127, but the basic idea is the same.

jleedev
Love the diagram. Very helpful. I think I get it now. I still have one question left: although you've clarified why it's completely necessary, does adding a circuit to check the MSB slow down the ALU in any way whatsoever? I'm trying to grasp at what exactly the relationship is between the circuitry in the ALU (or CPU, or whatever) and how long it takes to perform a task.
tel
@tel Real ALUs actually have a dedicated layer to compute the carry bits, instead of the ripple carry I used here. To actually check for overflow, it is up to the software to examine the status register (there’s no interrupt or anything). Basically, you’re asking if it takes extra time to store the status from the ALU to the control register. I would think not, since it can happen in parallel with the rest of the instruction. However, the designers of [the first MIPS chip](http://en.wikipedia.org/wiki/MIPS_architecture#CPU_family) thought otherwise, according to Wikipedia.
jleedev
The fact that this is modular arithmetic also implies that addition and multiplication don't depend on whether the numbers are signed or unsigned; the result, as a bit pattern, is the same.
starblue
+2  A: 

First off, the Wikipedia article states that it cannot be negated from a negative signed number to a positive signed number. What they mean is that it takes 9 bits to represent positive 128, which you cannot do with an 8-bit register. If you are going from negative signed to positive unsigned as a conversion, then you have enough bits. And the hardware should give you 0x80 when you negate 0x80, because that is the right answer.

For add, subtract, multiply, etc., addition in two's complement is no different from the decimal math you learned in elementary school. You line up your binary numbers and add each column; the result for that column is its least significant digit, and the rest is carried over to the next column. So, adding 0b001 to 0b001, for example:

 1
001
001
===
010

Add the two ones in the rightmost column; the result is 0b10 (2 decimal), so write zero and carry the one. One plus zero plus zero is one, nothing to carry; zero plus zero is zero; the result is 0b010.

The rightmost column is where 1 plus 1 is 0b10 and we write 0 and carry the one. That carried one is at the same time the carry out of the rightmost column and the carry in of the second column. Also, with pencil-and-paper math we normally only talk about a carry when it is non-zero, but if you think about it you are always carrying a number; our second column is one plus zero is one, carry the zero.

You can think of a two's-complement negate as invert and add one, or as walking the bits from the least significant end, copying them up through the first one and inverting the rest, or as taking the result of zero minus the number.

You can work subtraction in binary with pencil and paper, for what it is worth; the borrowing makes your head hurt compared to decimal, but it works. For what you are asking, though, think of invert and add one.

It is easier to wrap your head around this if you take it down to even fewer bits than 8; three is a manageable number, and it all scales up from there.

So, in the table below, the first column is the input, the second column is the inverted version, and the third column is the second column plus one. The fourth column is the carry into the msbit, and the fifth column is the carry out of the msbit.

000 111 000 1 1
001 110 111 0 0
010 101 110 0 0
011 100 101 0 0
100 011 100 1 0
101 010 011 0 0
110 001 010 0 0
111 000 001 0 0
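The table can be generated mechanically; here is a Python sketch (the carry logic is just a restatement of the cascade rule, and the name `negate_table` is mine):

```python
def negate_table(bits=3):
    """(input, inverted, inverted+1, msb carry-in, msb carry-out) for each value."""
    mask = (1 << bits) - 1
    low_mask = mask >> 1                                # every bit below the msbit
    rows = []
    for n in range(1 << bits):
        inv = n ^ mask                                  # the invert step
        cin = 1 if (inv & low_mask) == low_mask else 0  # +1 carries into the msbit
        cout = cin & (inv >> (bits - 1))                # ...and out, if msbit was set too
        rows.append((n, inv, (inv + 1) & mask, cin, cout))
    return rows

for n, inv, neg, cin, cout in negate_table():
    print(f"{n:03b} {inv:03b} {neg:03b} {cin} {cout}")
```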

A real quick look at adding one to two bits:

00+1 = 001
01+1 = 010
10+1 = 011
11+1 = 100

For the case of adding one to a number, the only case where you carry out of the second bit into the third is when your bits are all ones; a single zero in there stops the cascading carry. So in the three-bit inversion table above, the only two cases with a carry into the msbit are 111 and 011, because those are the only two whose lower bits are all set. For the 111 case the msbit has a carry in and a carry out; for the 011 case the msbit has a carry in but not a carry out.

So, as stated by someone else, there are transistors wired up in the chip: if the msbit carry in is set and the msbit carry out is not set, then set a flag somewhere; otherwise clear the flag.

So note that the three-bit examples above scale. If, after you invert and before you add one, you have 0b01111111, then you are going to get a carry in without a carry out. If you have 0b11111111, then you get both a carry in and a carry out. Note that zero is also a number that comes back unchanged when you negate it; the difference is that, with the bits considered as signed, negated zero can still be represented, while a one followed by all zeros cannot.

The bottom line, though, is that this is not a crisis or end-of-the-world thing. There is a whole lot of math and other operations in the processor where carry bits and significant bits fall off one side or the other, and overflows and underflows fire off, etc. Most of the time programmers never check for such conditions and those bits just fall on the floor, sometimes causing the program to crash; other times the programmer uses 16-bit numbers for 8-bit math just to make sure nothing bad happens, or 8-bit numbers for 5-bit math for the same reason.

Note that the hardware doesn't know signed from unsigned for addition and subtraction; also, the hardware doesn't know how to subtract. Hardware adders are one-bit adders with three inputs (two operand bits and a carry in) and two outputs (a result bit and a carry out). Wire 8 of these up and you have an 8-bit adder or subtractor. Add without carry is the two operands wired in directly, with a zero on the lsbit carry in. Add with carry is the two operands wired in directly, with the carry flag on the lsbit carry in. Subtract is an add with the second operand inverted and a one on the lsbit carry in. At least from a high-level perspective; that logic can all get optimized and implemented in ways often too hard to understand on casual inspection.
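That wiring, sketched arithmetically in Python (the adder is modeled with integer math rather than gates, and names like `add8` and `sub8` are mine):

```python
def add8(a, b, carry_in):
    """8-bit add with carry in; returns (result, carry out)."""
    total = (a & 0xFF) + (b & 0xFF) + carry_in
    return total & 0xFF, total >> 8

def sub8(a, b):
    """Subtract wired as: second operand inverted, one on the carry in."""
    return add8(a, b ^ 0xFF, 1)

print(sub8(5, 3))   # (2, 1): carry out set means no borrow occurred
print(sub8(3, 5))   # (254, 0): 254 read as signed is -2; carry clear means borrow
```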

The really fun exercise is multiply. Think about doing binary multiplication with pencil and paper, then realize it is much easier than decimal, because it is just a series of shifts and adds. Given enough gates, you can express each result bit as an equation whose inputs are the operand bits, meaning you can do a single-clock multiply if you wish. In the early days that was too many gates, so multi-clock shift-and-add units were built; today we burn the gates and get single-clock multiplies. Also note that understanding this tells you that if you do, say, a 16-bit = 8-bit times 8-bit multiply, the lower 8 bits of the result are the same whether the multiply is signed or unsigned. Since most folks do things like int = int * int;, you really don't need a signed or unsigned multiply if all you care about is the result bits (no checking of flags, etc.). Fun stuff.
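A shift-and-add sketch in Python (illustrative only; `mul8` is my name), which also shows the low bits not caring about signedness:

```python
def mul8(a, b):
    """Shift-and-add multiply: one shifted copy of a per set bit of b."""
    product = 0
    for i in range(8):
        if (b >> i) & 1:          # for each set bit of b...
            product += a << i     # ...add a shifted copy of a
    return product & 0xFFFF       # 8 bit x 8 bit -> 16 bit result

# 0xFF * 0xFF is (-1) * (-1) = 1 read as signed, 65025 = 0xFE01 read as
# unsigned; the low byte is 0x01 either way.
print(hex(mul8(0xFF, 0xFF)))   # 0xfe01
```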

dwelch
A: 

In the ARM Architecture Manual (DDI100E):

OverflowFrom
     Returns 1 if the addition or subtraction specified as its parameter 
     caused a 32-bit signed overflow. [...]
     Subtraction causes an overflow if the operands have different signs,
     and the first operand and the result have different signs.

NEG
     [...]
     V Flag = OverflowFrom(0 - Rm)

NEG is the instruction for computing the negation of a number, i.e., its two's complement.

The V flag signals signed overflow and can be used for conditional branching. It's fairly standard across different processor architectures, together with the three other flags Z (zero), C (carry) and N (negative).

For 0 - (-128) = 0 + 128 = -128 the first operand is 0 and the second operand as well as the result is -128, so the condition for overflow is satisfied, and the V flag is set.
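Scaled down to 8 bits for the -128 example, the OverflowFrom rule for subtraction can be sketched in Python (the function name is mine; the manual defines it for 32 bits):

```python
def v_flag_for_neg(rm, bits=8):
    """OverflowFrom(0 - Rm), per the quoted rule: overflow if the operands
    have different signs, and the first operand and the result have
    different signs."""
    sign = lambda x: (x >> (bits - 1)) & 1
    result = (0 - rm) & ((1 << bits) - 1)   # the wrapped subtraction result
    return sign(0) != sign(rm) and sign(0) != sign(result)

print(v_flag_for_neg(0x80))   # True: negating -128 sets V
print(v_flag_for_neg(0x01))   # False: negating 1 is fine
```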

starblue