views:

268

answers:

6

I was helping someone with their homework and ran into this strange issue. The problem is to write a function that reverses the order of bytes of a signed integer(That's how the function was specified anyway), and this is the solution I came up with:

int reverse(int x)
{
    int reversed = 0;

    reversed = (x & (0xFF << 24)) >> 24;
    reversed |= (x & (0xFF << 16)) >> 8;
    reversed |= (x & (0xFF << 8)) << 8;
    reversed |= (x & 0xFF) << 24;

    return reversed;
}

If you pass 0xFF000000 to this function, the first assignment will result in 0xFFFFFFFF. I don't really understand what is going on, but I know it has something to do with conversions back and forth between signed and unsigned, or something like that.

If I either append ul to 0xFF it works fine, which I assume is because it's forced to unsigned then converted to signed or something in that direction. The resulting code also changes; without the ul specifier it uses sar(shift arithmetic right), but as unsigned it uses shr as intended.

I would really appreciate it if someone could shed some light on this for me. I'm supposed to know this stuff, and I thought I did, but I'm really not sure what's going on here.

Thanks in advance!

+10  A: 

Since x is a signed quantity, the result of (x & (0xFF << 24)) is 0xFF000000 which is also signed and thus a negative number since the top (sign) bit is set. The >> operator on int (a signed value) performs sign extension (Edit: though this behaviour is undefined and implementation-specific) and propagates the sign bit value of 1 as the value is shifted to the right.

You should rewrite the function as follows to work exclusively on unsigned values:

unsigned reverse(unsigned x)
{
    unsigned int reversed = 0;

    reversed = (x & (0xFF << 24)) >> 24;
    reversed |= (x & (0xFF << 16)) >> 8;
    reversed |= (x & (0xFF << 8)) << 8;
    reversed |= (x & 0xFF) << 24;

    return reversed;
}
Richard Cook
I wish comments could be downvoted. Bits 31-24 swap with 7-0 (shift 24), 23-16 swap with 15-8 (shift 8, not 16). Rolled back to the correct algorithm.
Ben Voigt
Sorry about that...
Jason R. Mick
It would be worth mentioning that the left shift overflowing into the sign bit has **undefined behavior** and the subsequent right shift (with sign extension) is **implementation specific** behavior. To make a long story short, **you should be using unsigned types here**.
R..
It is incorrect to say that the `>>` operator performs sign extension. In fact the result of the `>>` operator on a negative, signed value is implementation-defined (which in practice means that it can either perform sign extension or not).
caf
@R..: That is true for C. In C++ the behavior is not undefined, although the _value_ of the result does depend on the implementation specific representation of signed integer types.
Charles Bailey
Yes -- I told my friend that I knew that signed types would cause grief here before I started helping him out with the problem, but they have no choice. We all know what programming teachers can be like. But thanks for all the answers and comments, guys. I always like learning something new.
identity
+1  A: 

x is signed, so the highest bit is used for the sign. 0xFF000000 means "negative 0x7F000000". When you do the shift, the result is "sign extended": The binary digit that is added on the left to replace the former MSB that was shifted right, is always the same as the sign of value. So

0xFF000000 >> 1 == 0xFF800000
0xFF000000 >> 2 == 0xFFC00000
0xFF000000 >> 3 == 0xFFE00000
0xFF000000 >> 4 == 0xFFF00000

If the value being shifted is unsigned, or if the shift is toward the left, the new bit would be 0. It's only in right-shifts of signed values that sign-extension come into play.

James Curran
0xFF000000 != -0x7F000000 (at least, not as a rule, processors that use independent sign bits for integers rather than complement representation have to be < 1% of the market). If it was, sign-extension wouldn't be necessary, sign preservation would be.
Ben Voigt
For the record, on 32-bit 2's complement machines such as the one which the OP is clearly using, `0xFF000000` is `-0x01000000`. In 2's complement, `-x` is equivalent to `(~x) + 1`. That is, complement every bit, then add 1 to the result.
RBerteig
True, but that doesn't significantly change my answer, but just drags in some extranous detail irrelevant to the question. Had I brought in up, it would have just made a simple answer more confusing.
James Curran
+7  A: 

From your results we can deduce that you are on a 32-bit machine.

(x & (0xFF << 24)) >> 24

In this expression 0xFF is an int, so 0xFF << 24 is also an int, as is x.

When you perform the bitwise & between two int, the result is also an int and in this case the value is 0xFF000000 which on a 32-bit machine means that the sign bit is set, so you have a negative number.

The result of performing a right-shift on an object of signed type with a negative value is implementation-defined. In your case, as sign-preserving arithmetic shift right is performed.

If you right-shift an unsigned type, then you would get the results that you were expecting for a byte reversal function. You could achieve this by making either operand of the bitwise & operand an unsigned type forcing conversion of both operands to the unsigned type. (This is true on any implementation where an signed int can't hold all the possible range of positive values of an unsigned int which is nearly all implementations.)

Charles Bailey
+3  A: 

Right shift on signed types is implementation defined, in particular the compiler is free to do an arithmetic or logical shift as pleases. This is something you will not notice if the concrete value that you are treating is positive, but as soon as it is negative you may fall into a trap.

Just don't do it, this is not portable.

Jens Gustedt
Technically, only right shift on *negative* signed quantities is implementation-defined. Right shift of positive numbers has defined behaviour.
caf
@caf: right, I'll make my answer a bit more precise.
Jens Gustedt
A: 

If this is java code you should use '>>>' which is an unsigned right shift, otherwise it will sign extend the value

Andrew Hudson
It's C/C++, hence the tag.
Andrew Dunn
A: 

If you want it to work the same on al platforms with both signed and unsigned integers, change

(x & (0xFF << 24)) >> 24

into

(x >> 24) & 0xFF
Bart van Heukelom