If the value after the shift operator is greater than the number of bits in the left-hand operand, the result is undefined. If the left-hand operand is unsigned, the right shift is a logical shift so the upper bits will be filled with zeros. If the left-hand operand is signed, the right shift may or may not be a logical shift (that is, the behavior is undefined).

Can somebody explain to me what the above lines mean?

+1  A: 

I suppose the key word is "undefined", which means that the specification does not say what should happen. Most compilers will do something sensible in such cases, but you cannot depend on any behaviour generally. It is usually best to avoid invoking undefined behavior unless the documentation for the compiler you are using states what it does in the specific case.

The first sentence says it's undefined if you try to shift, for example, a 32 bit value by more than 32 bits.

The second says that if you shift an unsigned int right, the left hand bits will get filled with zeros.

The third says that if you shift a signed int right, it is not defined what will be put in the left hand bits.

invariant
Note that, in the third, it says that the *behavior* is undefined, not merely the value.
Brooks Moses
It does say that, and it's wrong to do so.
Steve Jessop
+2  A: 

If the value after the shift operator is greater than the number of bits in the left-hand operand, the result is undefined.

If you try to shift a 32-bit integer by 33, the result is undefined, i.e., it may or may not be all zeros.

If the left-hand operand is unsigned, the right shift is a logical shift so the upper bits will be filled with zeros.

Unsigned data types are padded with zeros when right shifting.

so 1100 >> 1 == 0110

If the left-hand operand is signed, the right shift may or may not be a logical shift (that is, the behavior is undefined).

If the data type is signed, the behavior is not defined. Signed data types are stored in a special format, where the left most bit indicates positive or negative. So shifting on a signed integer may not do what you expect. See the Wikipedia article for details.

http://en.wikipedia.org/wiki/Logical_shift

konforce
That's a pretty significant, and fairly misleading, simplification of signed data types.
Dennis Zickefoose
http://en.wikipedia.org/wiki/Signed_number_representations
konforce
@Dennis: I'd use a stronger phrase than "fairly misleading", since some of the statements are just wrong.
David Thornley
I was just explaining what the quotes were saying, as that's all the OP asked. For all I know, he already knew what the specs were and was trying to determine if the author was correct. After all, the OP is a codeguru. ;) Regardless, I knew somebody with the entire C++ standard language specifications memorized would stop by and regurgitate the whole thing for him.
konforce
+4  A: 

I'm assuming you know what shifting means. Let's say you're dealing with 8-bit chars:

unsigned char c;
c >> 9;
c >> 4;
signed char c;
c >> 4;

For the first shift, the compiler is free to do whatever it wants, because 9 > 8 (the number of bits in a char). Undefined behavior means all bets are off; there is no way of knowing what will happen. The second shift is well defined. You get 0s on the left: 11111111 becomes 00001111. The third shift is, like the first, undefined.

Note that, in this third case, it doesn't matter what the value of c is. When it refers to signed, it means the type of the variable, not whether or not the actual value is greater than zero. signed char c = 5 and signed char c = -5 are both signed, and shifting to the right is undefined behavior.

Dennis Zickefoose
Please note that, whatever this language is, it isn't C++. This directly contradicts the Standard.
David Thornley
Aside from the re-use of `c` in the code example, this is indeed what the quoted passage says (and means). So +1 for that. -1 for not noticing that the quoted passage is a load of old rubbish ;-)
Steve Jessop
The fact that the passage was incorrect went right over my head. It's definitely what I remember the behavior being; I wonder if I read that same book at some point and have been living a lie ever since?
Dennis Zickefoose
@Dennis: Rest assured that reading a book and carrying around a mistaken belief for years only happen to you, and never to the other six billion of us. You're unique in that.
David Thornley
@David: I know, that's why I'm so upset.
Dennis Zickefoose
+4  A: 

If the value after the shift operator is greater than the number of bits in the left-hand operand, the result is undefined.

It means (unsigned int)x >> 33 can do anything[1].

If the left-hand operand is unsigned, the right shift is a logical shift so the upper bits will be filled with zeros.

It means 0xFFFFFFFFu >> 4 must be 0x0FFFFFFFu

If the left-hand operand is signed, the right shift may or may not be a logical shift (that is, the behavior is undefined).

It means 0xFFFFFFFF >> 4 can be 0xFFFFFFFF (arithmetic shift) or 0x0FFFFFFF (logical shift) or anything-allowed-by-physical-law, i.e. the result is undefined.

[1]: on a 32-bit machine with a 32-bit int.

KennyTM
"Can be anything" is not entirely correct -- more accurate would be to say that "executing `(int)x >> 33` can do anything", including put your program into a state where it executes unrelated code incorrectly. Also, in the third case, if it's saying that the behavior is "undefined" rather than "implementation-defined", then that one can do anything as well.
Brooks Moses
The standard does not specify that the results of UB must follow physical law.
Noah Roberts
In C++, the result of overshifting is undefined, the value of a signed integer type that is in fact negative that is right-shifted is implementation-defined (although the bit pattern is defined), and everything else is well defined.
David Thornley
+2  A: 

To give some context, here's the start of that paragraph:

The shift operators also manipulate bits. The left-shift operator (<<) produces the operand to the left of the operator shifted to the left by the number of bits specified after the operator. The right-shift operator (>>) produces the operand to the left of the operator shifted to the right by the number of bits specified after the operator.

Now the rest, with explanations:

If the value after the shift operator is greater than the number of bits in the left-hand operand, the result is undefined.

If you have a 32 bit integer and you try to bit shift 33 bits, that's not allowed and the result is undefined. In other words, the result could be anything, or your program could crash.

If the left-hand operand is unsigned, the right shift is a logical shift so the upper bits will be filled with zeros.

This says that it's defined to write a >> b when a is an unsigned int. As you shift right, the least significant bits are removed, other bits are shifted down, and the most significant bits become zero.

In other words:

This:    110101000101010 >> 1
becomes: 011010100010101

If the left-hand operand is signed, the right shift may or may not be a logical shift (that is, the behavior is undefined).

Actually, I believe the behaviour here is implementation-defined when a is negative and well defined when a is nonnegative, rather than undefined as the quote suggests. This means that if you do a >> b when a is a negative integer, there are several different things that might happen. To see which you get, you should read the documentation for your compiler. A common implementation is to shift in zeros if the number is nonnegative, and ones if the number is negative, but you shouldn't rely on this behaviour if you wish to write portable code.

Mark Byers
If the left-hand operand is signed, then either it's nonnegative (and the result is defined) or it's negative (and the resulting value is implementation-defined).
David Thornley
@David: Yep, I know that now after reading Steve's answer, but thanks anyway. I'll update the post. :)
Mark Byers
+13  A: 

It doesn't matter too much what those lines mean, they are substantially incorrect.

"If the value after the shift operator is greater than the number of bits in the left-hand operand, the result is undefined."

This is true, but it should say "greater than or equal to". 5.8/1:

... the behavior is undefined if the right hand operand is negative, or greater than or equal to the length in bits of the promoted left operand.

Undefined behavior means "don't do it" (see later). That is, if int is 32 bits on your system, then you can't validly do any of the following:

int a = 0; // this is OK
a >> 32;   // undefined behavior
a >> -1;   // UB
a << 32;   // UB
a = (0 << 32); // Either UB, or possibly an ill-formed program. I'm not sure.
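If the shift count comes from user input or a computation, one way to stay out of undefined behavior is to check it before shifting. The following is only a minimal sketch: `shift_count_ok` and `checked_shr` are hypothetical helper names, not standard library functions, and returning 0 for an out-of-range count is just one possible policy (an assertion or an exception would be equally reasonable).

```cpp
#include <climits>
#include <cstdint>

// Hypothetical helper: true when `count` is a valid shift count for a
// 32-bit left operand, i.e. nonnegative and strictly less than 32.
constexpr bool shift_count_ok(int count) {
    return count >= 0 &&
           count < static_cast<int>(sizeof(std::uint32_t) * CHAR_BIT);
}

// Hypothetical helper: right shift that never evaluates an out-of-range
// shift. Returning 0 for a bad count is an assumption made for this sketch.
constexpr std::uint32_t checked_shr(std::uint32_t value, int count) {
    return shift_count_ok(count) ? (value >> count) : 0u;
}
```

With this in place, `checked_shr(x, 32)` and `checked_shr(x, -1)` are ordinary function calls with a defined result, whereas `x >> 32` and `x >> -1` are not.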

"If the left-hand operand is unsigned, the right shift is a logical shift so the upper bits will be filled with zeros."

This is true. 5.8/3 says:

If E1 has unsigned type or if E1 has a signed type and a nonnegative value, the result is the integral part of the quotient of E1 divided by the quantity 2 raised to the power E2

if that makes any more sense to you. >>1 is the same as dividing by 2, >>2 dividing by 4, >>3 by 8, and so on. In a binary representation of a positive value, dividing by 2 is the same as moving all the bits one to the right, discarding the smallest bit, and filling in the largest bit with 0.
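To make the "shift equals division by a power of two" reading concrete, here is a small sketch for unsigned values. The function names `shr` and `div_pow2` are made up for illustration, and both assume the shift count `n` is less than 32.

```cpp
#include <cstdint>

// Right shift of an unsigned value: zeros come in from the left.
constexpr std::uint32_t shr(std::uint32_t x, unsigned n) {
    return x >> n;
}

// The same result written as integer division by 2 raised to the power n.
// Assumes n < 32 so that the left shift building the divisor is valid.
constexpr std::uint32_t div_pow2(std::uint32_t x, unsigned n) {
    return x / (std::uint32_t{1} << n);
}
```

For example, `shr(200u, 3)` and `div_pow2(200u, 3)` both give 25, since 200 / 8 is 25.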

"If the left-hand operand is signed, the right shift may or may not be a logical shift (that is, the behavior is undefined)."

First part is true (it may or may not be a logical shift - it is on some compilers/platforms but not others. I think by far the most common behaviour is that it is not). Second part is false, the behavior is not undefined. Undefined behavior means that anything is permitted to happen - a crash, demons flying out of your nose, a random value, whatever. The standard doesn't care. There are plenty of cases where the C++ standard says behavior is undefined, but this is not one of them.

In fact, if the left hand operand is signed, and the value is nonnegative, then it behaves the same as an unsigned shift.

If the left hand operand is signed, and the value is negative, then the resulting value is implementation-defined. It isn't allowed to crash or catch fire. The implementation must produce a result, and the documentation for the implementation must contain enough information to define what the result will be. In practice, the "documentation for the implementation" starts with the compiler documentation, but that might refer you implicitly or explicitly to other docs for the OS and/or the CPU.

Again from the standard, 5.8/3:

If E1 has signed type and negative value, the resulting value is implementation-defined.
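If you need a logical shift of a signed value regardless of what your implementation picks, one common approach is to convert to the unsigned type first, because that conversion and the unsigned shift are both fully defined. This is a sketch under the assumption that 32-bit fixed-width types are available; `logical_shr` and `shift_is_arithmetic` are hypothetical names, and `count` is assumed to be less than 32.

```cpp
#include <cstdint>

// Logical right shift of a signed value: the conversion to unsigned is
// defined modulo 2^32, and the unsigned shift fills with zeros.
constexpr std::uint32_t logical_shr(std::int32_t value, unsigned count) {
    return static_cast<std::uint32_t>(value) >> count;
}

// Reports what this particular implementation does with a negative left
// operand. The answer is implementation-defined under the rule quoted
// above, so do not hard-code an expectation for it.
inline bool shift_is_arithmetic() {
    return (-1 >> 1) == -1;
}
```

For example, `logical_shr(-16, 2)` is 0x3FFFFFFC on every conforming implementation, while the value of `-16 >> 2` is whatever your compiler documents.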

Steve Jessop
I'd +1 this but I ran out of votes!
Mark Byers
@Mark: I've got you covered. :) ...crap! I want to +1 this but I used it for Mark!
GMan
@GMan: I +1ed for you. Now someone's going to have to cover for me.
James McNellis
If it makes anyone feel better, I've hit the cap for the day :-)
Steve Jessop
@Steve: Oh goodie. :) @James: Thanks for your selfless and apparently inane sacrifice. :)
GMan