ansaurus

Question

Why use only the lower five bits of the shift operand when shifting a 32-bit value? (e.g. (UInt32)1 << 33 == 2)

Answer 1

+4 A:

It basically boils down to the way the x86 handles the arithmetic shift opcodes: it only uses the bottom 5 bits of the shift count. See the 80386 programming guide, for example. In C/C++, it's technically undefined behavior to do a bit shift by more than 31 bits (for a 32-bit integer), going with the C philosophy of "you don't pay for what you don't need". From section 6.5.7, paragraph 3 of the C99 standard:

The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

This allows compilers to omit a single shift instruction on x86 for shifts. 64-bit shifts cannot be done in one instruction on x86. They use the SHLD/SHRD instructions plus some additional logic. On x86_64, 64-bit shifts can be done in one instruction.

For example, gcc 3.4.4 emits the following assembly for a 64-bit left-shift by an arbitrary amount (compiled with -O3 -fomit-frame-pointer):

uint64_t lshift(uint64_t x, int r)
{
  return x << r;
}

_lshift:
    movl    12(%esp), %ecx
    movl    4(%esp), %eax
    movl    8(%esp), %edx
    shldl   %cl,%eax, %edx
    sall    %cl, %eax
    testb   $32, %cl
    je      L5
    movl    %eax, %edx
    xorl    %eax, %eax
L5:
    ret

Now, I'm not very familiar with C#, but I'm guessing it has a similar philosophy -- design the language to allow it to be implemented as efficiently as possible. By specifying that shift operations only use the bottom 5/6 bits of the shift count, it permits the JIT compiler to compile the shifts as optimally as possible. 32-bit shifts, as well as 64-bit shifts on 64-bit systems, can get JIT compiled into a single opcode.

If C# were ported to a platform that had different behavior for its native shift opcodes, then this would actually incur an extra performance hit -- the JIT compiler would have to ensure that the standard is respected, so it would have to add extra logic to make sure only the bottom 5/6 bits of the shift count were used.

Adam Rosenfield 2009-03-13 21:44:41

Then what is the explanation for shift33a? Is this a backwards compatibility thing?

Daniel LeCheminant 2009-03-13 21:46:56

I doubt C#, being a CLR language, should pay much attention to how x68 behaves. Sure, it's the only platform it was officially supported on, but still, a VM doesn't need much compat to existing processors.

Joey 2009-03-13 21:49:22

I see nothing in ISO C99 (draft) about bitshifts >31 being undefined, nor do I see why that would be the case. C is (in)famously wordsize agnostic.

Ken 2009-03-13 22:06:40

The IL appears to be able to do shifts of more than 31 bits (this was the case for shift33a, which became `L_0010: ldc.i4.s 0x21`, which had a result of 0)

Daniel LeCheminant 2009-03-13 22:43:35

Answer 2

+1 A:

I wrote this simple test in C (gcc, linux), and got similar results. What is interesting though, is that the constant defines for over shifting were turned into zero instead of wrapping around. It did give a warning about those, so at least there was some recognition that it was an "incorrect" thing to do.

#include <stdio.h>

unsigned int is0 = 1 << 31;
unsigned int is1 = 1 << 32;
unsigned int is2 = 1 << 33;

int main()
{
   unsigned int loopy = 0;
   int x = 0;
   printf("0x%08X\n", is0);
   printf("0x%08X\n", is1);
   printf("0x%08X\n", is2);


   for (x = 0; x < 35; ++x)
   {
      loopy = 1 << x;
      printf("%02d 0x%08X\n", x,loopy);
   }

   return 0;
}

Here are the results:

0x80000000
0x00000000
0x00000000
00 0x00000001
01 0x00000002
02 0x00000004
03 0x00000008
04 0x00000010
05 0x00000020
06 0x00000040
07 0x00000080
08 0x00000100
09 0x00000200
10 0x00000400
11 0x00000800
12 0x00001000
13 0x00002000
14 0x00004000
15 0x00008000
16 0x00010000
17 0x00020000
18 0x00040000
19 0x00080000
20 0x00100000
21 0x00200000
22 0x00400000
23 0x00800000
24 0x01000000
25 0x02000000
26 0x04000000
27 0x08000000
28 0x10000000
29 0x20000000
30 0x40000000
31 0x80000000
32 0x00000001
33 0x00000002
34 0x00000004

grieve 2009-03-13 22:04:13

Answer 3

+3 A:

Unit32 overflows at 32 bits, that is defined in the spec. What are you expecting?

The CLR does not define a left shift with overflow detection operator(1). If you need that kind of facility you need to check for yourself.

(1) The C# compiler might cast it to long, but I am not sure.

leppie 2009-03-13 22:32:21

Well, intuitively, I would expect that left shifting 1 32 bits to the left would come out 0 (because of the overflow), and not 1.

Daniel LeCheminant 2009-03-13 22:38:57

It's undefined, the CLR can do what it wants. I dont make the rules... :)

leppie 2009-03-13 22:45:40

Yeah, looking at http://download.microsoft.com/download/D/C/1/DC1B219F-3B11-4A05-9DA3-2D0F98B20917/Partition%20III%20CIL.doc: If the “Shift-By” operand is larger than the width of the “To-Be-Shifted” operand, then the results are unspecified."

Daniel LeCheminant 2009-03-13 22:51:32

I have been looking through you question again, but I cannot understand what you mean by 5 bits, where do you get that from?

leppie 2009-03-13 22:55:53

I mean that only the lower 5 bits of the shift amount are actually used when determining how many bits to shift the value by. So a shift of 33 on a 32-bit value is effectively a shift of 1 (because the lower five bits of 33 == 00001)

Daniel LeCheminant 2009-03-13 23:09:50

@leppie: I've adjusted the title and question to make that a little more clear

Daniel LeCheminant 2009-03-13 23:12:09

ansaurus

tags:

views:

answers:

Why use only the lower five bits of the shift operand when shifting a 32-bit value? (e.g. (UInt32)1 << 33 == 2)

related questions