Does the C++ compiler optimize this operation?
I would love to believe that yes.
Does the C++ compiler optimize this operation?
I would love to believe that yes.
Yes. They also optimize other similar operations, such as multiplying by non-powers of two that can be rewritten as the sums of some shifts. They will also optimize divisions by powers of 2 into right-shifts, but beware that when working with signed integers, the two operations are different! The compiler has to emit some extra bit twiddling instructions to make sure the results are the same for positive and negative numbers, but it's still faster than doing a division. It also similarly optimizes moduli by powers of 2.
Actually VS2008 optimizes this to x+x:
01391000 push ecx
int x = 0;
scanf("%d", &x);
01391001 lea eax,[esp]
01391004 push eax
01391005 push offset string "%d" (13920F4h)
0139100A mov dword ptr [esp+8],0
01391012 call dword ptr [__imp__scanf (13920A4h)]
int y = x * 2;
01391018 mov ecx,dword ptr [esp+8]
0139101C lea edx,[ecx+ecx]
In an x64 build it is even more explicit and uses:
int y = x * 2;
000000013FB9101E mov edx,dword ptr [x]
printf("%d", y);
000000013FB91022 lea rcx,[string "%d" (13FB921B0h)]
000000013FB91029 add edx,edx
This is will the optimization settings on 'Maximize speed' (/O2)
VS 2008 optimized mine to x << 1.
x = x * 2;
004013E7 mov eax,dword ptr [x]
004013EA shl eax,1
004013EC mov dword ptr [x],eax
EDIT: This was using VS default "Debug" configuration with optimization disabled (/Od). Using any of the optimization switches (/O1, /O2 (VS "Retail"), or /Ox) results in the the add self code Rob posted. Also, just for good measure, I verified x = x << 1
is indeed treated the same way as x = x * 2
by the cl compiler in both /Od and /Ox. So, in summary, cl.exe version 15.00.30729.01 for x86 treats * 2
and << 1
identically and I expect nearly all other recent compilers do the same.
It depends on what compiler you have. Visual C++ for example is notoriously poor in optimizing. If you edit your post to say what compiler you are using, it would be easier to answer.
The answer is "if it is faster" (or smaller). This depends on the target architecture heavily as well as the register usage model for a given compiler. In general, the answer is "yes, always" as this is a very simple peephole optimization to implement and is usually a decent win.
Unless something is specified in a languages standard you'll never get a guaranteed answer to such a question. When in doubt have your compiler spit out assemble code and check. That's going to be the only way to really know.
This article from Raymond Chen could be interesting:
When is x/2 different from x>>1? : http://blogs.msdn.com/oldnewthing/archive/2005/05/27/422551.aspx
Quoting Raymond:
Of course, the compiler is free to recognize this and rewrite your multiplication or shift operation. In fact, it is very likely to do this, because x+x is more easily pairable than a multiplication or shift. Your shift or multiply-by-two is probably going to be rewritten as something closer to an add eax, eax instruction.
[...]
Even if you assume that the shift fills with the sign bit, The result of the shift and the divide are different if x is negative.
(-1) / 2 ≡ 0
(-1) >> 1 ≡ -1[...]
The moral of the story is to write what you mean. If you want to divide by two, then write "/2", not ">>1".
We can only assume it is wise to tell the compiler what you want, not what you want him to do: The compiler is better than an human is at optimizing small scale code (thanks for Daemin to point this subtle point): If you really want optimization, use a profiler, and study your algorithms' efficiency.
I'm sure they all do these kind of optimizations, but I wonder if they are still relevant. Older processors did multiplication by shifting and adding, which could take a number of cycles to complete. Modern processors, on the other hand, have a set of barrel-shifters which can do all the necessary shifts and additions simultaneously in one clock cycle or less. Has anyone actually benchmarked whether these optimizations really help?
@Ferruccio Barletta
That's a good question. I went Googling to try to find the answer.
I couldn't find answers for Intel processors directly, but this page has someone who tried to time things. It shows shifts to be more than twice as fast as ads and multiplies. Bit shifts are so simple (where a multiply could be a shift and an addition) that this makes sense.
So then I Googled AMD, and found an old optimization guide for the Athlon from 2002 that lists that lists the fastest ways to multiply numbers by contants between 2 and 32. Interestingly, it depends on the number. Some are ads, some shifts. It's on page 122.
A guide for the Athlon 64 shows the same thing (page 164 or so). It says multiplies are 3 (in 32-bit) or 4 (in 64-bit) cycle operations, where shifts are 1 and adds are 2.
It seems it is still useful as an optimization.
Ignoring cycle counts though, this kind of method would prevent you from tying up the multiplication execution units (possibly), so if you were doing lots of multiplications in a tight loop where some use constants and some don't the extra scheduling room might be useful.
But that's speculation.
That's only the start of what optimizers can do. To see what your compiler does, look for the switch that causes it to emit assembler source. For the Digital Mars compilers, the output assembler can be examined with the OBJ2ASM tool. If you want to learn how your compiler works, looking at the assembler output can be very illuminating.