When I read this question I remembered someone once telling me (many years ago) that from an assembler-point-of-view, these two operations are very different:

n = 0;

n = n - n;

Is this true, and if it is, why is it so?

EDIT: As pointed out by some replies, I guess this would be fairly easy for a compiler to optimize into the same thing. But what I find interesting is why they would differ if the compiler had a completely general approach.

+3  A: 

An optimizing compiler will produce the same assembly code for the two, at least when n is a plain (non-volatile) integer.
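As a minimal sketch of the integer case (the function names here are made up for illustration), both forms always yield zero for an int, which is what allows an optimizer to treat them as the same operation. Whether a given compiler really emits identical instructions is something you would have to confirm by inspecting its assembly output.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: zero n by plain assignment. */
int zero_by_assignment(int n)
{
    n = 0;
    return n;
}

/* Hypothetical helper: zero n by subtracting it from itself.
   n - n cannot overflow, so this is well defined for every int. */
int zero_by_subtraction(int n)
{
    n = n - n;
    return n;
}

/* Self-check: both forms agree for a few sample values,
   including the extremes of the int range. */
void check_zeroing(void)
{
    const int samples[] = {0, 1, -1, 12345, INT_MAX, INT_MIN};
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
        assert(zero_by_assignment(samples[i]) == 0);
        assert(zero_by_subtraction(samples[i]) == 0);
    }
}
```

Compiling this with and without optimization and comparing the generated assembly for the two helpers is one way to see the effect on a particular compiler.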

Eli Bendersky
+2  A: 

It may depend on whether n is declared as volatile or not.
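A rough sketch of why volatile matters, assuming a hypothetical variable status_reg standing in for a hardware register or a value shared with another thread: every access to a volatile object has to be performed, so the subtraction form costs two reads and a write, and the result is only zero if both reads see the same value. The two reads are written out through temporaries here so that their order is explicit.

```c
#include <assert.h>

/* Hypothetical stand-in for a memory-mapped register or shared flag. */
volatile int status_reg = 5;

/* One volatile write, no reads. */
void clear_by_assignment(void)
{
    status_reg = 0;
}

/* Spelled-out form of "status_reg = status_reg - status_reg":
   two volatile reads followed by one volatile write. If the value
   changes between the reads, the stored result is not zero. */
void clear_by_subtraction(void)
{
    int first = status_reg;   /* first volatile read  */
    int second = status_reg;  /* second volatile read */
    status_reg = first - second;
}

/* Self-check for the single-threaded case, where nothing
   modifies status_reg between the two reads. */
void check_clearing(void)
{
    status_reg = 7;
    clear_by_assignment();
    assert(status_reg == 0);

    status_reg = 7;
    clear_by_subtraction();
    assert(status_reg == 0);
}
```

For such a variable the compiler cannot simply replace the subtraction with a store of 0, because that would drop the two reads.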

mouviciel
True, but I can't think of a real-life case where one will make n volatile and then do n = n - n
Eli Bendersky
Sure, but I can't think of a real-life case where one will do n=n-n in the first place.
mouviciel
Thanks for the reply, but using "volatile" is also very "real-life" to me at least. This is just a theoretical/hypothetical question for educational purposes.
sharkin
+7  A: 

Compiler VC++ 6.0, without optimisations:

4:        n = 0;
0040102F   mov         dword ptr [ebp-4],0
5:
6:        n = n - n;
00401036   mov         eax,dword ptr [ebp-4]
00401039   sub         eax,dword ptr [ebp-4]
0040103C   mov         dword ptr [ebp-4],eax
anon
+9  A: 

When writing assembler code, you would often use:

xor eax, eax

instead of

mov eax, 0

That is because the first statement consists only of the opcode and its register operands, with no immediate argument to encode. Your CPU will do that in 1 cycle (instead of 2). I think your case is something similar (although using sub).

tanascius
Yes, you could say sub eax,eax. The only difference is the flags that get set by the operation.
anon
You can't really be that sure about *cycles*. The reason is not really cycles, directly. xor eax,eax produces a shorter (3 bytes: 6631C0) instruction than mov eax,0 (6 bytes: 66B800000000) on x86 architecture. sub eax,eax also produces a 3-byte instruction. While for current processors there's not much difference between a sub and an xor, xor requires a much simpler circuit and has the potential to be faster.
Mehrdad Afshari
Absolutely correct; this is all about implicit mnemonic parameters and thus reduced instruction size.
none
+1  A: 

Not sure about assembly and such, but generally,

n=0
n=n-n

aren't always equivalent if n is floating point; see http://www.codinghorror.com/blog/archives/001266.html

Sujoy
If n is an infinity, or a NaN - yes.
Jonathan Leffler
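A small sketch of the floating-point case, assuming IEEE 754 doubles and a compiler that is not running in a relaxed mode such as fast-math (subtract_from_self is a made-up name): for finite values n - n is zero, but for an infinity or a NaN the result is NaN.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: the value that "n = n - n" would store into n. */
double subtract_from_self(double n)
{
    return n - n;
}

/* Self-check, assuming IEEE 754 arithmetic. */
void check_subtract_from_self(void)
{
    assert(subtract_from_self(1.5) == 0.0);        /* finite: zero    */
    assert(isnan(subtract_from_self(INFINITY)));   /* inf - inf = NaN */
    assert(isnan(subtract_from_self(-INFINITY)));  /* same for -inf   */
    assert(isnan(subtract_from_self(NAN)));        /* NaN propagates  */
}
```

This is why a compiler following strict floating-point rules cannot rewrite n - n as 0 for a double the way it can for an int.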
+3  A: 

In the early days, memory and CPU cycles were scarce. That led to a lot of so-called "peep-hole optimizations". Let's look at the code:

move.l #0, d0

moveq.l #0, d0

sub.l a0,a0

The first instruction would need two bytes for the op-code and then four bytes for the value (0). That meant four bytes wasted plus you'd need to access the memory twice (once for the opcode and once for the data). Sloooow.

moveq.l was better since it would merge the data into the op-code, but it only allowed you to write small values (an 8-bit signed immediate, -128 to 127) into a register. And you were limited to data registers only; there was no quick way to clear an address register. You'd have to clear a data register and then load the data register into an address register (two op-codes. Bad.).

Which led to the last operation, which works on any register and needs only two bytes and a single memory read. Translated into C, you'd get

n = n - n;

which would work for the most commonly used types of n (integer or pointer).

Aaron Digulla
Are you saying that the n = n-n variant actually is/was more efficient than n = 0?
sharkin
That will usually be the case if the number is already in a register
stephan
Amazing. This is exactly the kind of answer I hoped to get.
sharkin
@R.A.: Yes, n-n is more efficient on M68000 CPUs for address registers. Moveq.l is faster for data registers since the m68k had only a 16-bit ALU, but sub.l is more general. Both need 16 bits of memory. Funnily, clr.l (set register to 0) is slower than moveq.l ;)
Aaron Digulla
+2  A: 

The assembly-language technique of zeroing a register by subtracting it from itself or XORing it with itself is an interesting one, but it doesn't really translate to C.

Any optimising C compiler will use this technique if it makes sense, and trying to write it out explicitly is unlikely to achieve anything.

Artelius