views:

1798

answers:

3

xor eax, eax will always set eax to zero, right? So why does MSVC++ sometimes put it in my executable's code? Is it more efficient than mov eax, 0?

012B1002  in          al,dx 
012B1003  push        ecx  
    int i = 5;
012B1004  mov         dword ptr [i],5 
    return 0;
012B100B  xor         eax,eax

Also, what does it mean to do in al, dx?

+5  A: 

xor eax, eax is a faster way of setting eax to zero. This is happening because you're returning zero.

The in instruction deals with I/O ports: in al, dx reads a byte of data from the port specified in dx and stores it in al. It's not clear why it appears here. Here's a reference that seems to explain it in detail.

jeffamaphone
"The in instruction is doing stuff with I/O ports". But in this case, it is probably an "artifact" caused by the debugger starting disassembly in the middle of an instruction.
Stephen C
I agree. But still, that's what it does.
jeffamaphone
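To make the "disassembly started in the middle of an instruction" theory concrete, here is a rough C++ sketch. It assumes (this is not shown in the question) that the function begins at 012B1000 with the usual MSVC prologue bytes 55 8B EC 51. The decoder `decodeFrom` is a toy written for this illustration only and knows just these few opcodes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumed bytes of a typical MSVC x86 prologue (an assumption, not from the post):
//   55      push ebp
//   8B EC   mov  ebp, esp
//   51      push ecx
const unsigned char kPrologue[] = {0x55, 0x8B, 0xEC, 0x51};

// Toy decoder (hypothetical): handles only the opcodes needed here.
std::vector<std::string> decodeFrom(std::size_t start) {
    std::vector<std::string> out;
    const std::size_t n = sizeof(kPrologue);
    std::size_t i = start;
    while (i < n) {
        const unsigned char op = kPrologue[i];
        if (op == 0x55) {
            out.push_back("push ebp");
            i += 1;
        } else if (op == 0x51) {
            out.push_back("push ecx");
            i += 1;
        } else if (op == 0xEC) {
            // A lone EC byte is the one-byte opcode for "in al, dx".
            out.push_back("in al, dx");
            i += 1;
        } else if (op == 0x8B && i + 1 < n && kPrologue[i + 1] == 0xEC) {
            // 8B EC is "mov ebp, esp"; EC here is the ModRM byte.
            out.push_back("mov ebp, esp");
            i += 2;
        } else {
            out.push_back("(unknown)");
            i += 1;
        }
    }
    return out;
}
```

Under that assumption, `decodeFrom(0)` gives push ebp / mov ebp, esp / push ecx, while `decodeFrom(2)` starts on the second byte of the mov and gives in al, dx / push ecx, which is what the listing in the question shows at 012B1002 and 012B1003.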
+26  A: 

Yes, it is more efficient.

The encoding is shorter than that of mov eax,0 (2 bytes instead of 5), and the processor recognizes the special case and treats it as a mov eax,0 without a false read dependency on eax, so the execution time is the same.
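For the size difference, a small C++ sketch with the raw instruction bytes may help. The array names and the `bytesSaved` helper are made up for this illustration; the byte values are the standard 32-bit encodings.

```cpp
#include <cstddef>

// Machine-code bytes for two ways of zeroing eax in 32-bit x86 code.
// 33 C0 encodes "xor eax, eax" (31 C0 is an equally valid encoding).
constexpr unsigned char kXorEaxEax[] = {0x33, 0xC0};
// B8 is "mov eax, imm32"; the four bytes after it are the immediate 0.
constexpr unsigned char kMovEaxZero[] = {0xB8, 0x00, 0x00, 0x00, 0x00};

// Hypothetical helper: bytes saved per use by choosing the xor form.
constexpr std::size_t bytesSaved() {
    return sizeof(kMovEaxZero) - sizeof(kXorEaxEax);
}

// Checked at compile time, so the sizes are verified just by building.
static_assert(sizeof(kXorEaxEax) == 2, "xor eax, eax is 2 bytes");
static_assert(sizeof(kMovEaxZero) == 5, "mov eax, 0 is 5 bytes");
static_assert(bytesSaved() == 3, "the xor form saves 3 bytes");
```

This only illustrates code size; it says nothing about how a particular CPU handles the idiom internally.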

drhirsch
Bingo. This is one of a handful of idioms that tell the processor to bypass the usual dependency logic.
Stephen Canon
"processor recognizes the special case and treats it as a mov eax,0 without a false read dependency on eax, so the execution time is the same" The processor actually does even better: it just executes a register rename internally, and doesn't even do anything at all with `eax`.
kquinn
Actually, in the big picture it's faster. There are fewer bytes that have to be fetched from RAM.
Loren Pechtel
+1  A: 

It is also used to avoid zero bytes in shellcode written to exploit buffer overflows and the like. Why avoid zero bytes? In C/C++, 0 marks the end of a string, so the shellcode would be truncated if the means of exploitation is a string-processing function or similar.
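A harmless way to see the truncation effect is to treat the two encodings as C strings and count how many bytes a strcpy-style copy would carry over. This is only a sketch: the array names and `bytesCopiedByStrcpy` are invented for the example, and nothing here is executed as code.

```cpp
#include <cstddef>
#include <cstring>

// "mov eax, 0" encodes as B8 00 00 00 00; "xor eax, eax" as 33 C0.
// Each array ends with an explicit 0 so it can be handled as a C string.
const char kMovPayload[] = {'\xB8', 0x00, 0x00, 0x00, 0x00, 0x00};
const char kXorPayload[] = {'\x33', '\xC0', 0x00};

// Hypothetical helper: number of payload bytes a strcpy-style copy
// transfers before it stops at the first zero byte.
std::size_t bytesCopiedByStrcpy(const char* payload) {
    char buffer[16] = {};          // large enough for both payloads
    std::strcpy(buffer, payload);  // copying stops at the first 0 byte
    return std::strlen(buffer);
}
```

With these inputs, `bytesCopiedByStrcpy(kMovPayload)` is 1 (only the B8 opcode byte of the 5-byte instruction survives), while `bytesCopiedByStrcpy(kXorPayload)` is 2 (the whole instruction survives).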

By the way, I'm referring to the original question ("Any reason to do a “xor eax, eax”?"), not to what the MSVC++ compiler does.

kripto_ash
This sounds like nonsense to me. There are bound to be zero bytes somewhere in your code, so I don't see how one more would make much difference. Anyway, who cares if you can trick a program into reading code as data. The real problem is executing data as code.
Stephen C
kripto_ash
This brings back memories from the TRS-80. Some of us would embed assembly routines inside BASIC strings. There were a few characters that absolutely could not appear in the source code without breaking it and so any such routine had to be carefully optimized to avoid using those characters.
Loren Pechtel