Question

What is the (non-trivial) difference between the following two x86 instructions?

39 /r    CMP r/m32,r32   Compare r32 with r/m32
3B /r    CMP r32,r/m32   Compare r/m32 with r32

Background

I'm building a Java assembler, which will be used by my compiler's intermediate language to produce Windows-32 executables.

Currently I have the following code:

final ModelBase mb = new ModelBase(); // create new memory model
mb.addCode(new Compare(Register.ECX, Register.EAX)); // add code
mb.addCode(new Compare(Register.EAX, Register.ECX)); // add code

final FileOutputStream fos = new FileOutputStream(new File("test.exe"));
mb.writeToFile(fos);
fos.close();

This outputs a valid executable file that contains two CMP instructions in a TEXT section. The executable written to "test.exe" does nothing interesting, but that's not the point. The class Compare is a wrapper around the CMP instruction.

The above code produces (inspecting with OllyDbg):

Address   Hex dump                 Command
0040101F  |.  3BC8                 CMP ECX,EAX
00401021  |.  3BC1                 CMP EAX,ECX

The difference is subtle. If I use opcode byte 39 instead, the same two instructions are encoded with the ModR/M bytes swapped:

Address   Hex dump                 Command
0040101F  |.  39C1                 CMP ECX,EAX
00401021  |.  39C8                 CMP EAX,ECX

This makes me wonder whether the two opcodes are synonymous, and why both exist at all.

+2  A: 

CMP ECX,EAX computes ECX-EAX and CMP EAX,ECX computes EAX-ECX; the result is discarded and only the flags are kept. The flags are set differently depending on which operand is subtracted from which. Of course, you could probably get away with only one of the two opcodes if it weren't for the ModR/M structure of x86 instructions.
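To make the operand-order point concrete, here is a minimal Java sketch that models three of the flags CMP sets. The class `CmpFlags` and its method are hypothetical names made up for illustration; they are not part of the asker's assembler, and the sketch ignores the other flags (OF, AF, PF).

```java
import java.util.Arrays;

// Hypothetical helper for illustration only: models ZF, CF and SF after CMP a, b.
final class CmpFlags {
    /** Returns {ZF, CF, SF} for CMP a, b, i.e. for the 32-bit subtraction a - b. */
    static boolean[] cmp(int a, int b) {
        int result = a - b;                              // wraps modulo 2^32, like the CPU
        boolean zf = result == 0;                        // zero flag: operands are equal
        boolean cf = Integer.compareUnsigned(a, b) < 0;  // carry flag: unsigned borrow
        boolean sf = result < 0;                         // sign flag: top bit of the result
        return new boolean[] { zf, cf, sf };
    }

    public static void main(String[] args) {
        int ecx = 5, eax = 7;
        System.out.println(Arrays.toString(cmp(ecx, eax))); // CMP ECX,EAX
        System.out.println(Arrays.toString(cmp(eax, ecx))); // CMP EAX,ECX
    }
}
```

With ECX=5 and EAX=7, the first call reports a borrow and a negative result while the second reports neither, so a following conditional jump would behave differently.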

Jens Björnhager
The point is that you can encode the same mnemonic in two different ways because there's a different opcode for `cmp r/m, r` and `cmp r, r/m`. The question is whether the ModR/M operand that can be a memory operand is src1 or src2, and that depends on the opcode.
Nathan Fellman
+10  A: 

It doesn't matter which opcode you use if you compare two registers. The only difference is when comparing a register with a memory operand, as the opcode used determines which will be subtracted from which.
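As a rough illustration of the memory-operand case, the following Java sketch decodes a two-byte CMP. `CmpDecoder` is a hypothetical name, and the sketch only covers register-direct operands (mod=11) and plain `[reg]` addressing (mod=00 without SIB or disp32); everything else is rejected.

```java
// Hypothetical decoder sketch: handles only opcodes 39 and 3B with simple operands.
final class CmpDecoder {
    private static final String[] REG32 =
            { "EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI" };

    static String decode(int opcode, int modrm) {
        int mod = (modrm >> 6) & 3;   // addressing mode
        int reg = (modrm >> 3) & 7;   // always a register
        int rm  = modrm & 7;          // register or memory, depending on mod

        String rmText;
        if (mod == 3) {
            rmText = REG32[rm];                 // register-direct
        } else if (mod == 0 && rm != 4 && rm != 5) {
            rmText = "[" + REG32[rm] + "]";     // simple register-indirect memory operand
        } else {
            throw new IllegalArgumentException("addressing form not covered by this sketch");
        }

        switch (opcode) {
            case 0x39: return "CMP " + rmText + "," + REG32[reg];  // CMP r/m32, r32
            case 0x3B: return "CMP " + REG32[reg] + "," + rmText;  // CMP r32, r/m32
            default: throw new IllegalArgumentException("not a CMP opcode handled here");
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(0x39, 0x08)); // memory operand comes first
        System.out.println(decode(0x3B, 0x08)); // memory operand comes second
    }
}
```

The same ModR/M byte 08 yields `CMP [EAX],ECX` under opcode 39 and `CMP ECX,[EAX]` under opcode 3B, which is the case where the choice of opcode is not redundant.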

As for why this exists: The x86 instruction format uses the ModR/M byte to denote either a memory address or a register. Each instruction can have only one ModR/M byte, which means it can access only one memory address (not including special instructions like MOVSB). This means there can't be a general cmp r/m32, r/m32 instruction, so we need two different opcodes: cmp r/m32, r32 and cmp r32, r/m32. As a side effect, this creates some redundancy when comparing two registers.
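The register-to-register redundancy can be sketched in a few lines of Java. `CmpEncoder` and its methods are hypothetical names, not the asker's `Compare` class; the sketch assumes both operands are 32-bit registers, so mod is always 11.

```java
// Hypothetical encoder sketch: two encodings of the same register-register CMP.
final class CmpEncoder {
    static final int EAX = 0, ECX = 1;   // x86 register numbers

    /** Packs the three ModR/M fields: mod (2 bits), reg (3 bits), r/m (3 bits). */
    static int modrm(int mod, int reg, int rm) {
        return (mod << 6) | (reg << 3) | rm;
    }

    /** CMP r/m32, r32 (opcode 39): first operand goes in r/m, second in reg. */
    static byte[] viaOpcode39(int first, int second) {
        return new byte[] { 0x39, (byte) modrm(3, second, first) };
    }

    /** CMP r32, r/m32 (opcode 3B): first operand goes in reg, second in r/m. */
    static byte[] viaOpcode3B(int first, int second) {
        return new byte[] { 0x3B, (byte) modrm(3, first, second) };
    }

    public static void main(String[] args) {
        byte[] a = viaOpcode39(ECX, EAX);   // CMP ECX,EAX
        byte[] b = viaOpcode3B(ECX, EAX);   // CMP ECX,EAX again, different bytes
        System.out.printf("%02X%02X%n", a[0], a[1]);
        System.out.printf("%02X%02X%n", b[0], b[1]);
    }
}
```

For CMP ECX,EAX this gives 39C1 and 3BC8, matching the two OllyDbg listings in the question.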

interjay
These 1-bit degrees of freedom also provide a covert channel for compilers to "phone home" - they can "watermark" the binaries they produce, and the compiler vendor can ask you to please explain if they find your software with their watermark, but with no license on file.
Bernd Jendrissek
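
The one-bit freedom mentioned in the comment above could, in principle, be used like this. This is only a toy sketch: `OpcodeWatermark` is a hypothetical name, the bit-to-opcode mapping is an arbitrary assumption, and it does not describe what any real compiler does.

```java
// Toy sketch: hide one bit per register-register CMP in the choice of opcode.
final class OpcodeWatermark {
    /**
     * Encodes CMP first, second (both 32-bit register numbers).
     * Assumed mapping: hidden bit 0 selects opcode 39, hidden bit 1 selects opcode 3B.
     */
    static byte[] encodeCmp(int first, int second, boolean bit) {
        if (bit) {
            // CMP r32, r/m32: first operand in reg, second in r/m
            return new byte[] { 0x3B, (byte) (0xC0 | (first << 3) | second) };
        }
        // CMP r/m32, r32: first operand in r/m, second in reg
        return new byte[] { 0x39, (byte) (0xC0 | (second << 3) | first) };
    }

    /** Recovers the hidden bit from an instruction produced by encodeCmp. */
    static boolean extractBit(byte[] insn) {
        return (insn[0] & 0xFF) == 0x3B;
    }

    public static void main(String[] args) {
        byte[] zero = encodeCmp(1, 0, false);  // CMP ECX,EAX carrying bit 0
        byte[] one  = encodeCmp(1, 0, true);   // CMP ECX,EAX carrying bit 1
        System.out.println(extractBit(zero) + " " + extractBit(one));
    }
}
```

Both encodings execute identically, so the hidden bit is invisible to the running program and only shows up when the bytes are inspected.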