views: 663
answers: 12

Hello,

I've seen comments on SO saying "<> is faster than =" or "!= faster than ==" in an if() statement.

I'd like to know why that is so. Could you show an example in asm?

Thanks! :)

EDIT:

Source

Here is what he did.

  function Check(var MemoryData: Array of byte; MemorySignature: Array of byte; Position: integer): boolean;
  var
    i: byte;
  begin
    Result := True; //moved at top. Your function always returned 'True'. This is what you wanted?
    for i := 0 to Length(MemorySignature) - 1 do //are you sure??? Perhaps you want High(MemorySignature) here...
    begin
{!}   if MemorySignature[i] <> $FF then //speedup - '<>' evaluates faster than '='
      begin
        Result := memorydata[i + position] <> MemorySignature[i]; //speedup.
        if not Result then
          Break; //added this! - speedup. We already know the result. So, no need to scan till end.
      end;
    end;
  end;
+18  A: 

I'd claim that this is flat out wrong except perhaps in very special circumstances. Compilers can refactor one into the other effortlessly (by just switching the if and else cases).
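
As an illustration (placeholder names, not from the original answer), the two fragments below express the same logic; a compiler can turn one into the other just by swapping the branches:

  if A = B then
    HandleEqual        // "equal" case in the then-branch
  else
    HandleDifferent;   // "different" case in the else-branch

  if A <> B then
    HandleDifferent    // same logic with the branches swapped
  else
    HandleEqual;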

Konrad Rudolph
+1  A: 

If you can provide a small example that clearly shows a difference, then I'm sure the Stack Overflow community could explain why. However, I think you might have difficulty constructing a clear example. I don't think there will be any performance difference noticeable at any reasonable scale.

Greg Hewgill
I'm sorry, you can see it now.
John
That's not an example that I can compile and run (even if I had Delphi).
Greg Hewgill
@Greg, it's one line only. if(0xFF != 255) return; or If(256 = $FE) return;
John
Yes, I see that, but I don't see proof that there is a difference at all.
Greg Hewgill
+1  A: 

I strongly doubt there is any speed difference. For integral types, for example, you are getting a CMP instruction and either JZ (jump if zero) or JNZ (jump if not zero), depending on whether you used = or ≠. There is no speed difference here, and I'd expect that to hold true at higher levels too.
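
A minimal sketch of that point (placeholder names; the exact instructions depend on the compiler): both tests compile to the same CMP, and only the conditional jump differs.

  if a = b then      // typically: cmp a,b / jnz over the next line
    Inc(Equal);
  if a <> b then     // typically: cmp a,b / jz over the next line
    Inc(NotEqual);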

Joey
+4  A: 

Spontaneous thought: most other things in your code will affect performance more than the choice between == and != (or = and <>, depending on language).

When I ran a test in C# over 1000000 iterations of comparing strings (containing the alphabet, a-z, with the last two letters reversed in one of them), the difference was between 0 and 1 milliseconds.

It has been said before: write code for readability; change it into more performant code once it has been established that it will make a difference.

Edit: I repeated the same test with byte arrays; same thing, the performance difference is negligible.
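
The original C# code isn't shown; as a rough Delphi sketch of the same kind of micro-benchmark (GetTickCount timing, strings that differ only in the last two letters), it might look like this:

  program CompareBench;
  {$APPTYPE CONSOLE}
  uses
    Windows;
  var
    a, b: string;
    i, hits: Integer;
    t0: Cardinal;
  begin
    a := 'abcdefghijklmnopqrstuvwxyz';
    b := 'abcdefghijklmnopqrstuvwxzy';  // last two letters swapped
    hits := 0;

    t0 := GetTickCount;
    for i := 1 to 1000000 do
      if a = b then Inc(hits);          // test with =
    Writeln('=  : ', GetTickCount - t0, ' ms');

    t0 := GetTickCount;
    for i := 1 to 1000000 do
      if a <> b then Inc(hits);         // test with <>
    Writeln('<> : ', GetTickCount - t0, ' ms');

    Writeln(hits);  // use the counter so the loops cannot be optimized away
  end.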

Fredrik Mörk
+2  A: 

I'd claim this was flat out wrong, full stop. The test for equality is always the same as the test for inequality. With string (or complex structure) testing, you're always going to break at exactly the same point. Until that break point is reached, the answer for equality is unknown.
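
As an illustration (a hypothetical hand-rolled comparison, not from the answer): the scan breaks at the first mismatch regardless of which operator the caller ultimately wants, so only the final Boolean differs.

  function SameString(const A, B: string): Boolean;
  var
    i: Integer;
  begin
    Result := Length(A) = Length(B);
    if Result then
      for i := 1 to Length(A) do
        if A[i] <> B[i] then
        begin
          Result := False;
          Break;  // equality and inequality both stop scanning here
        end;
  end;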

seanyboy
+5  A: 

For .Net languages

If you look at the IL from the string.op_Equality and string.op_Inequality methods, you will see that both internally call string.Equals.

But op_Inequality inverts the result, which is two IL statements more.

I would say the performance is the same, with maybe a small (very small, very very small) advantage for the == statement. But I believe the optimizer & JIT compiler will remove this.

GvS
Your last sentence is important: what you see in IL may have no bearing on the code that is eventually executed.
Konrad Rudolph
But it will take 2 IL statements more to JIT (or pre-jit, and optimize), 4 bytes more to load, etc. If you put that all together... ;-)
GvS
+1  A: 

This list (assuming it's on x86) of ASM instructions might help:

(Disclaimer, I have nothing more than very basic experience with writing assembler so I could be off the mark)

However, it obviously depends purely on what assembly instructions the Delphi compiler is producing; without seeing that output, it's guesswork. I'm keeping my Donald Knuth quote in, because caring about this kind of thing for all but a niche set of applications (games, mobile devices, high-performance server apps, safety-critical software, missile launchers, etc.) is the thing you worry about last, in my view.

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."

If you're writing one of those or similar then obviously you do care, but you didn't specify it.

Chris S
What has this got to do with the question? If one operator *were* faster than the other, this would be essential to know because of the implications for the architecture, and how we should write code. Even such a micro-optimization would be worthwhile if it were applied consistently.
Konrad Rudolph
Your answer contradicts your comment
Chris S
+1  A: 

Well, it could be or it couldn't be; that is the question :-) The thing is, this is highly dependent on the programming language you are using. Since all your statements will eventually end up as instructions to the CPU, the one that uses the fewest instructions to achieve the result will be the fastest.

For example, to test whether bit pattern x is equal to bit pattern y, you could use the instruction that XORs the two together; if the result is anything but 0, they are not the same. So how would you know that the result is anything but 0? By using an instruction that tells you whether its input is greater than 0.

So that is already 2 instructions used to do it, but since most CPUs have an instruction that does the compare in a single cycle, it is a bad example.

The point I am making is still the same: you can't make general statements like this without specifying the programming language and the CPU architecture.
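
A minimal sketch of the XOR idea above (the helper name is made up; a real compiler would normally just emit a compare):

  function SameBits(x, y: Cardinal): Boolean;
  begin
    // the two values are equal exactly when their XOR is zero
    Result := (x xor y) = 0;
  end;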

Martin P. Hellwig
+6  A: 

It could have something to do with branch prediction on the CPU. Static branch prediction would predict that a branch simply wouldn't be taken and fetch the next instruction. However, hardly anybody uses that anymore. Other than that, I'd say it's bull because the comparisons should be identical.

Jasper Bekkers
+1  A: 

Just guessing, but given you want to preserve the logic, you cannot just replace

if A = B then

with

if A <> B then

To preserve the logic, the original code must have been something like

if not (A = B) then

or

if A <> B then
else

and that may truly be a little bit slower than the test for inequality.
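
As an illustration (DoSomething is a placeholder): the negated form spells out an extra logical step that the <> form expresses directly, although a compiler may well emit the same compare-and-branch for both.

  if not (A = B) then
    DoSomething;   // compare, negate the outcome, then branch (conceptually)

  if A <> B then
    DoSomething;   // compare and branch on "not equal" directly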

Uwe Raabe
Both are CMP followed by a branch, je or jne. And nearly all CPUs have a zero flag and ways to test it.
Marco van de Voort
It may be optimized by the compiler, but "not (A=B)" seems like a bit more work than "A <> B". While the first is a compare followed by a negation, the second is a compare only. I'm not sure about the "else" version, though.
Uwe Raabe
+2  A: 

It could also be a result of misinterpretation of an experiment.

Most compilers/optimizers assume a branch is taken by default. If you invert the operator and the if-then-else order, and the branch that is now taken is the ELSE clause, that might cause an additional speed effect in computation-heavy code (*).

(*) Obviously you need to be doing a lot of operations for that to matter. But it can matter for the tightest loops in e.g. codecs or image analysis/machine vision, where you have 50 MByte/s of data to trawl through... and even then I only stoop to this level for really heavily reused code. For ordinary business code it is not worth it.
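
As an illustration of that kind of tight loop (hypothetical names): swapping the operator also swaps which clause the common case lands in, and it is that arrangement, not = versus <> as such, that can show up in the timing.

  // version 1: the common case sits in the ELSE clause
  for i := 0 to High(Samples) do
    if Samples[i] = Marker then
      HandleMarker(i)
    else
      Process(Samples[i]);     // taken for almost every sample

  // version 2: same logic with operator and branches inverted
  for i := 0 to High(Samples) do
    if Samples[i] <> Marker then
      Process(Samples[i])      // the common case is now the THEN clause
    else
      HandleMarker(i);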

Marco van de Voort
+5  A: 

I think there's some confusion in your previous question about what the algorithm was that you were trying to implement, and therefore in what the claimed "speedup" purports to do.

Here's some disassembly from Delphi 2007, optimization on. (Note: optimization off changed the code a little, but not in a relevant way.)

Unit70.pas.31: for I := 0 to 100 do
004552B5 33C0             xor eax,eax
Unit70.pas.33: if i = j then
004552B7 3B02             cmp eax,[edx]
004552B9 7506             jnz $004552c1
Unit70.pas.34: k := k+1;
004552BB FF05D0DC4500     inc dword ptr [$0045dcd0]
Unit70.pas.35: if i <> j then
004552C1 3B02             cmp eax,[edx]
004552C3 7406             jz $004552cb
Unit70.pas.36: l := l + 1;
004552C5 FF05D4DC4500     inc dword ptr [$0045dcd4]
Unit70.pas.37: end;
004552CB 40               inc eax
Unit70.pas.31: for I := 0 to 100 do
004552CC 83F865           cmp eax,$65
004552CF 75E6             jnz $004552b7
Unit70.pas.38: end;
004552D1 C3               ret

As you can see, the only difference between the two cases is a jz vs. a jnz instruction. These WILL run at the same speed. What's likely to affect things much more is how often the branch is taken, and whether the entire loop fits into cache.

Roddy