ansaurus

Question

Is "IF" expensive?

Answer 1

A:

It sounds strange to me (I haven't taken CompSci yet - next year hopefully!) that an if should be expensive - although I was debating which was more expensive earlier:

x = 0;
if (y) {
    x = 1;
}

if (y) {
    x = 1;
} else {
    x = 0;
}

Ross 2008-11-24 20:22:09

I would expect compilers to generate the same code for both.

Paul Tomblin 2008-11-24 20:25:02

What about x = y ? 1 : 0; It seems like it would generate the same code as the others as well, but I've seen compilers inline one-liner statements like that whereas the if-else equivalent wouldn't be.

Michel 2008-11-24 20:31:31

It would probably generate the same code in this simple case, but in general, the ? : operator lends itself well to conditional move ops, which are *not* branches (because they don't alter program flow), and so suffer none of the penalties.

jalf 2008-11-24 20:34:33
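
For comparison, the variants discussed in this thread can be written side by side. This is only a minimal sketch: the function names are made up for illustration, and whether a given compiler emits a branch, a conditional move, or a flag-setting instruction for any of them depends on the target and the optimization level.

```c
/* Hypothetical helper names; each returns 1 when y is non-zero, else 0. */

/* Assign a default first, then conditionally overwrite it. */
int set_with_if(int y) {
    int x = 0;
    if (y) {
        x = 1;
    }
    return x;
}

/* Assign in both arms of an if/else. */
int set_with_if_else(int y) {
    int x;
    if (y) {
        x = 1;
    } else {
        x = 0;
    }
    return x;
}

/* The conditional operator. */
int set_with_ternary(int y) {
    return y ? 1 : 0;
}

/* No control flow in the source at all: a comparison yields 0 or 1 in C. */
int set_without_branch(int y) {
    return y != 0;
}
```

All four compute the same value, so an optimizing compiler is free to emit identical machine code for them. Inspecting the assembly (for example with the compiler's -S option) is the only way to know what a particular build actually does.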

Answer 2

+5 A:

Maybe the branching kills the CPU instruction prefetching?

divideandconquer.se 2008-11-24 20:22:51

Upon my... "research" I learned about jump tables and branching for the switch statements but nothing about the if statements. Could you elaborate a little on that?

pek 2008-11-24 20:24:38

IIRC, the CPU usually prefetches instructions along a single probable execution path, but an 'if' statement that branches away from the predicted execution path will invalidate the prefetched instructions, and the prefetching will have to restart.

divideandconquer.se 2008-11-24 20:28:28

Any decent processor should have branch prediction capabilities that will try to guess whether a branch will be taken or not, and prefetch instructions based on the prediction (which is generally quite good). GCC even has C extensions that allow a programmer to provide hints for branch predictors.

mipadi 2008-11-24 20:30:54
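
The GCC extension mentioned above is most likely __builtin_expect, which GCC and Clang both provide. The sketch below shows one common way to wrap it. The LIKELY/UNLIKELY macro names and the sum_non_negative function are assumptions made up for this example, and the hint mainly guides how the compiler lays out the code rather than controlling the hardware predictor directly.

```c
#include <stddef.h>

/* LIKELY/UNLIKELY are hypothetical macro names chosen for this sketch.
 * __builtin_expect(expr, expected) returns expr and tells the compiler
 * which value is expected. Other compilers get a plain fallback. */
#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(cond)   __builtin_expect(!!(cond), 1)
#define UNLIKELY(cond) __builtin_expect(!!(cond), 0)
#else
#define LIKELY(cond)   (cond)
#define UNLIKELY(cond) (cond)
#endif

/* Sums an array, treating a negative entry as a rare error case.
 * Returns -1 as soon as a negative entry is found. */
long sum_non_negative(const int *values, size_t count) {
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        if (UNLIKELY(values[i] < 0)) {
            return -1; /* cold path: expected to be rare */
        }
        total += values[i];
    }
    return total;
}
```

The hint never changes the result, only (possibly) the generated code, so it is worth adding only where profiling shows a hot and heavily skewed branch.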

Moreover, the CPU usually looks ahead to start executing upcoming instructions early (not just prefetch them), and the compiler tries to reorder instructions, and that becomes dangerous across branches, so you can really kill instruction scheduling with too many branches. Which hurts performance.

jalf 2008-11-24 20:32:27

Answer 3

+9 A:

"Expensive" is a very relative term, especially in relation to an "if" statement, since you also have to take into account the cost of the condition. That could range anywhere from a few short cpu instructions to testing the result of a function that calls out to a remote database.

I wouldn't worry about it. Unless you're doing embedded programming you probably shouldn't be concerned about the cost of "if" at all. For most programmers it's just not going to ever be the driving factor in your app's performance.

Joel Coehoorn 2008-11-24 20:25:11

Definitely relative... cmp/cond jmp is still faster than a mul on many processors.

Brian Knoblauch 2008-11-24 20:26:24

Yes, I agree that I shouldn't be concerned about it. I'm not trying to optimize anything here. I'm just trying to find out and learn. ;)

pek 2008-11-24 20:28:18

Answer 4

A:

I've never heard this, though I'm not a CompSci geek. I just like to program.

Though I have to imagine that this absolutely depends on which language you are using and which compiler you are using. I don't think this could be answered in a "general sense".

Best Regards,
Frank

Frank V 2008-11-24 20:25:57

Answer 5

+3 A:

The only thing I can imagine this might be referring to is the fact that an if statement generally can result in a branch. Depending on the specifics of the processor architecture, branches can cause pipeline stalls or other less than optimal situations.

However, this is extremely situation specific - most modern processors have branch prediction capabilities that attempt to minimize the negative effects of branching. Another example would be how the ARM architecture (and probably others) can handle conditional logic - the ARM has instruction level conditional execution, so simple conditional logic results in no branching - the instructions simply execute as NOPs if the conditions are not met.

All that said - get your logic correct before worrying about this stuff. Incorrect code is as unoptimized as you can get.

Michael Burr 2008-11-24 20:26:28

Answer 6

+10 A:

Branches, especially on RISC architecture microprocessors, are some of the most expensive instructions. This is because on many architectures, the compiler predicts which path of execution is most likely to be taken and puts those instructions next in the executable, so they'll already be in the CPU cache when the branch happens. If the branch goes the other way, it has to go back out to main memory and fetch the new instructions -- that's fairly expensive. On many RISC architectures, all instructions are one cycle except for branch (which is often 2 cycles). We're not talking about a major cost here, so don't worry about it. Also, the compiler will optimize better than you do 99% of the time :) One of the really awesome things about the EPIC architecture (Itanium is an example) is that it caches (and begins processing) instructions from both sides of the branch, then discards the set it doesn't need once the outcome of the branch is known. This saves the extra memory access of a typical architecture in the event that it branches along the unpredicted path.

rmeador 2008-11-24 20:27:48

Answer 7

+3 A:

At the lowest possible level, an if consists of the following (after computing all the app-specific prerequisites for that particular if):

some test instruction
jump to some place in the code if test succeeds, proceed forwards otherwise.

Costs associated with that:

a low level comparison -- usually 1 cpu operation, super cheap
potential jump -- which can be expensive

Reasons why jumps are expensive:

you can jump to arbitrary code that lives anywhere in memory; if it turns out that it is not cached by the CPU, we have a problem, because we need to access main memory, which is slower
modern CPUs do branch prediction. They try to guess whether the if will succeed or not and execute code ahead in the pipeline to speed things up. If the prediction fails, all computation done ahead by the pipeline has to be invalidated. That is also an expensive operation

So to sum up:

If can be expensive, if you really, really, really care about performance.
You should care about it if and only if you are writing a real-time raytracer or biological simulation or something similar. There is no reason to care about it in most of the real world.

Marcin 2008-11-24 20:28:50
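
The "test instruction plus conditional jump" shape described in this answer can be imitated in C itself with goto. This is only an illustration under assumptions: the function names are invented, and real compilers pick their own instructions (they commonly jump when the condition is false, so that the if-body is the fall-through path, or avoid the jump altogether).

```c
/* Ordinary structured form. */
int clamp_to_zero(int value) {
    if (value < 0) {
        value = 0;
    }
    return value;
}

/* The same logic spelled as "test, then conditional jump".
 * The jump skips the body when the condition does NOT hold. */
int clamp_to_zero_lowered(int value) {
    if (!(value < 0)) goto skip; /* test + conditional jump over the body */
    value = 0;                   /* the if-body, reached by falling through */
skip:
    return value;
}
```

Both functions return the same result for every input. The second merely makes visible the jump that is the potentially expensive part.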

Answer 8

+2 A:

CPUs are deeply pipelined. Any branch instruction (if/for/while/switch/etc) means that the CPU doesn't really know what instruction to load and run next.

The CPU either stalls while waiting to know what to do, or the CPU takes a guess. In the case of an older CPU, or if the guess is wrong, you'll have to suffer a pipeline stall while it goes and loads the correct instruction. Depending on the CPU this can be as high as 10-20 instructions worth of stall.

Modern CPUs try to avoid this by doing good branch prediction, and by executing multiple paths at the same time, and only keeping the actual one. This helps out a lot, but can only go so far.

Good luck in the class.

Also, if you have to worry about this in real life, you're probably doing OS design, realtime graphics, scientific computing, or something similarly CPU-bound. Profile before worrying.

tfinniga 2008-11-24 20:33:50

Answer 9

+2 A:

Modern processors have long execution pipelines which means that several instructions are executed in various stages at the same time. They may not always know the outcome of one instruction when the next one begins to run. When they run into a conditional jump (if) they sometimes have to wait until the pipeline is empty before they can know which way the instruction pointer should go.

I think of it as a long freight train. It can carry a lot of cargo fast in a straight line, but it corners badly.

Pentium 4 (Prescott) had a famously long pipeline of 31 stages.

More on Wikipedia

Guge 2008-11-24 20:36:14

+1 for the freight train metaphor -- I'll remember that for the next time I need to explain processor pipelines.

Daniel Pryden 2009-08-31 08:03:07

Answer 10

+2 A:

if in itself is not slow. Slowness is always relative; I'd bet my life that you have never felt the "overhead" of an if statement. If you are going to write high-performance code, you might want to avoid branches anyway. What makes if slow is that the processor preloads code from after the if based on some heuristic. It will also stop pipelines from executing code directly after the if branch instruction in the machine code, since the processor doesn't yet know which path will be taken (in a pipelined processor, multiple instructions are interleaved and executed). Code that was already executed may have to be undone (if the other branch was taken; this is called branch misprediction), or no-ops may be filled in at those places so that this doesn't happen.

If if is evil, then switch is evil too, and &&, || too. Don't worry about it.

Johannes Schaub - litb 2008-11-24 20:36:46

Answer 11

+45 A:

At the very lowest level (in the hardware), yes, ifs are expensive. In order to understand why, you have to understand how pipelines work.

The address of the current instruction to be executed is stored in something typically called the instruction pointer (IP) or program counter (PC); these terms are synonymous, but different terms are used with different architectures. For most instructions, the PC of the next instruction is just the current PC plus the length of the current instruction. For most RISC architectures, instructions are all a constant length, so the PC can be incremented by a constant amount. For CISC architectures such as x86, instructions can be variable-length, so the logic that decodes the instruction has to figure out how long the current instruction is to find the location of the next instruction.

For branch instructions, however, the next instruction to be executed is not the next location after the current instruction. Branches are gotos - they tell the processor where the next instruction is. Branches can either be conditional or unconditional, and the target location can be either fixed or computed.

Conditional vs. unconditional is easy to understand - a conditional branch is only taken if a certain condition holds (such as whether one number equals another); if the branch is not taken, control proceeds to the next instruction after the branch like normal. For unconditional branches, the branch is always taken. Conditional branches show up in if statements and the control tests of for and while loops. Unconditional branches show up in infinite loops, function calls, function returns, break and continue statements, the infamous goto statement, and many more (these lists are far from exhaustive).

The branch target is another important issue. Most branches have a fixed branch target - they go to a specific location in code that is fixed at compile time. This includes if statements, loops of all sorts, regular function calls, and many more. Computed branches compute the target of the branch at runtime. This includes switch statements (sometimes), returning from a function, virtual function calls, and function pointer calls.

So what does this all mean for performance? When the processor sees a branch instruction appear in its pipeline, it needs to figure out how to continue to fill up its pipeline. In order to figure out what instructions come after the branch in the program stream, it needs to know two things: (1) if the branch will be taken and (2) the target of the branch. Figuring this out is called branch prediction, and it's a challenging problem. If the processor guesses correctly, the program continues at full speed. If instead the processor guesses incorrectly, it just spent some time computing the wrong thing. It now has to flush its pipeline and reload it with instructions from the correct execution path. Bottom line: a big performance hit.

Thus, the reason why if statements are expensive is due to branch mispredictions. This is only at the lowest level. If you're writing high-level code, you don't need to worry about these details at all. You should only care about this if you're writing extremely performance-critical code in C or assembly. If that is the case, writing branch-free code can often be superior to code that branches, even if several more instructions are needed. There are some cool bit-twiddling tricks you can do to compute things such as abs(), min(), and max() without branching.

Adam Rosenfield 2008-11-24 20:38:53
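
As an illustration of the bit-twiddling tricks mentioned at the end of this answer, here is a minimal sketch of branch-free abs, min, and max for 32-bit integers. The function names are made up, these are well-known mask-based formulations rather than anything specific to this answer, and the code assumes a two's complement platform where int is 32 bits wide. A modern compiler may already produce branch-free code for the plain versions, so measure before using these.

```c
#include <stdint.h>

/* Absolute value as an unsigned result, so INT32_MIN is representable.
 * mask is all ones when x is negative, all zeros otherwise. */
uint32_t branchless_abs(int32_t x) {
    uint32_t mask = (uint32_t)0 - (uint32_t)(x < 0);
    /* For negative x: flip the bits, then add 1 (subtracting all ones). */
    return ((uint32_t)x ^ mask) - mask;
}

/* mask is all ones when x < y, so the XOR swaps y for x in that case. */
int32_t branchless_min(int32_t x, int32_t y) {
    uint32_t ux = (uint32_t)x, uy = (uint32_t)y;
    uint32_t mask = (uint32_t)0 - (uint32_t)(x < y);
    /* Converting back to int32_t assumes two's complement wraparound. */
    return (int32_t)(uy ^ ((ux ^ uy) & mask));
}

/* Same idea, starting from x and swapping in y when x < y. */
int32_t branchless_max(int32_t x, int32_t y) {
    uint32_t ux = (uint32_t)x, uy = (uint32_t)y;
    uint32_t mask = (uint32_t)0 - (uint32_t)(x < y);
    return (int32_t)(ux ^ ((ux ^ uy) & mask));
}
```

The comparisons x < 0 and x < y are typically compiled to flag-setting instructions rather than jumps, which is why no branch is needed, although that too is up to the compiler.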

It's not *just* branch mispredicts. Branches also inhibit instruction reordering, at the compiler level, and also to some extent on the CPU level (for an out-of-order CPU, of course). Nice detailed answer though.

jalf 2008-11-24 20:58:21

Thanks for taking the time to add this to SO.

Martin Beckett 2008-11-24 21:08:30

Thank you. Today I learned something new. ;)

pek 2008-11-24 21:13:17

Answer 12

A:

The most expensive in terms of ALU usage? It uses up CPU registers to store the values to be compared and takes up time to fetch and compare the values each time the if statement is run.

Therefore an optimization of that is to do one comparison and store the result as a variable before the loop is run.

Just trying to interpret your missing words.

2008-11-24 20:42:22
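
The optimization suggested in this answer, evaluating a comparison once before the loop instead of on every iteration, might look like the sketch below. The function names and the scaling factors are invented for the example. Optimizing compilers often perform this transformation on their own when they can prove the condition does not change inside the loop.

```c
#include <stddef.h>

/* Tests the flag on every iteration, even though it never changes. */
void scale_checked_each_time(int *values, size_t count, int use_double) {
    for (size_t i = 0; i < count; i++) {
        if (use_double) {
            values[i] *= 2;
        } else {
            values[i] *= 3;
        }
    }
}

/* Evaluates the condition once and stores the result before the loop. */
void scale_hoisted(int *values, size_t count, int use_double) {
    const int factor = use_double ? 2 : 3;
    for (size_t i = 0; i < count; i++) {
        values[i] *= factor;
    }
}
```

Both functions leave the array in the same state. The second simply has no condition left inside the loop body.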

Answer 13

A:

I had this argument with a friend of mine once. He was using a very naive circle algorithm, but claimed his to be faster than mine (The kind that only calculates 1/8th of the circle) because mine used if. In the end, the if statement was replaced with sqrt and somehow that was faster. Perhaps because the FPU has sqrt built in?

Demur Rumed 2008-11-24 20:47:45

Answer 14

+4 A:

Check out the article Better Performance Through Branch Elimination on Cell Performance. Another fun one is this post about branchless selections on the Real Time Collision Detection Blog.

In addition to the excellent answers already posted in response to this question, I'd like to put in a reminder that although "if" statements are considered expensive low-level operations, trying to utilize branch-free programming techniques in a higher level environment, such as a scripting language or a business logic layer (regardless of language), may be ridiculously inappropriate.

The vast majority of the time, programs should be written for clarity first and optimized for performance second. There are numerous problem domains where performance is paramount, but the simple fact is that most developers are not writing modules for use deep in the core of a rendering engine or a high performance fluid dynamics simulation that runs for weeks on end. When the top priority is for your solution to "just work" the last thing on your mind should be whether or not you can save on the overhead of a conditional statement in your code.

Parappa 2008-11-24 21:01:50

Answer 15

A:

As pointed out by many, conditional branches can be very slow on a modern computer.

That being said, there are a whole lot of conditional branches that don't live in if statements, you can't always tell what the compiler will come up with, and worrying about how long basic statements will take is virtually always the wrong thing to do. (If you can tell what the compiler will generate reliably, you may not have a good optimizing compiler.)

David Thornley 2008-11-24 22:45:28

Answer 16

A:

Also note that an if inside a loop is not necessarily very expensive.

A modern CPU assumes, upon its first visit to an if-statement, that the "if-body" is to be taken (or, put the other way, it also assumes a loop body will be executed multiple times) (*). On the second and further visits, the CPU can look into the Branch History Table and see how the condition went last time (was it true? was it false?). If it was false last time, then speculative execution will proceed to the "else" of the if, or beyond the loop.

(*) The rule is actually "forward branch not taken, backward branch taken". In an if-statement, there is only a [forward] jump (to the point after the if-body) if the condition evaluates to false (remember: the CPU assumes by default that a branch/jump is not taken), but in a loop, there may be a forward branch to the position after the loop (not to be taken), and a backward branch upon repetition (to be taken).

This is also one of the reasons why a call to a virtual function or a function-pointer call is not as bad as many assume (http://phresnel.org/blog/)

phresnel 2009-03-13 14:42:09
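
To make the forward and backward branches in this answer concrete, here is a hedged sketch of a simple loop rewritten with goto, roughly the way compilers often arrange it. The function names are invented, the layout is an assumption about typical code generation, and the static prediction rule is quoted from the answer above rather than guaranteed for any particular CPU.

```c
/* Structured form: sums the integers 0 .. limit-1. */
int sum_to(int limit) {
    int total = 0;
    for (int i = 0; i < limit; i++) {
        total += i;
    }
    return total;
}

/* The same loop with its branches made explicit. */
int sum_to_lowered(int limit) {
    int total = 0;
    int i = 0;
    /* Forward conditional branch: skips the loop when it would not run.
     * Under "forward not taken" the loop body is assumed to execute. */
    if (!(i < limit)) goto loop_exit;
loop_top:
    total += i;
    i++;
    /* Backward conditional branch: repeats the loop.
     * Under "backward taken" another iteration is assumed. */
    if (i < limit) goto loop_top;
loop_exit:
    return total;
}
```

With this layout the static guess is wrong only once per loop, on the final iteration, which is part of why a loop condition is usually cheap.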
