views:

67

answers:

3

Will a compiler optimize tihs:

bool someCondition = someVeryTimeConsumingTask(/* ... */);

for (int i=0; i<HUGE_INNER_LOOP; ++i)
{
    if (someCondition)
        doCondition(i);
    else
        bacon(i);
}

into:

bool someCondition = someVeryTimeConsumingTask(/* ... */);

if (someCondition)
    for (int i=0; i<HUGE_INNER_LOOP; ++i)
        doCondition(i);
else
    for (int i=0; i<HUGE_INNER_LOOP; ++i)
        bacon(i);

someCondition is trivially constant within the for loop.

This may seem obvious and that I should do this myself, but if you have more than one condition then you are dealing with permuatations of for loops, so the code would get quite a bit longer. I am deciding on whether to do it (I am already optimizing) or whether it will be a waste of my time.

A: 

Have you profiled your app to find out where the slowdowns are? If not, why are you even thinking about optimization? Until you know which methods need to be optimized, you're wasting your time worrying about micro-optimizations like this.

Is this the location of the slowdown? If so, then what you're doing may be useful. Yes, the compiler may optimize this, but there's no guarantee that it does. If this isn't the location of the slowdown, then look elsewhere; the cost of one additional branch every time through the loop is probably trivial relative to all of the other work you're doing.

JSBangs
Yeah, I have profiled it. I am in the optimization stage, looking for any potential improvement.I need it to be faster: either I can be stupid and implement a hackish macro way around this, or I can be smart and see if its already being done for me: I'm not micro-optimizing.
+2  A: 

It does not seem to do so with either -O2 or -O3 optimisations. This is something you can (and should, if you are concerned with optimisation) test for yourself - compile with the optimisation you are interested in and examine the emitted assembly language.

anon
I don't know x86 assembly, and I don't know if it will do it in my specific case. I'm on -O2, so I think you're right and it won't make the optimization.
+1  A: 

It's possible that the compiler might write the code as you did, but I've never seen such optimization.

However there is something called branch prediction in modern CPU. In essence it means that when the processor is asked to execute a conditional jump, it'll start to execute what is judged to be the likeliest branch before evaluating the condition. This is done to keep the pipeline full of instructions.

In case the processor fails (and takes the bad branch) it cause a flush of the pipeline: it's called a misprediction.

A very common trait of this feature is that if the same test produce the same result several times in a row, then it'll be considered to produce the same result by the branch prediction algorithm... which is of course tailored for loops :)

It makes me smile because you are worrying about the if within the for body while the for itself causes a branch prediction >> the condition must be evaluated at each iteration to check whether or not to continue ;)

So, don't worry about it, it costs less than a cache miss.

Now, if you really are worried about this, there is always the functor approach.

typedef void (*functor_t)(int);

functor_t func = 0;
if (someCondition) func = &doCondition;
else func = &bacon;

for (int i=0; i<HUGE_INNER_LOOP; ++i) (*func)(i);

which sure looks much better, doesn't it ? The obvious drawback is the necessity for compatible signatures, but you can write wrappers around the functions for that. As long as you don't need to break/return, you'll be fine with this. Otherwise you would need a if in the loop body :D

Matthieu M.
A single cache miss might be important: this is realtime graphics code and an extreme inner loop (an inner loop of an inner loop thats part of the large frame loop), so it has to be fast.The functor approach is actually a solution vs using macros in order to do this, so I am grateful. I am optimizing code that already makes this optimization with macros, so knowing if its nessesary to make the code clearer is my primary purpose.Thanks! If it's too slow I'll use functors!
But won't the functor approach introduce function-call overhead (whereas the original approach can inline the functions if appropriate)
Jeremy Friesner
Thats true, but when you have many permutations of conditions, its alot faster to do one function call than to test all those conditions (at least in terms of number of things to do, I make no claim to what actually happens). For example, if instead of someCondition I had someSubCondition that depended on someCondition, then 8 conditions if someCondition was false, etc. etc.