ansaurus

Question

Answer 1

+5 A:

The first one will probably almost always be faster, but if you think this will make a difference, always profile first to see which is faster and by how much.

The compiler will probably be able to inline the call to end() in both cases, although if end() is sufficiently complicated, it may opt not to inline it. However, the key optimization is whether or not the compiler can perform loop-invariant code motion. I would posit that in most cases, the compiler can't be certain that the value of end() won't change during the iteration of the loop, in which case it has no choice but to call end() after each iteration.

Adam Rosenfield 2009-04-28 02:22:41

I agree. It is better to write easy-to-read code first. Then, if there are any performance problems - profile the code, make sure that loop condition is bottleneck, and only then rewrite into faster, but slightly less readable version.

Glorphindale 2009-04-28 02:41:33

Timing both approaches is a good idea. You *don't* know the first will be faster, since each call to end() will almost certainly be inlined into a single pointer comparison. Also, the C++ standard guarantees that calling end() on any container is a constant-time operation, so it can never be "sufficiently complicated".

j_random_hacker 2009-04-28 11:20:15

Answer 2

+4 A:

I would choose the option that is most concise and readable. Don't try to second guess the compiler and the optimisations it might perform. Remember that the vast majority of your code will have absolutely no effect on overall performance, so only if this is in a performance critical section of code should you spend the time to profile it and choose an appropriately efficient source representation.

With specific reference to your example, the first version makes a copy of the end() iterator, invoking whatever code runs for the copy constructor of the iterator object. STL containers generally contain inline end() functions, so the compiler has plenty of opportunity to optimise the second version even if you're not trying to help it out. Which one is best? Measure them.

Greg Hewgill 2009-04-28 02:32:26

Answer 3

A:

In theory the compiler could optimise the second version into the first (assuming the container doesn't change during the loop, obviously).

In practice, I've found several similar cases when profiling time-critical code where my compiler has totally failed to hoist invariant calculations out of loop conditions. So while the slightly more concise version is fine in most cases, I don't rely on the compiler doing sensible things with it for a case where I'm really concerned about performance.

Peter 2009-04-28 02:45:21

I think the issue is not whether the compiler is smart enough to detect that end() is invariant and hoist it out of the loop (which requires a relatively smart compiler) -- it's whether the call to end() can be inlined (which does not require such a smart compiler), since the code inside end() will typically be very short and simple, e.g. a single pointer comparison for std::vector or std::list.

j_random_hacker 2009-04-30 03:19:40

Answer 4

+2 A:

You can make the first version more concise and get the best of both:

for( const_iterator it = list.begin(), ite = list.end();
     it != ite; ++it)

P.S. The iterators aren't const, they're iterators to a const reference. There's a big difference.

Mark Ransom 2009-04-28 02:46:05

Answer 5

+12 A:

The two versions are not the same though. In the second version it compares the iterator against list.end() every time, and what list.end() evaluates to might change over the course of the loop. Now of course, you cannot modify list through the const_iterator it; but nothing prevents code inside the loop from just calling methods on list directly and mutating it, which could (depending on what kind of data structure list is) change the end iterator. It might therefore be incorrect in some circumstances to store the end iterator beforehand, because that may no longer be the correct end iterator by the time you get to it.

newacct 2009-04-28 02:50:36

+1. Good point about the possibility of the list changing, which will only be handled correctly by the second code snippet.

j_random_hacker 2009-04-28 11:09:32

if list is a std::vector, for example, changing it inside loop will invalidate all iterators, thus making both loops incorrect.

n0rd 2009-04-30 08:30:06

Answer 6

+1 A:

I always preferred the first one. Though with inline functions, compiler optimizations and relatively smaller container size ( in my case it's normally max 20-25 items) it really does not make any large difference with respect to performace.

const_iterator it = list.begin();
const_iterator endIt = list.end();

for(; it != endIt ; ++it)
{//do something
}

But recently I am using more of std::for_each wherever it's possible. Its optimized looping which helps to make the code look more readable than other two.

std::for_each(list.begin(), list.end(), Functor());

I will use the loop only when std::for_each cannot be used. (for ex: std::for_each does not allow you to break the loop unless an exception is thrown).

aJ 2009-04-28 02:58:43

I wasn't aware of this function. It seems like a function call will always have significant overhead compared with the built-in 'for', although it is very readable. Even with a macro implementation it couldn't be faster than a for loop.This makes me wish I could use python for this project (unfortunately my employer dictates c++).

Quantum7 2009-04-28 18:21:23

@Quantum7: Of course it can't be *faster* than a loop (because internally it is translated to a loop), but it is almost certainly no slower.

j_random_hacker 2009-04-30 03:21:04

Answer 7

+2 A:

Aah, people seem to be making guesses. Open up your code in the debugger & you will see that the calls to begin(), end() etc everything is optimized away. There is no need to use version 1. Tested with Visual C++ compiler fullopt.

2009-04-28 03:42:40

That's going to depend on the compiler, the container, and the optimization settings. Best to remove all doubt.

Mark Ransom 2009-04-28 03:48:20

It depends on the loop in question. I've found several cases in the past where MSVC++ doesn't optimise the second case into the first, even when it seems fairly obvious that it should.

Peter 2009-04-28 03:56:49

Answer 8

+4 A:

Consider this example:

for (const_iterator it = list.begin(); it != list.end(); ++list)
{
    if (moonFull())
        it = insert_stuff(list);
    else
        it = erase_stuff(list);
}

in this case, you NEED to call list.end() in the loop, and the compiler isn't going to optimize that away.

Other cases where the compiler can prove that end() is always returning the same value, optimization can take place.

If we are talking about STL containers, than i think any good compiler can optimize away multiple end() calls when multiple end() calls is not needed for the programming logic. However, if you have a custom container and implementation of end() is not in the same translation unit, than optimization will have to happen at link time. I know very little about link time optimization, but i'll bet most linkers will not do such optimization.

Shing Yip 2009-04-28 04:43:57

But if you're using an iterator to loop over the list, shouldn't you also use the iterator to modify the list? Otherwise you could get weird concurrency issues as the data and iterator get out of sync. Maybe that's a bigger issue in Java where iterators have more substance.

Quantum7 2009-04-28 18:56:17

Yes, you're correct. It will make more sense to write insert__stuff(it, list)...but the point i was trying to get across was the fact that list could change within the loop and list.end() must be called for every loop.

Shing Yip 2009-04-28 19:19:26

+1. Your point about inlining happening only when the definition of end() appears in the same translation unit makes perfect sense to me. I wonder if that's what others are experiencing when they complain about the compiler missing "obvious" inlining opportunities...?

j_random_hacker 2009-04-30 03:15:43

Answer 9

+7 A:

I'll just mention for the record that the C++ standard mandates that calling begin() and end() on any container type (be it vector, list, map etc.) must take only constant time. In practice, these calls will almost certainly be inlined to a single pointer comparison if you compile with optimisations turned on.

Note that this guarantee does not necessarily hold for additional vendor-supplied "containers" that do not actually obey the formal requirements of being a container laid out in the chapter 23 of the standard (e.g. the singly-linked list slist).

j_random_hacker 2009-04-28 11:24:36

++ From what I hear and see, the iterators sometimes/often don't get inlined to call-free code, and even if they take constant time, the time can be downright piggy. Of course, that may be fine, until you get into a stress situation, and then it can be your biggest cost. Moral: be aware of the possibility.

Mike Dunlavey 2009-04-29 13:54:22

@Mike: Shing Yip makes a good point -- inlining can only realistically happen for functions included in the same translation unit (e.g. via a header file). I'd be intrigued to see a code snippet where an STL container's end() is (reproducibly) not being inlined by a recent (< 5 years old) compiler.

j_random_hacker 2009-04-30 04:59:38

Answer 10

+1 A:

Sample it under stress conditions and see if you're in ** this code very often ***.
If not, it doesn't matter.
If you are, look at the disassembly, or single-step it.
That's how you can tell which is faster.

You gotta be careful of these iterators.
They may get optimized into nice machine code, but often enough they don't, and become time hogs.

** (Where "in" means actually in it, or being called from it.)

*** (Where "often" means a significant percentage of the time.)

ADDED: Don't just see how many times per second the code is executed. It could be 1,000 times a second and still be using less than 1% of the time.

Don't time how long it takes either. It could take a millisecond and still be using less than 1% of the time.

You could multiply the two, to get a better idea, but that only works if they're not too skewed.

Sampling the call stack will tell you if it uses a high enough percentage of time to matter.

Mike Dunlavey 2009-04-28 11:41:41

Answer 11

+2 A:

See http://stackoverflow.com/questions/755347/are-constiterators-faster for further discussion and benchmark results.

anon 2009-04-28 11:50:35

Answer 12

+1 A:

The compiler might be able to optimize the second into the first, but that assumes that the two are equivalent, i.e. end() actually is constant. A slightly more problematic issue is that the compiler may be unable to deduce that the end iterator is constant due to possible aliasing. However, assuming that the call to end() is inlined, the difference is just a memory load.

Note that this assumes that the optimizer is enabled. If the optimizer is not enabled, as is often done in debug builds, then the second formulation will involve N-1 more function calls. In current versions of Visual C++, debug builds will also incur additional hits due to function prolog/epilog checks and heavier debug iterators. Therefore, in STL heavy code, defaulting to the first case can prevent code from being disproportionately slow in debug builds.

Insertion and removal within the loop are a possibility, as others have pointed out, but with this style of loop I find that unlikely. For one thing, node-based containers -- list, set, map -- don't invalidate end() on either operation. Second, the iterator increment frequently has to be moved in-loop to avoid invalidation problems:

   // assuming list -- cannot cache end() for vector
   iterator it(c.begin()), end(c.end());
   while(it != end) {
       if (should_remove(*it))
           it = c.erase(it);
       else
           ++it;
   }

Thus, I consider a loop that claims to call end() for mutate-during-loop reasons and still has ++it in the loop header to be suspect.

Avery Lee 2009-04-28 22:30:50

ansaurus

tags:

views:

answers:

C++ iterators & loop optimization

related questions