views:

216

answers:

7

Generally speaking is it a good idea to cache an end iterator (specifically STL containers) for efficiency and speed purposes? such as in the following bit of code:

std::vector<int> vint;
const std::vector<int>::const_iterator end = vint.end();
std::vector<int>::iterator it = vint.begin();

while (it != end)
{
   ....
   ++it;
}

Under what conditions would the end value be invalidated? would erasing from the container cause end to be invalidated in all STL containers or just some?

+2  A: 

Erasing from a container over which you are currently iterating is always a bad idea. The actual caching of your end iterator is not going to change that.

h.

haavee
Not at all, if you do it properly. The 'erase' member of std::vector, for instance, returns an appropriate new iterator. So as long as you write your code correctly (that is, remember that other iterators are now invalid) you will have no problems erasing from a container you are iterating over.
Michael Kohne
+6  A: 

If we are talking about efficiency and speed: caching the end iterator is unnecessary because of compiler optimizations and inlining.

Sergius
Is it really ? Most of the advices I have seen preaching for using the STL algorithms (like `for_each`) were that it would cache the `end` iterator so it doesn't get computed at each iteration.
Matthieu M.
@Matthieu: iterators are not guaranteed to survive modification of some containers (e.g. resizing vector invalidates *all* iterators to the vector). for_each doesn't give function object access to the iterator, the container can't be changed, thus it is OK to cache the iterator. but if you trick the systems and from the function object modify the container in a way that invalidates the iterators, then obviously it would lead to *undefined behavior*, likely crash, when accessing container over the now invalid iterator.
Dummy00001
Actually, resizing a vector only invalidates the iterators if a reallocation occurs, and you can make sure it doesn't if you resize only up to the capacity. But the point of my question remains, I am unsure that the compiler is able to optimize the computation of `end()` out of the loop, even if inlined the compiler as a priori no way of knowing if the value is likely to get modified by the operations within the loop as far as I know. But then I don't know much about compiler optmizations, thus my question :)
Matthieu M.
@Matthieu: You can take a look at generated code for your loop (http://stackoverflow.com/questions/840321/how-can-i-see-the-assembly-code-for-a-c-program). There you will find the difference between various approaches.
Sergius
+3  A: 

I often use this style for iterating containers:

// typedef std::vector<Person> Persons;
Persons::iterator it = persons.begin(), end = persons.end();
for (; it != end; ++it)
{
    Person & person = *it;
    // ...
}

Erasing an element from a vector invalidates all iterators after the erased position.

I'm not certain about the other container types. In any case I think it would be safe to assume that all iterators become invalid after an erase. If you really need very specific information then you can always look it up. I rarely need this because because of my rather conservative coding style.

StackedCrooked
+3  A: 

In general it should not matter if you cache the end iterator. If you feel it does matter, then you should already be using a profiler on your code and would be able to profile both variations. I'd suspect it could differ depending upon the container type - but a profiler would be the only way to know for sure given your compiler, optimizations and STL vendor.

Stephen Nutt
+4  A: 

Generally speaking is it a good idea to cache an end iterator (specifically STL containers) for efficiency and speed purposes?

If you use the STL container algorithms the caching of the end iterator is happens anyway (as you pass the result of container.end() as a parameter).

If you modify the memory of the container (insert/remove elements) it's a bad bad idea.

Also, caching for efficiency rarely makes much sense: in most cases the end() is inlined by the compiler, and when it is not, it is very probable that your efficiency doesn't hang on the end() result being cached. YMMV though.

utnapistim
+9  A: 

In the simple case of a vector, the end iterator will change when you add or remove elements from the container; though, it's usually safest to assume that if you mutate the container while iterating over it, all iterators to it become invalid. Iterators may be implemented differently in any given STL implementation.

With regard to caching the end iterator -- it's certainly valid to cache it, but to find out if it is actually faster in your case, the best bet is for you to profile your code and see. While retrieving the end iterator from a vector is likely a fast implementation with a recent STL library and compiler, I have worked on past projects where caching the end iterator gave us a significant speed boost. (This was on the PlayStation 2, so do take with a grain of salt.)

Blair Holloway
+3  A: 

The invalidation rules (for iterators) is defined very explicitly for each type of container. I find the SGI site very useful http://www.sgi.com/tech/stl/table_of_contents.html

Specifically for vectors I find:

[5] A vector's iterators are invalidated when its memory is reallocated. Additionally, inserting or deleting an element in the middle of a vector invalidates all iterators that point to elements following the insertion or deletion point. It follows that you can prevent a vector's iterators from being invalidated if you use reserve() to preallocate as much memory as the vector will ever use, and if all insertions and deletions are at the vector's end.

Martin York