views:

900

answers:

7

This may seem frivolous to some of you, but which of the following 2 methods of iteration over a STL container is better? Why?

class Elem;
typedef vector<Elem> ElemVec;
ElemVec elemVec;

// Method 0
for (ElemVec::iterator i = elemVec.begin(); i != elemVec.end(); ++i)
{
    Elem& e = *i;
    // Do something
}

// Method 1
for (int i = 0; i < elemVec.size(); ++i)
{
    Elem& e = elemVec.at(i);
    // Do something
}

Method 0 seems like cleaner STL, but Method 1 achieves the same with lesser code. Simple iteration over a container is what appears all over the place in any source code. So, I'm inclined to pick Method 1 which seems to reduce visual clutter and code size.

PS: I know iterators can do much more than a simple index. But, please keep the reply/discussion focused on simple iteration over a container like shown above.

+11  A: 

The first version works with any container and so is more useful in template functions that take any container a s a parameter. It is also conceivably slightly more efficient, even for vectors.

The second version only works for vectors and other integer-indexed containers. It'd somewhat more idiomatic for those containers, will be easily understood by newcomers to C++, and is useful if you need to do something else with the index, which is not uncommon.

As usual, there is no simple "this one is better" answer, I'm afraid.

anon
Thanks Neil. My code isn't usually working with templates but vectors whose type is known. Could you elaborate about why Method 0 is more efficient in your answer?
Ashwin
It may be slightly more efficient if the iterator is actually implemented directly as a pointer. Note use of words "may" and "slightly" - you need to measure to be sure.
anon
Yes, indeed the iterator is nothing more than a pointer for most containers. But, how does that make the code faster? AFAIK even Method 1 ends up being pointer arithmetic, isn't it?
Ashwin
@ash yes, but there is the arithmetic to do (ptr+index) as well as (index++), not to mention that [] may be a function call if it has not been inlined. But like I said - you need to measure.
anon
For the record I've never seen a measurable difference in performance.
Robert Gould
The indexed (at) method 0 will be considerably slower for a list, correct? If it's implemented as a linked list it's O(N).
Andrew Jaffe
@andrew correct - I keep forgetting list supports indexed access, as I can't imagine anyone being mad enough to use it :-)
anon
@neil actually, I don't think list does implement at() or operator[], after checking the docs online. But even size() is O(N).
Andrew Jaffe
@andrew You are right - shows how often I use list i.e. never!
anon
Yeah my no performance difference applies to vector versus vector only. Anyways I do use maps for performance oblivious code, in that case only option is method 0
Robert Gould
+4  A: 

Some more advantages of method 0:

  • if you move from vector to another container the loop remains the same,
  • easy to move from iterator to const_iterator if you need,
  • when c++0x will arive, auto typing will reduce some of the code clutter.

The main disadvantage is that in many cases you scan two containers, in which case an index is cleaner than keeping two iterators.

David Lehavi
Thanks David. I guess you meant Method 0 ;-)
Ashwin
Yes David, so could you please edit your answer to reflect that? Thanks :-)
Ashwin
Done. Sorry to edit your post, but it confused me. ;)
jalf
+7  A: 

If you don't mind a (very?) small loss of efficiency, i'd recommend using Boost.Foreach

BOOST_FOREACH( Elem& e, elemVec )
{
    // Your code
}
Benoît
I'm mostly a Boost illiterate right now. Thanks for this tip Benoît! This is a keeper :-)
Ashwin
+1, Benoît, I have loops all over the place and BOOST_FOREACH keeps me sane after using other languages with true foreach support
Phil Nash
C++ has true foreach support too: std::for_each. The syntax is just a bit different ;)
jalf
Jalf: STL has the for_each, but that is hardly the way most loops are used, since it requires another function to be defined.
Ashwin
When lambda comes with C++09 stl::foreach will be quite useful
Robert Gould
yeah c++1x will contains a native for-each loop anyway for(Elem working even on arrays without those begin/end messes
Johannes Schaub - litb
I'm personally against using macros in C++... IMHO very dangerous practice...
bgee
+4  A: 

Method 0 is faster and therefore recommended.

Method 1 uses size() which is allowed to be O(1), depending on the container and the stl implementation.

Stefan
Thanks Stefan, I had forgotten about size() :-)
Ashwin
Don't you mean `O(N)`?
Andrew Jaffe
size() should be O(1) in a standard compliant vector. And it is not less efficient than v.end() as it will probably be inlined. Efficiency here is just the same (no more than a couple of instructions difference per access)
David Rodríguez - dribeas
+1  A: 

It depends on which type of container. For a vector, it probably doesn't matter which you use. Method 0 has become more idiomatic, but their isn't a big difference, as everyone says.

If you decided to use a list, instead, method 1 would, in principle, be O(N), but in fact there is no list at() method, so you can't even do it that way. (So at some level your question stacks the deck.)

But that in itself is another argument for method 0: it uses the same syntax for different containers.

Andrew Jaffe
+2  A: 

Coincidentally I made a speed test recently (filling 10 * 1024 * 1024 ints with rand() ).
These are 3 runs, time in nano-seconds

vect[i] time      : 373611869  
vec.at(i) time    : 473297793  
*it = time        : 446818590  
arr[i] time       : 390357294  
*ptr time         : 356895778

UPDATE : added stl-algorithm std::generate, which seems to run the fastest, because of special iterator-optimizing (VC++2008). time in micro-seconds.

vect[i] time      : 393951
vec.at(i) time    : 551387
*it = time        : 596080
generate = time   : 346591
arr[i] time       : 375432
*ptr time         : 334612

Conclusion : Use standard-algorithms, they might be faster than a explicit loop ! (and also good practice)

Update : the above times were in a I/O-bound situation, I did the same tests with a CPU-bound (iterate over a relatively short vector, which should fit in cache repeatedly, multiply each element by 2 and write back to vector)

//Visual Studio 2008 Express Edition
vect[i] time      : 1356811
vec.at(i) time    : 7760148
*it = time        : 4913112
for_each = time   : 455713
arr[i] time       : 446280
*ptr time         : 429595

//GCC
vect[i] time      : 431039
vec.at(i) time    : 2421283
*it = time        : 381400
for_each = time   : 380972
arr[i] time       : 363563
*ptr time         : 365971

Interestingly iterators and operator[] is considerably slower in VC++ compared to for_each (which seems to degrade the iterators to pointers through some template-magic for performance).
In GCC access times are only worse for at(), which is normal, because it's the only range-checked function of the tests.

qwerty
Barely any statistical difference. Anything about 10% won't have an impact when an actual program is actually in use unless this is in a freqiently used tight loop. A cache miss, and the time would be equal
Robert Gould
If you define #define _SECURE_SCL 0#define _SECURE_SCL_THROWS 0there will be no difference between pointer and iterator performance.
Vadim Ferderer
+1  A: 

Method 0, for several reasons.

  • It better expresses your intent, which aids compiler optimizations as well as readability
  • Off-by-one errors are less likely
  • It works even if you replace the vector with a list, which doesn't have operator[]

Of course, the best solution will often be solution 2: One of the std algorithms. std::for_each, std::transform, std::copy or whatever else you need. This has some further advantages:

  • It expresses your intent even better, and allows some significant compiler optimizations (MS's secure SCL performs bounds checking on your methods 0 and 1, but will skip it on std algorithms)
  • It's less code (at the call site, at least. Of course you have to write a functor or something to replace the loop body, but at the use site, the code gets cleaned up quite a bit, which is probably where it matters most.

In general, avoid overspecifying your code. Specify exactly what you want done, and nothing else. The std algorithms are usually the way to go there, but even without them, if you don't need the index i, why have it? Use iterators instead, in that case.

jalf