views:

646

answers:

6

Which is the most efficient way to traverse a collection?

List<Integer>  a = new ArrayList<Integer>();
for (Integer integer : a) {
  integer.toString();
}

or

List<Integer>  a = new ArrayList<Integer>();
for (Iterator iterator = a.iterator(); iterator.hasNext();) {
   Integer integer = (Integer) iterator.next();
   integer.toString();
}

Please note, that this is not an exact duplicate of this, this, this, or this, although one of the answers to the last question comes close. The reason that this is not a dupe, is that most of these are comparing loops where you call get(i) inside the loop, rather than using the iterator.

As suggested on Meta I will be posting my answer to this question.

+20  A: 

If you are just wandering over the collection to read all of the values, then there is no difference to using an iterator or the new for loop syntax, since the new syntax just uses the iterator underwater.

If however, you mean by loop the old "c-style" loop:

for(int i=0; i<list.size(); i++) {
   Object o = list.get(i);
}

Then the new for loop, or iterator, can be a lot more efficient, depending on the underlying data structure. The reason for this is that for some data structures, get(i) is an O(n) operation, which makes the loop O(n2) operation. All iterators have as a fundamental requirement that next() should be an O(1) operation, making the loop O(n).

To verify that the iterator is used underwater by the new for loop syntax, compare the generated bytecodes from the following Java, first the for loop:

List<Integer>  a = new ArrayList<Integer>();
for (Integer integer : a)
{
  integer.toString();
}
// Byte code
 ALOAD 1
 INVOKEINTERFACE java/util/List.iterator()Ljava/util/Iterator;
 ASTORE 3
 GOTO L2
L3
 ALOAD 3
 INVOKEINTERFACE java/util/Iterator.next()Ljava/lang/Object;
 CHECKCAST java/lang/Integer
 ASTORE 2 
 ALOAD 2
 INVOKEVIRTUAL java/lang/Integer.toString()Ljava/lang/String;
 POP
L2
 ALOAD 3
 INVOKEINTERFACE java/util/Iterator.hasNext()Z
 IFNE L3

And now for the iterator:

List<Integer>  a = new ArrayList<Integer>();
for (Iterator iterator = a.iterator(); iterator.hasNext();)
{
  Integer integer = (Integer) iterator.next();
  integer.toString();
}
// Bytecode:
 ALOAD 1
 INVOKEINTERFACE java/util/List.iterator()Ljava/util/Iterator;
 ASTORE 2
 GOTO L7
L8
 ALOAD 2
 INVOKEINTERFACE java/util/Iterator.next()Ljava/lang/Object;
 CHECKCAST java/lang/Integer
 ASTORE 3
 ALOAD 3
 INVOKEVIRTUAL java/lang/Integer.toString()Ljava/lang/String;
 POP
L7
 ALOAD 2
 INVOKEINTERFACE java/util/Iterator.hasNext()Z
 IFNE L8

As you can see, the generated byte code is identical, so there is no performance penalty to using either form. Therefore, you should choose the form of loop that is most aesthetically appealing to you, for most people that will be the for-each loop, as that has a lot less boilerplate code.

Paul Wagland
"Then it can be a lot more efficient, depending on the underlying data structure." - it's unclear why foo.get(i) will be more efficient than the iterator approach, care to explain?
Mark E
@Mark: I have updated the answer to clarify that the foo.get(i) would not be more efficient. Thanks for the heads up.
Paul Wagland
Thanks -- the original answer suggested the opposite of your clarification, what you've written makes far more sense.
Mark E
I believe he was saying the opposite, that foo.get(i) can be a lot less efficient. Think of LinkedList. If you do a foo.get(i) on the middle of a LinkedList it has to traverse all the previous nodes to get to i. An iterator, on the other hand, will keep a handle to the underlying data structure and will allow you to walk over the nodes one at a time.
Michael Krauklis
@Michael: The answer as it currently stands makes that clear, originally however it was not so well worded, I updated it after Mark's comment to clarify the answer.
Paul Wagland
A: 

I don't think there is a difference but I've never done a technical comparisons. Here is my logic. The for (Integer integer : a) syntax utilizes the iterator, so in effect this should be the same code... The cast, however, might slow down the second one slightly...

Frank V
the foreach also does the cast.
Carlos Heuberger
+1  A: 

Your first snippet is just syntactic sugar for the second. You'll get the same code.

Laurence Gonsalves
+3  A: 

The difference isn't in performance, but in capability. When using a reference directly you have more power over explicitly using a type of iterator (e.g. List.iterator() vs. List.listIterator(), although in most cases they return the same implementation). You also have the ability to reference the Iterator in your loop. This allows you to do things like remove items from your collection without getting a ConcurrentModificationException.

e.g.

This is ok:

Set<Object> set = new HashSet<Object>();
// add some items to the set

Iterator<Object> setIterator = set.iterator();
while(setIterator.hasNext()){
     Object o = setIterator.next();
     if(o meets some condition){
          setIterator.remove();
     }
}

This is not, as it will throw a concurrent modification exception:

Set<Object> set = new HashSet<Object>();
// add some items to the set

for(Object o : set){
     if(o meets some condition){
          set.remove(o);
     }
}
Michael Krauklis
This is very true, even though it doesn't directly answer the question I have given it +1 for being informative, and answering the logical follow-on question.
Paul Wagland
+2  A: 

To expand on Paul's own answer, he has demonstrated that the bytecode is the same on that particular compiler (presumably Sun's javac?) but different compilers are not guaranteed to generate the same bytecode, right? To see what the actual difference is between the two, let's go straight to the source and check the Java Language Specification, specifically 14.14.2, "The enhanced for statement":

The enhanced for statement is equivalent to a basic for statement of the form:

for (I #i = Expression.iterator(); #i.hasNext(); ) {
    VariableModifiers(opt) Type Identifier = #i.next();    
    Statement 
}

In other words, it is required by the JLS that the two are equivalent. In theory that could mean marginal differences in bytecode, but in reality the enhanced for loop is required to:

  • Invoke the .iterator() method
  • Use .hasNext()
  • Make the local variable available via .next()

So, in other words, for all practical purposes the bytecode will be identical, or nearly-identical. It's hard to envisage any compiler implementation which would result in any significant difference between the two.

Cowan
Actually, the test I did was with the Eclipse compiler, but your general point still stands. +1
Paul Wagland
+1  A: 

Does it really matter...? The first one is shorter and more readable, that is all I would care about in this situation.

Kristian