There are apparently many ways to iterate over a collection. Curious if there are any differences, or why you'd use one way over the other.

First type:

List<string> someList = <some way to init>
foreach(string s in someList) {
   <process the string>
}

Other Way:

List<string> someList = <some way to init>
someList.ForEach(delegate(string s) {
    <process the string>
});

I suppose off the top of my head, that instead of the anonymous delegate I use above, you'd have a reusable delegate you could specify...
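
To make the reusable-delegate idea concrete, here is a minimal sketch. UpperCollector and NamedDelegateSketch are hypothetical names of mine, not anything from a library; the point is only that one named method can be handed to ForEach and also called from a plain foreach.

```csharp
using System.Collections.Generic;

// Hypothetical helper; the names are illustrative only.
public sealed class UpperCollector
{
    public readonly List<string> Results = new List<string>();

    // Any method that takes a string and returns void can serve as an Action<string>.
    public void Add(string s)
    {
        Results.Add(s.ToUpperInvariant());
    }
}

public static class NamedDelegateSketch
{
    public static List<string> Run(List<string> someList)
    {
        UpperCollector collector = new UpperCollector();

        // Method group passed where List<T>.ForEach expects an Action<string>.
        someList.ForEach(collector.Add);

        // The same method called from an ordinary foreach loop.
        foreach (string s in someList)
        {
            collector.Add(s);
        }

        return collector.Results;
    }
}
```

Both loops end up calling the same code, so reuse by itself is not an argument for either form.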

+5  A: 

I guess the someList.ForEach() call could be easily parallelized, whereas the normal foreach is not that easy to run in parallel. You could easily run several different delegates on different cores, which is not that easy to do with a normal foreach. Just my 2 cents.

Joachim Kerschbaumer
What code would you add to "easily parallelize" it?
Anthony
I think he meant that the runtime engine could parallelize it automatically. Otherwise, both foreach and .ForEach can be parallelized by hand using a thread from the pool in each action delegate
Isak Savo
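
For what it's worth, neither form runs in parallel on its own. One way to do it explicitly is Parallel.ForEach from System.Threading.Tasks (available from .NET 4). That is a different technique from the hand-rolled thread pool approach mentioned above, and the sketch below assumes the per-item work is independent. ParallelSketch and TotalLength are made-up names.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelSketch
{
    // Sums the lengths of all strings, letting the runtime spread the work over threads.
    public static int TotalLength(List<string> someList)
    {
        int total = 0;

        Parallel.ForEach(someList, s =>
        {
            // The body may run on several threads at once,
            // so the shared counter is updated atomically.
            Interlocked.Add(ref total, s.Length);
        });

        return total;
    }
}
```

Note that the order in which items are processed is not guaranteed here, unlike both forms in the question.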
A: 

You could name the anonymous delegate :-)

And you can write the second as:

someList.ForEach(s => s.ToUpper())

Which I prefer, and saves a lot of typing.

As Joachim says, parallelism is easier to apply to the second form.

Craig.Nicol
A: 

The second way you showed uses the List<T>.ForEach method (an instance method of List<T>, not an extension method) to execute the delegate for each of the elements in the list.

This way, you have another delegate (=method) call.

Additionally, there is the possibility to iterate the list with a for loop.

EFrank
+1  A: 

Behind the scenes, the anonymous delegate gets turned into an actual method so you could have some overhead with the second choice if the compiler didn't choose to inline the function. Additionally, any local variables referenced by the body of the anonymous delegate example would change in nature because of compiler tricks to hide the fact that it gets compiled to a new method. More info here on how C# does this magic:

http://blogs.msdn.com/oldnewthing/archive/2006/08/04/688527.aspx

jezell
+1  A: 

Here is a post that compares the foreach vs for (Also read the comments of this post)

rudigrobler
That's a dead link for me today.
Anthony
+1  A: 

Consider the following article on list.foreach performance.

kenny
Excellent article, especially useful comment on Array.ForEach.
Dave Van den Eynde
+1  A: 

One thing to be wary of is how to exit from the Generic .ForEach method - see this discussion. Although the link seems to say that this way is the fastest. Not sure why - you'd think they would be equivalent once compiled...

Chris Kimpton
Excellent counterpoint!
Dave Van den Eynde
+13  A: 

List<T>.ForEach() is slightly faster—it accesses the internal array by index whereas a foreach statement uses an Enumerator:

  var list1 = Enumerable.Repeat(1,10000000).ToList();
  var list2 = new List<int>(); 
  var list3 = new List<int>();
  var sw    = new System.Diagnostics.Stopwatch();


  sw.Start();
  foreach(var x in list1) list2.Add(x);
  sw.Stop();
  Console.WriteLine(sw.ElapsedMilliseconds);

  sw.Reset();
  sw.Start();
  list1.ForEach(x => list3.Add(x));
  sw.Stop();
  Console.WriteLine(sw.ElapsedMilliseconds);

The above test gives these answers, give or take a few milliseconds:

280
230

Depending on how you vary the test, the differences may be bigger or smaller but the List<T>.ForEach() is consistently faster, even if negligibly so.

Mark Cidade
This is untrue, at least on my machine. Were you by any chance running in Debug mode? The `ForEach` version needs to call through a delegate each time, whereas the `List<T>` enumerator's `MoveNext` and `Current` members should be inlined by the JIT compiler, making the `foreach` loop faster.
kvb
+1  A: 

List.ForEach() is considered to be more functional.

List.ForEach() says what you want done. foreach(item in list) also says exactly how you want it done. This leaves List.ForEach free to change the implementation of the how part in the future. For example, a hypothetical future version of .NET might always run List.ForEach in parallel, under the assumption that at this point everyone has a number of CPU cores that are generally sitting idle.

On the other hand, foreach (item in list) gives you a little more control over the loop. For example, you know that the items will be iterated in some kind of sequential order, and you could easily break in the middle if an item meets some condition.
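
A small sketch of the control-flow difference, with hypothetical names (EarlyExitSketch and its two methods): break ends a foreach loop, while return inside the ForEach lambda only ends the call for the current item.

```csharp
using System.Collections.Generic;

public static class EarlyExitSketch
{
    // foreach: break stops the loop at the first match.
    public static int VisitedWithForeach(List<int> list, int target)
    {
        int visited = 0;
        foreach (int item in list)
        {
            visited++;
            if (item == target) break;
        }
        return visited;
    }

    // List<T>.ForEach: return leaves only the lambda for the current item
    // (much like continue), so the remaining items are still visited.
    public static int VisitedWithForEach(List<int> list, int target)
    {
        int visited = 0;
        list.ForEach(item =>
        {
            visited++;
            if (item == target) return;
        });
        return visited;
    }
}
```

With the list 1, 2, 3, 4 and a target of 2, the first method visits two items and the second visits all four.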

Joel Coehoorn
+28  A: 

There is one important, and useful, distinction between the two.

Because .ForEach uses a for loop to iterate the collection, this is valid:

someList.ForEach(x => { if (x.RemoveMe) someList.Remove(x); });

whereas foreach uses an enumerator, so this is not valid:

foreach(var item in someList)
  if(item.RemoveMe) someList.Remove(item);
Will
Even then, you should use someList.RemoveAll(x => x.RemoveMe) instead.
Mark Cidade
With Linq, all things can be done better. I was just showing an example of modifying the collection within foreach...
Will
RemoveAll() is a method on List<T>.
Mark Cidade
Noted! Anyhow, modifying a collection during enumeration is verboten, whether you're removing or adding to it.
Will
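
A sketch of the safer alternatives from the comments above. Item is a hypothetical element type with a RemoveMe flag, and RemovalSketch is a made-up name. As far as I know, later .NET versions added a version check to List<T>.ForEach as well, so removing items inside it can also throw InvalidOperationException there; either approach below avoids the question entirely.

```csharp
using System.Collections.Generic;

// Hypothetical element type, assumed only for this example.
public sealed class Item
{
    public string Name;
    public bool RemoveMe;
}

public static class RemovalSketch
{
    // List<T>.RemoveAll removes every match in one pass and returns how many were removed.
    public static int RemoveFlagged(List<Item> someList)
    {
        return someList.RemoveAll(x => x.RemoveMe);
    }

    // Manual alternative: walk backwards by index so removals do not shift unvisited items.
    public static void RemoveFlaggedByIndex(List<Item> someList)
    {
        for (int i = someList.Count - 1; i >= 0; i--)
        {
            if (someList[i].RemoveMe) someList.RemoveAt(i);
        }
    }
}
```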
+9  A: 

For fun, I popped List into Reflector and this is the resulting C#:

public void ForEach(Action<T> action)
{
    if (action == null)
    {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.match);
    }
    for (int i = 0; i < this._size; i++)
    {
        action(this._items[i]);
    }
}

Similarly, the MoveNext in Enumerator which is what is used by foreach is this:

public bool MoveNext()
{
    if (this.version != this.list._version)
    {
        ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_EnumFailedVersion);
    }
    if (this.index < this.list._size)
    {
        this.current = this.list._items[this.index];
        this.index++;
        return true;
    }
    this.index = this.list._size + 1;
    this.current = default(T);
    return false;
}

The List.ForEach is much more trimmed down than MoveNext, with far less processing, and will more likely JIT into something efficient.

In addition, foreach creates a new Enumerator every time. For a List<T> variable that enumerator is a struct, so it does not by itself produce garbage; it only becomes a heap object when the list is iterated through an interface such as IEnumerable<T>. In that case, doing the same foreach repeatedly will make more throwaway objects, as opposed to reusing the same delegate, but this is really a fringe case. In typical usage you will see little or no difference.

plinth
You have no guarantee that the code generated by foreach will be the same between compiler versions. The code generated may be improved by a future version.
Anthony
+4  A: 

We had some code here (in VS2005 and C#2.0) where the previous engineers went out of their way to use list.ForEach( delegate(item) { foo;}); instead of foreach(item in list) {foo; }; for all the code that they wrote. e.g. a block of code for reading rows from a dataReader.

I still don't know exactly why they did this.

The drawbacks of list.ForEach() are:

  • It is more verbose in VS2005 and C# 2.0. However, in C# 3, you can use the "=>" syntax to make some nicely terse expressions.

  • It is less familiar. People who have to maintain this code will wonder why you did it that way. It took me a while to decide that there wasn't any reason, except maybe to make the writer seem clever (the quality of the rest of the code undermined that). It was also less readable, with the "})" at the end of the delegate code block.

  • See also Bill Wagner's book "Effective C#: 50 Specific Ways to Improve Your C#" where he talks about why foreach is preferred to other loops like for or while loops - the main point is that you are letting the compiler decide the best way to construct the loop. If a future version of the compiler manages to use a faster technique, then you will get this for free by using foreach and rebuilding, rather than changing your code.

  • A foreach(item in list) construct allows you to use break or continue if you need to exit the loop or skip to the next iteration. But you cannot alter the list inside a foreach loop.

I'm surprised to see that list.ForEach is slightly faster. But that's probably not a valid reason to use it throughout; that would be premature optimisation. If your application uses a database or web service, then that, not loop control, is almost always going to be where the time goes.

I disagree that the list.ForEach(delegate) version is "more functional". This might look superficially more like how a functional language would do it, but there's no big difference in what happens.

I don't think that foreach(item in list) "says exactly how you want it done" - a for(int i = 0; i < count; i++) loop does that, a foreach loop leaves the choice of control up to the compiler.

My feeling now would be, on a new project, to use foreach(item in list) for most loops in order to adhere to the common usage and for readability, and use list.ForEach() only for short blocks, when you can do something more elegantly or compactly with the C# 3 "=>" operator. In cases like that, there may already be a LINQ extension method that is preferable to ForEach().

Anthony
No idea why it was a -1 as I find it an excellent answer.
Dave Van den Eynde
+5  A: 

I know two obscure-ish things that make them different. Go me!

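Firstly, there's the classic bug of making a delegate for each item in the list. If you use the foreach keyword, all your delegates can end up referring to the last item of the list (this applies to C# 4 and earlier; from C# 5 on, the foreach loop variable is a fresh variable in each iteration):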

    // A list of actions to execute later
    List<Action> actions = new List<Action>();

    // Numbers 0 to 9
    List<int> numbers = Enumerable.Range(0, 10).ToList();

    // Store an action that prints each number (WRONG!)
    foreach (int number in numbers)
        actions.Add(() => Console.WriteLine(number));

    // Run the actions, we actually print 10 copies of "9"
    foreach (Action action in actions)
        action();

    // So try again
    actions.Clear();

    // Store an action that prints each number (RIGHT!)
    numbers.ForEach(number =>
        actions.Add(() => Console.WriteLine(number)));

    // Run the actions
    foreach (Action action in actions)
        action();

The List.ForEach method doesn't have this problem. The current item of the iteration is passed by value as an argument to the outer lambda, and then the inner lambda correctly captures that argument in its own closure. Problem solved.

(Sadly I believe ForEach is a member of List, rather than an extension method, though it's easy to define it yourself so you have this facility on any enumerable type.)
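
Something along these lines would do it. This is a sketch, not a framework method, and the class name EnumerableExtensionsSketch is my own.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableExtensionsSketch
{
    // Hypothetical extension: runs the action once per element of any IEnumerable<T>.
    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (action == null) throw new ArgumentNullException("action");

        foreach (T item in source)
        {
            action(item);
        }
    }
}
```

Calling ForEach on an actual List<T> still binds to the built-in instance method, because instance methods take precedence over extension methods; the extension only kicks in for other enumerable types.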

Secondly, the ForEach method approach has a limitation. If you are implementing IEnumerable by using yield return, you can't do a yield return inside the lambda. So looping through the items in a collection in order to yield return things is not possible by this method. You'll have to use the foreach keyword and work around the closure problem by manually making a copy of the current loop value inside the loop.
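
A minimal sketch of that workaround, using hypothetical names (YieldSketch, Doublers): the iterator uses the foreach keyword so that yield return is legal, and copies the loop value so each lambda captures its own variable.

```csharp
using System;
using System.Collections.Generic;

public static class YieldSketch
{
    // An iterator method: yield return is allowed in the foreach body,
    // but would not compile inside a lambda passed to List<T>.ForEach.
    public static IEnumerable<Func<int>> Doublers(List<int> numbers)
    {
        foreach (int number in numbers)
        {
            // Per-iteration copy, so every lambda closes over a distinct variable.
            int copy = number;
            yield return () => copy * 2;
        }
    }
}
```

The copy is only strictly needed on compilers where the foreach variable is shared across iterations, but it is harmless elsewhere.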

More here

Daniel Earwicker
A: 

Eric Lippert has a blog post about the exact problem.

Yacoder