ansaurus

Question

Answer 1

A:

If list is an array it's exactly the same. the second is translated to the first. However if list is a linked list, the second is much faster!

Clangon 2009-07-14 11:18:52

Not unless "list" is an array.

Jon Skeet 2009-07-14 11:19:19

...which is not

2009-07-14 11:21:41

@runtime: How do you know? We can't see the declaration of "list". It seems unlikely, but you don't know for sure.

Jon Skeet 2009-07-14 11:22:59

You're right, I just realized it's pseudo. I was trying to emphasize that the compiler generated code isn't the same.

2009-07-14 12:03:08

You're right, I thought it was an array, and in that case it really is exactly the same.

Clangon 2009-07-14 12:56:34

Anyone else want to vote me down? :(

Clangon 2009-07-14 13:09:18

-1 -> +1, since you edited.

Bastien Léonard 2009-07-14 15:36:26

If it's a linked list there's no indexer...

Jon Skeet 2009-07-17 06:42:31

Answer 2

+52 A:

Well, it partly depends on the exact type of list. It will also depend on the exact CLR you're using.

Whether it's in any way significant or not will depend on whether you're doing any real work in the loop. In almost all cases, the difference to performance won't be significant, but the difference to readability favours the foreach loop.

I'd personally use LINQ to avoid the "if" too:

foreach (var item in list.Where(condition))
{
}

EDIT: For those of you who are claiming that iterating over a List<T> with foreach produces the same code as the for loop, here's evidence that it doesn't:

static void IterateOverList(List<object> list)
{
    foreach (object o in list)
    {
        Console.WriteLine(o);
    }
}

Produces IL of:

.method private hidebysig static void  IterateOverList(class [mscorlib]System.Collections.Generic.List`1<object> list) cil managed
{
  // Code size       49 (0x31)
  .maxstack  1
  .locals init (object V_0,
           valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<object> V_1)
  IL_0000:  ldarg.0
  IL_0001:  callvirt   instance valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<!0> class [mscorlib]System.Collections.Generic.List`1<object>::GetEnumerator()
  IL_0006:  stloc.1
  .try
  {
    IL_0007:  br.s       IL_0017
    IL_0009:  ldloca.s   V_1
    IL_000b:  call       instance !0 valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<object>::get_Current()
    IL_0010:  stloc.0
    IL_0011:  ldloc.0
    IL_0012:  call       void [mscorlib]System.Console::WriteLine(object)
    IL_0017:  ldloca.s   V_1
    IL_0019:  call       instance bool valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<object>::MoveNext()
    IL_001e:  brtrue.s   IL_0009
    IL_0020:  leave.s    IL_0030
  }  // end .try
  finally
  {
    IL_0022:  ldloca.s   V_1
    IL_0024:  constrained. valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<object>
    IL_002a:  callvirt   instance void [mscorlib]System.IDisposable::Dispose()
    IL_002f:  endfinally
  }  // end handler
  IL_0030:  ret
} // end of method Test::IterateOverList

The compiler treats arrays differently, converting a foreach loop basically to a for loop, but not List<T>. Here's the equivalent code for an array:

static void IterateOverArray(object[] array)
{
    foreach (object o in array)
    {
        Console.WriteLine(o);
    }
}

// Compiles into...

.method private hidebysig static void  IterateOverArray(object[] 'array') cil managed
{
  // Code size       27 (0x1b)
  .maxstack  2
  .locals init (object V_0,
           object[] V_1,
           int32 V_2)
  IL_0000:  ldarg.0
  IL_0001:  stloc.1
  IL_0002:  ldc.i4.0
  IL_0003:  stloc.2
  IL_0004:  br.s       IL_0014
  IL_0006:  ldloc.1
  IL_0007:  ldloc.2
  IL_0008:  ldelem.ref
  IL_0009:  stloc.0
  IL_000a:  ldloc.0
  IL_000b:  call       void [mscorlib]System.Console::WriteLine(object)
  IL_0010:  ldloc.2
  IL_0011:  ldc.i4.1
  IL_0012:  add
  IL_0013:  stloc.2
  IL_0014:  ldloc.2
  IL_0015:  ldloc.1
  IL_0016:  ldlen
  IL_0017:  conv.i4
  IL_0018:  blt.s      IL_0006
  IL_001a:  ret
} // end of method Test::IterateOverArray

Interestingly, I can't find this documented in the C# 3 spec anywhere...

Jon Skeet 2009-07-14 11:18:57

Out of interest Jon, the scenario with List<T> above ... does that apply to other collections also? Also, how did you know this (with no malice intended whatsoever) ... as in .. did u literally stumble across this while trying to answer this question, previously some time ago? It's so ... random / secret :)

Pure.Krome 2009-07-14 11:56:31

I've been aware of the array optimisations for a while - arrays are a "core" kind of collection; the C# compiler is already deeply aware of them, so it makes sense for it to treat them differently. The compiler doesn't (and shouldn't) have any special knowledge of `List<T>`.

Jon Skeet 2009-07-14 11:59:18

Cheers :) and yeah ... array's were the first collection concept I was taught years and years ago at uni .. so it would make sence that the compiler is smart enough to deal with one of (if not the) most primitive type of collection. cheers again!

Pure.Krome 2009-07-14 12:06:03

I'm definitely speculating here, but I'd imagine the reason you're not finding it in the spec is because it's a compiler optimization rather than a language specification. If this is true, I'd say that the *real* answer is implementation dependent.

senfo 2009-07-14 13:12:10

@senfo: The strange thing is that the spec implies that the generated code *should* use the iterator too.Normally for something like this the spec explicitly says that it could be optimized or not.

Jon Skeet 2009-07-14 13:34:35

Compare this assessment: http://stackoverflow.com/questions/225937/foreach-vs-somelist-foreach/226094#226094

plinth 2009-07-14 14:13:56

Answer 3

A:

Dont worry about it- they are identical.

Do you really think that the team at microsoft are so silly that they would make for each that was slower than the equivalent for ?

mP 2009-07-14 11:19:42

It's not a case of silliness - it's a case of feasible efficiency. They're *not* identical... but the difference is very rarely going to be significant.

Jon Skeet 2009-07-14 11:21:34

Actually, because of the way foreach is implemented (using enumerators and such), it is trivially slower. For any real application it isn't much though.

Matthew Scharley 2009-07-14 11:22:03

The question was "Which one is better"? Not that "whether foreach is slower than for or not"

Aamir 2009-07-14 11:22:17

"Can you tell me which code snippet will give better perfomance?", the question was squarely about performance, ie. speed.

Matthew Scharley 2009-07-14 11:26:36

@js I'd bet the poster doesn't understand why enumerators are a bit heavier than using an index on a random accessable list. There seem to be a lot of these nonsense performace q from people who think someone minor syntactically equivalent constructs are amazingly faster. No matter what you say they would not appreciate.

mP 2009-07-14 14:02:50

My answer was given to keep things in perspective. Running the above code fragment once or a few times here and there in the overall scheme of things makes no diff, after all a zillion other things are happening inside the computer.

mP 2009-07-14 14:05:39

@mP: I'd say there's little point in providing perspective if you're going to be inaccurate at the same time. They're *not* identical, and claiming they are is simply misleading.

Jon Skeet 2009-07-14 15:22:47

@mP I disagree that this is a nonsense performance question. Sure, this sort of optimization may not be worth it most of the time, but asking the sort of question gives you a deeper understanding of the language.

cdmckay 2009-07-14 23:12:43

mP 2009-07-15 21:38:39

@mP: Sure - and if you'd just said that, I'm sure you wouldn't have been dowvoted.

Jon Skeet 2009-07-17 06:43:48

@JSIn the end, though my answer may literally be in accurate in some respect, its less helpful to info overload the newbie. At least with my answer they would forget worrying and move onto the next task. So many of the answers introduced more pointers but in the end as a newbie i would be more confused and unsure what i should do next. If one wants to get technical many of the answers are also wrong in way or another.

mP 2009-07-17 08:51:01

@JS - part 2/2In the end your boss wants you to finish your task, developer time is expensive and its just not worth wasting minutes, hours, days to ensure said loop executes a fraction of second quicker. Anyway given that its an List as in interface and not a Class one can never be sure what the performance characteristics of using an index or enumerator etc. Perhaps its more important to note that one shouldnt care. To start worrying about such things is completely against the point of interfaces and OO.

mP 2009-07-17 08:51:34

@JS - part3/3I would argue attempting to test for what type of List is being passed back is actually an anti pattern and one of the dumbest things to attempt.KISS.

mP 2009-07-17 08:52:22

Answer 4

+9 A:

Unless your app is very performance critical I would not worry about this. Far better to have clean and easily understandable code.

Fortyrunner 2009-07-14 11:23:51

Answer 5

+1 A:

Like other people have mentioned though the performance doesn't actually matter much, the foreach will always be a little bit slower because of the IEnumerable/IEnumerator usage in the loop. The compiler translates the construct into calls on that interface and for every step a function + a property are called in the foreach construct.

IEnumerator iterator = ((IEnumerable)list).GetEnumerator(); while (iterator.MoveNext()) { var item = iterator.Current; // do stuff }

This is the equivalent expansion of the construct in C#. You can imagine how the performance impact can vary based on the implementations of MoveNext and Current. Whereas in an array access, you don't have that dependencies.

Charles Prakash Dasari 2009-07-14 11:43:22

Don't forget there's a difference between an array access and an indexer access. If list is a `List<T>` here then there's still the hit (possibly inlined) of calling the indexer. It's not like it's a bare metal array access.

Jon Skeet 2009-07-14 11:47:23

Very true! It is yet another property execution and we are at the mercy of the implementation.

Charles Prakash Dasari 2009-07-14 16:59:08

Answer 6

+4 A:

An easy test to semi-validate. I did a small test, just to see. Here is the code:

    static void Main(string[] args)
    {
        List<int> intList = new List<int>();

        for (int i = 0; i < 10000000; i++)
        {
            intList.Add(i);
        }

        DateTime timeStarted = DateTime.Now;
        for (int i = 0; i < intList.Count; i++)
        {
            int foo = intList[i] * 2;
            if (foo % 2 == 0)
            {
            }
        }

        TimeSpan finished = DateTime.Now - timeStarted;

        Console.WriteLine(finished.TotalMilliseconds.ToString());
        Console.Read();

    }

And here is the foreach section:

        foreach (int i in intList)
        {
            int foo = i * 2;
            if (foo % 2 == 0)
            {
            }
        }

When I replaced the for with a foreach -- the foreach was 20 milliseconds faster -- consistently. The for was 135-139ms while the foreach was 113-119ms. I swapped back and forth several times, making sure it wasn't some process that just kicked in.

However, when I removed the foo and the if statement, the for was faster by 30 ms (foreach was 88ms and for was 59ms). They were both empty shells. I'm assuming the foreach actually passed a variable where as the for was just incrementing a variable. If I added

int foo = intList[i];

Then the for become slow by about 30ms. I'm assuming this had to do with it creating foo and grabbing the variable in the array and assigning it to foo. If you just access intList[i] then you don't have that penalty.

In all honesty.. I expected the foreach to be slightly slower in all circumstances, but not enough to matter in most applications.

edit: here is the new code using Jons suggestions (134217728 is the biggest int you can have before System.OutOfMemory exception gets thrown):

    static void Main(string[] args)
    {
        List<int> intList = new List<int>();

        Console.WriteLine("Generating data.");
        for (int i = 0; i < 134217728 ; i++)
        {
            intList.Add(i);
        }

        Console.Write("Calculating for loop:\t\t");

        Stopwatch time = new Stopwatch();
        time.Start();
        for (int i = 0; i < intList.Count; i++)
        {
            int foo = intList[i] * 2;
            if (foo % 2 == 0)
            {
            }
        }

        time.Stop();
        Console.WriteLine(time.ElapsedMilliseconds.ToString() + "ms");
        Console.Write("Calculating foreach loop:\t");
        time.Reset();
        time.Start();

        foreach (int i in intList)
        {
            int foo = i * 2;
            if (foo % 2 == 0)
            {
            }
        }

        time.Stop();

        Console.WriteLine(time.ElapsedMilliseconds.ToString() + "ms");
        Console.Read();
    }

And here are the results:

Generating data. Calculating for loop: 2458ms Calculating foreach loop: 2005ms

Swapping them around to see if it deals with the order of things yields the same results (nearly).

Nazadus 2009-07-14 12:01:38

It's better to use Stopwatch than DateTime.Now - and I wouldn't trust any run that fast, to be honest.

Jon Skeet 2009-07-14 13:36:19

Answer 7

+6 A:

Note: this answer applies more to Java than it does to C#, since C# doesn't have an indexer on LinkedLists, but I think the general point still holds.

If the list you're working with happens to be a LinkedList, the performance of the indexer-code (array-style accessing) is a lot worse than using the IEnumerator from the foreach, for large lists.

When you access element 10.000 in a LinkedList using the indexer syntax: list[10000], the linked list will start at the head node, and traverse the Next-pointer ten thousand times, until it reaches the correct object. Obviously, if you do this in a loop, you will get:

list[0]; // head
list[1]; // head.Next
list[2]; // head.Next.Next
// etc.

When you call GetEnumerator (implicitly using the forach-syntax), you'll get an IEnumerator object that has a pointer to the head node. Each time you call MoveNext, that pointer is moved to the next node, like so:

IEnumerator em = list.GetEnumerator();  // Current points at head
em.MoveNext(); // Update Current to .Next
em.MoveNext(); // Update Current to .Next
em.MoveNext(); // Update Current to .Next
// etc.

As you can see, in the case of LinkedLists, the array indexer method becomes slower and slower, the longer you loop (it has to go through the same head pointer over and over again). Whereas the IEnumerable just operates in constant time.

Of course, as Jon said this really depends on the type of list, if the list is not a LinkedList, but an array, the behavior is completely different.

Tom Lokhorst 2009-07-14 12:03:23

LinkedList in .NET doesn't have an indexer, so it's not actually an option.

Jon Skeet 2009-07-14 13:35:25

Oh, well that solves that problem, then :-) I'm just looking through the `LinkedList<T>` docs on MSDN, and it has a pretty decent API. Most importantly, it doesn't have a `get(int index)` method, like Java does. Still, I guess the point still holds for any other list-like data structure that exposes an indexer that's slower than a specific `IEnumerator`.

Tom Lokhorst 2009-07-14 13:53:46

Answer 8

+2 A:

I feel that the "right way" is less about speed and more about readability. Others have brought this up, but I'd like to phrase it a little differently: Good code is idiomatic code. Pythonistas are always going on about "Pythonic" code, and all languages have a similar idea.

Both Python and C# have the benefit of being developed by an authority, so there is a more or less definitive source of what the most idiomatic approach is in a given situation. (It doesn't matter that C# has an evil corporation and Python has a raging egomaniac; the end result is the same. ;) )

There is a certain amount of intuition that goes into determining whether a particular piece of code is idiomatic in that particular language or not. Even so, consensus is surprising common among those who speak the language fluently.

Unless performance is absolutely critical, always try to code in the vernacular. If it looks like classic C, you might be speaking Latin.

WCWedin 2009-07-14 13:46:43

Answer 9

+2 A:

A for loop gets compiled to code approximately equivalent to this:

        int tempCount = 0;
        while (tempCount < list.Count)
        {
            if (list[tempCount].value == value)
            {
                // Do something
            }
            tempCount++;
        }

Where as a foreach loop gets compiled to code approximately equivalent to this:

        using (IEnumerator<T> e = list.GetEnumerator())
        {
            while (e.MoveNext())
            {
                T o = (MyClass)e.Current;
                if (row.value == value)
                {
                    // Do something
                }
            }
        }

So as you can see it would all depend upon how the enumerator is implemented verses how the lists indexer is implemented. As it turns out the enumerator for types based on arrays are normally written something like this:

    private static IEnumerable<T> MyEnum(List<T> list)
    {
        for (int i = 0; i < list.Count; i++)
        {
            yield return list[i];
        }
    }

So as you can see, in this instance it won't make very much difference, however the enumerator for a linked list would probably look something like this:

    private static IEnumerable<T> MyEnum(LinkedList<T> list)
    {
        LinkedListNode<T> current = list.First;
        do
        {
            yield return current.Value;
            current = current.Next;
        }
        while (current != null);
    }

In .Net you will find that the LinkedList class does not even have an indexer so you wouldn't be able to do your for loop on a linked list, but if you could the indexer would have to be written like so:

    public T this[int index]
    {
           LinkedListNode<T> current = this.First;
           for (int i = 1; i <= index; i++)
           {
                current = current.Next;
           }
           return current.value;
    }

As you can see calling this multiple times in a loop is going to be much slower than using an enumerator that can remember where it is in the list.

Martin Brown 2009-07-14 15:38:27

ansaurus

tags:

views:

answers:

For vs Foreach loop in C#

related questions