ansaurus

Question

Why does the second for loop always execute faster than the first one?

Answer 1

+2 A:

The reason why is there are several forms of overhead in the foreach version that are not present in the for loop

Use of an IDisposable.
An additional method call for every element. Each element must be accessed under the hood by using IEnumerator<T>.Current which is a method call. Because it's on an interface it cannot be inlined. This means N method calls where N is the number of elements in the enumeration. The for loop just uses and indexer
In a foreach loop all calls go through an interface. In general this a bit slower than through a concrete type

Please note that the things I listed above are not necessarily huge costs. They are typically very small costs that can contribute to a small performance difference.

Also note, as Mehrdad pointed out, the compilers and JIT may choose to optimize a foreach loop for certain known data structures such as an array. The end result may just be a for loop.

Note: Your performance benchmark in general needs a bit more work to be accurate.

You should use a StopWatch instead of DateTime. It is much more accurate for performance benchmarks.
You should perform the test many times not just once
You need to do a dummy run on each loop to eliminate the problems that come with JITing a method the first time. This probably isn't an issue when all of the code is in the same method but it doesn't hurt.
You need to use more than just 4 values in the list. Try 40,000 instead.

JaredPar 2009-06-20 14:40:32

I'm sorry I wasn't clear. If I put the foreach loop second then it will execute faster then the for loop.

Bob Dylan 2009-06-20 14:42:01

@JaredPar: `foreach` loop for arrays are optimized by the compiler.

Mehrdad Afshari 2009-06-20 14:59:46

Another to add to the list is to change the thread priority to something higher to minimise the interference of other threads running on the same processor.

RichardOD 2009-06-20 15:01:59

@Mehrdad, I keep forgetting which versions and which structures they optimize the for loop for. But yes, in general they will.

JaredPar 2009-06-20 15:02:03

Answer 2

+7 A:

I would not use DateTime to measure performance - try the Stopwatch class.
Measuring with only 4 passes is never going to give you a good result. Better use > 100.000 passes (you can use an outer loop). Don't do Console.WriteLine in your loop.
Even better: use a profiler (like Redgate ANTS or maybe NProf)

tanascius 2009-06-20 14:42:57

Can you provide some links on how to use the stopwatch class?

Bob Dylan 2009-06-20 14:52:38

Stopwatch Class (via MSDN): http://msdn.microsoft.com/en-us/library/system.diagnostics.stopwatch.aspx

Robert Cartaino 2009-06-20 14:57:24

Thanks Robert ...

tanascius 2009-06-20 15:04:16

Answer 3

+13 A:

Probably because the classes (e.g. Console) need to be JIT-compiled the first time through. You'll get the best metrics by calling all methods (to JIT them (warm then up)) first, then performing the test.

As other users have indicated, 4 passes is never going to be enough to to show you the difference.

Incidentally, the difference in performance between for and foreach will be negligible and the readability benefits of using foreach almost always outweigh any marginal performance benefit.

Gaz 2009-06-20 14:45:51

Answer 4

+1 A:

You should be using the StopWatch to time the behavior.

Technically the for loop is faster. Foreach calls the MoveNext() method (creating a method stack and other overhead from a call) on the IEnumerable's iterator, when for only has to increment a variable.

statenjason 2009-06-20 14:46:43

Answer 5

+3 A:

I am not so much in C#, but when I remember right, Microsoft was building "Just in Time" compilers for Java. When they use the same or similar techniques in C#, it would be rather natural that "some constructs coming second perform faster".

For example it could be, that the JIT-System sees that a loop is executed and decides adhoc to compile the whole method. Hence when the second loop is reached, it is yet compiled and performs much faster than the first. But this is a rather simplistic guess of mine. Of course you need a far greater insight in the C# runtime system to understand what is going on. It could also be, that the RAM-Page is accessed first in the first loop and in the second it is still in the CPU-cache.

Addon: The other comment that was made: that the output module can be JITed a first time in the first loop seams to me more likely than my first guess. Modern languages are just very complex to find out what is done under the hood. Also this statement of mine fits into this guess:

But also you have terminal-outputs in your loops. They make things yet more difficult. It could also be, that it costs some time to open the terminal a first time in a program.

Juergen 2009-06-20 14:47:28

Answer 6

+4 A:

I was just performing tests to get some real numbers, but in the meantime Gaz beat me to the answer - the call to Console.Writeline is jitted at the first call, so you pay that cost in the first loop.

Just for information though - using a stopwatch rather than the datetime and measuring number of ticks:

Without a call to Console.Writeline before the first loop the times were

for: 16802
foreach: 2282

with a call to Console.Writeline they were

for: 2729
foreach: 2268

Though these results were not consistently repeatable because of the limited number of runs, but the magnitude of difference was always roughly the same.

The edited code for reference:

        int[] x = new int[] { 3, 6, 9, 12 };
        int[] y = new int[] { 3, 6, 9, 12 };

        Console.WriteLine("Hello World");

        Stopwatch sw = new Stopwatch();

        sw.Start();
        for (int i = 0; i < 4; i++)
        {
            Console.WriteLine(x[i]);
        }
        sw.Stop();
        long elapsedTime = sw.ElapsedTicks;

        sw.Reset();
        sw.Start();
        foreach (var item in y)
        {
            Console.WriteLine(item);
        }
        sw.Stop();
        long elapsedTime2 = sw.ElapsedTicks;

        Console.WriteLine("\nSummary");
        Console.WriteLine("--------------------------\n");
        Console.WriteLine("for:\t{0}\nforeach:\t{1}", elapsedTime, elapsedTime2);

        Console.ReadKey();

Martin Harris 2009-06-20 14:57:31

If I could vote 100 times i would. This is the answer. So, 2 problems - 1. using writeline in a section of code you are "profiling" 2. JIT is the culprit of the differences.

Tim 2009-06-20 15:05:03

@boat-programmer: Seconded!

Lucas McCoy 2009-06-20 15:08:51

Answer 7

+1 A:

I don't see why everyone here says that for would be faster than foreach in this particular case. For a List<T>, it is (about 2x slower to foreach through a List than to for through a List<T>).

In fact, the foreach will be slightly faster than the for here. Because foreach on an array essentially compiles to:

for(int i = 0; i < array.Length; i++) { }

Using .Length as a stop criteria allows the JIT to remove bounds checks on the array access, since it's a special case. Using i < 4 makes the JIT insert extra instructions to check each iteration whether or not i is out of bounds of the array, and throw an exception if that is the case. However, with .Length, it can guarantee you'll never go outside of the array bounds so the bounds checks are redundant, making it faster.

However, in most loops, the overhead of the loop is insignificant compared to the work done inside.

The discrepancy you're seeing can only be explained by the JIT I guess.

JulianR 2009-06-20 15:09:15

Answer 8

+1 A:

I wouldn't read too much into this - this isn't good profiling code for the following reasons
1. DateTime isn't meant for profiling. You should use QueryPerformanceCounter or StopWatch which use the CPU hardware profile counters
2. Console.WriteLine is a device method so there may be subtle effects such as buffering to take into account
3. Running one iteration of each code block will never give you accurate results because your CPU does a lot of funky on the fly optimisation such as out of order execution and instruction scheduling
4. Chances are the code that gets JITed for both code blocks is fairly similar so is likely to be in the instruction cache for the second code block

To get a better idea of timing, I did the following

Replaced the Console.WriteLine with a math expression ( e^num)
I used QueryPerformanceCounter/QueryPerformanceTimer through P/Invoke
I ran each code block 1 million times then averaged the results

When I did that I got the following results:

The for loop took 0.000676 milliseconds
The foreach loop took 0.000653 milliseconds

So foreach was very slightly faster but not by much

I then did some further experiments and ran the foreach block first and the for block second
When I did that I got the following results:

The foreach loop took 0.000702 milliseconds
The for loop took 0.000691 milliseconds

Finally I ran both loops together twice i.e for + foreach then for + foreach again
When I did that I got the following results:

The foreach loop took 0.00140 milliseconds
The for loop took 0.001385 milliseconds

So basically it looks to me that whatever code you run second, runs very slightly faster but not enough to be of any significance.
--Edit--
Here are a couple of useful links
How to time managed code using QueryPerformanceCounter
The instruction cache
Out of order execution

zebrabox 2009-06-20 15:20:13

ansaurus

tags:

views:

answers:

Why does the second for loop always execute faster than the first one?

related questions