public Int64 ReturnDifferenceA()
{
  User[] arrayList;
  Int64 firstTicks;
  IList<User> userList;
  Int64 secondTicks;
  System.Diagnostics.Stopwatch watch;

  // Build a list of 1000 users and copy it into an array.
  userList = Enumerable
              .Range(0, 1000)
              .Select(currentItem => new User()).ToList();

  arrayList = userList.ToArray();

  // First timed pass over the array.
  watch = new Stopwatch();
  watch.Start();

  for (Int32 loopCounter = 0; loopCounter < arrayList.Count(); loopCounter++)
  {
     DoThings(arrayList[loopCounter]);
  }

  watch.Stop();
  firstTicks = watch.ElapsedTicks;

  // Second timed pass over the same array, calling the same method.
  watch.Reset();
  watch.Start();
  for (Int32 loopCounter = 0; loopCounter < arrayList.Count(); loopCounter++)
  {
     DoThings(arrayList[loopCounter]);
  }
  watch.Stop();
  secondTicks = watch.ElapsedTicks;

  // A positive result means the first pass took longer than the second.
  return firstTicks - secondTicks;
}

As you can see, this is really simple: create a list of users, force it into an array, start a stopwatch, loop through the array calling a method, stop the watch. Repeat. Finish up by returning the difference between the first run and the second.

Now I'm calling it like this:

List<Int64> differenceList = Enumerable
                             .Range(0, 50)
                             .Select(currentItem => ReturnDifferenceA()).ToList();
Double average = differenceList.Average();

List<Int64> differenceListA = Enumerable
                              .Range(0, 50)
                              .Select(currentItem => ReturnDifferenceA()).ToList();
Double averageA = differenceListA.Average();

List<Int64> differenceListB = Enumerable
                              .Range(0, 50)
                              .Select(currentItem => ReturnDifferenceA()).ToList();
Double averageB = differenceListB.Average();

Now the fun part is that all averages are positive by a relatively large amount, ranging from 150k to 300k ticks.

What I don't get is that I am going through the same list, the same way, with the same method and yet there is such a difference. Is there some kind of caching going on?

Another interesting thing is that if I iterate through the array BEFORE the first stopwatch section, the averages are around 5k or so.
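
For reference, the warm-up variant I mean just adds one untimed pass right after arrayList is built, before the first stopwatch section, roughly like this:

  // Untimed warm-up pass: after this, DoThings (and the loop itself) has
  // already been run once before any timing starts.
  for (Int32 loopCounter = 0; loopCounter < arrayList.Count(); loopCounter++)
  {
     DoThings(arrayList[loopCounter]);
  }

  // ...then the two timed sections follow exactly as shown above.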

+4  A: 

You are running in a high-level language whose runtime environment does a lot of caching and performance optimization; this is common. Sometimes it is called warming up the virtual machine, or warming up the server (when it is a production application).

If something is going to be done repeatedly, you will frequently notice that the first run has a larger measured runtime, while the rest level off to a smaller figure.

I see this in MATLAB code: the first time I run a benchmark loop it takes five seconds, and subsequent runs take a fifth of a second. It's a huge difference, because it is an interpreted language that requires some form of compilation, but in practice it does not hurt your performance, since the great majority of calls in any production application will be 'second times'.

Karl
+3  A: 

It's quite possible that DoThings() is not JIT-compiled to native code until the first time it is called.
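
One way to take JIT compilation out of the picture is to force DoThings to be compiled before any timing starts. A rough sketch (the class name below is just a stand-in for whatever type actually declares DoThings):

// (needs System.Reflection and System.Runtime.CompilerServices)
// Pre-JIT DoThings so the first timed call pays no compilation cost.
MethodInfo doThings = typeof(BenchmarkClass).GetMethod(
    "DoThings",
    BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);

RuntimeHelpers.PrepareMethod(doThings.MethodHandle);  // compile the IL to native code now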

James Curran
That is a likely scenario. If DoThings isn't JITed until the loop processing starts, the first time through you will take a slight perf hit for the JIT compilation. After that, each time through the loop you won't have that hit.
Scott Dorman
It's also possible that the compiler (not the JITer) has optimized the code since your two loops are exactly the same. Another scenario is that the runtime (and the JIT compiler) have optimized the JITed code.
Scott Dorman
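
If you want to rule that out when re-testing, one option is to tell the JIT not to inline or optimize DoThings at all. A sketch, to be applied to whatever DoThings' real signature is:

// (needs System.Runtime.CompilerServices)
// Keep the JIT from inlining or optimizing DoThings, so both timed loops
// pay the same per-call cost.
[MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.NoOptimization)]
private void DoThings(User currentUser)
{
   // same body as before
}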
+3  A: 

By the way, using IEnumerable.Count() on an array is hundreds of times slower than Array.Length... although this doesn't answer the question at all.

Jimmy
It *would be* if Enumerable.Count() always counted each item, but it's smart enough to first try casting the IEnumerable to ICollection and, if that works, use the Count property.
James Curran
Actually, he's right if I understand what you're saying. There is a huge difference between for (Int32 loopCounter = 0; loopCounter < arrayList.Length; loopCounter++) and for (Int32 loopCounter = 0; loopCounter < arrayList.Count(); loopCounter++). Just tested it with Stopwatch.
Programmin Tool
@James: I'm aware of the ICollection short path, but it doesn't seem to work; I tested the code as well. I wonder if this is a bug?
Jimmy
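
For anyone who wants to reproduce the comparison, a rough sketch of the test (User is the same placeholder class as in the question; needs System, System.Linq and System.Diagnostics):

User[] users = Enumerable.Range(0, 1000).Select(currentItem => new User()).ToArray();

Stopwatch watch = Stopwatch.StartNew();
for (Int32 loopCounter = 0; loopCounter < users.Length; loopCounter++) { }   // plain length read
Int64 lengthTicks = watch.ElapsedTicks;

watch.Reset();
watch.Start();
for (Int32 loopCounter = 0; loopCounter < users.Count(); loopCounter++) { }  // extension-method call every iteration
Int64 countTicks = watch.ElapsedTicks;

Console.WriteLine("Length: {0} ticks, Count(): {1} ticks", lengthTicks, countTicks);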
A: 

You say you don't get it, but doing it 3 times, the 2nd and 3rd times are relatively close. It seems to me that it's only the first time through the loop that things are slow.

GeekyMonkey
A: 

I'd suspect that the function you call is not Just-In-Time compiled until the first run. What you can try is to run it once, then stop it and run it again. With no code changes, the Just-In-Time compilation from the previous run should still be OK, and any remaining optimizations you see are the actual caching effects at work.

GWLlosa
A: 

Because .NET, like the Java platform, is a JIT environment: all high-level .NET code is compiled to Microsoft's intermediate language (MSIL) bytecode.

To run your programme, this bytecode needs to be compiled (translated) to native machine code. However, compiled .NET program files are stored not as native machine code but as that intermediate virtual-machine bytecode.

The first run is JIT-compiled, so it takes extra time. Subsequent runs no longer need to be JIT-compiled; the native code is drawn from the JIT cache, so they are faster.
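
A minimal way to see just that cost in isolation is to time two individual calls instead of whole loops (assuming the same DoThings and User as in the question):

// First call: the IL is compiled to native code, then executed.
Stopwatch watch = Stopwatch.StartNew();
DoThings(new User());
Int64 firstCallTicks = watch.ElapsedTicks;

// Second call: the cached native code runs straight away.
watch.Reset();
watch.Start();
DoThings(new User());
Int64 secondCallTicks = watch.ElapsedTicks;   // typically far smaller than firstCallTicks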

Did you keep your application running between runs without terminating it? Then the second reason is also a 'VM', only this time virtual memory rather than the virtual machine. All modern general-purpose operating systems run their processes on virtual memory, a mapping over real memory that lets the operating system manage and optimise the use of system resources. Less-used processes are frequently paged out to disk to let other processes make optimum use of resources.

Your process was not in memory the first time, so it had to suffer the overhead of being brought in. Because your process was subsequently near the top of the most-recently-used list (that is, at the bottom of the least-recently-used list), it had not yet been paged back out to disk.

Also, resources are doled out by the OS to your process as needed, so in the first round your process had to go through the pains of contending with the OS to expand its resource boundaries.

A virtual machine allows .NET and Java to abstract most programming features into a machine-independent layer, leaving a smaller, segregated mess for the machine-dependent software engineers to deal with. Even though Microsoft Windows runs on fairly uniform x86-descendant hardware, there are enough differences between OS versions and CPU models to warrant an abstracted virtual machine that gives .NET programmers and users a consistent view.

Blessed Geek
A: 

Leave aside the matter of warming up the VM or machine, of caching, of JIT optimizations, for a moment: what else is your computer doing? Are any of the 3e42 system services and task tray thingies grabbing some CPU? Maybe your Steam client decided to check for updates, or IE needed to do something frightfully important, or your antivirus program got in the way?

Your test is only as useful as the degree to which you can isolate it from all the other software running on your box. Turn off every bit of software you possibly can before attempting to measure a run.
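
Beyond shutting things down, you can also ask the OS to leave your benchmark alone for a while. A sketch (the priority and affinity values are illustrative, not a recommendation for production code; needs System.Diagnostics and System.Threading):

Process current = Process.GetCurrentProcess();
current.PriorityClass = ProcessPriorityClass.High;        // fewer pre-emptions by background work
current.ProcessorAffinity = (IntPtr)1;                    // pin the process to CPU 0
Thread.CurrentThread.Priority = ThreadPriority.Highest;   // keep the measuring thread scheduled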

But then, what do I know? Maybe your measurement method is managed by the .NET (or whatever) runtime too, and accounts only for runtime 'virtual cycles'.