I have a benchmarking application that tests the performance of some APIs I have written. It uses QueryPerformanceCounter: I take the QPC value before and after calling into the API and divide the difference by the QPC frequency. However, the benchmarking results seem to vary if I run the application (the same executable against the same set of DLLs) from different drives. Also, on a particular drive, running the application for the first time, closing it, and re-running it produces different benchmarking results. Can anyone explain this behavior? Am I missing something here?
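To make the methodology concrete, here is a minimal sketch of the timing approach, assuming the API call can be wrapped in an Action delegate (the QpcTimer helper name is just for illustration, not the actual harness):

    using System;
    using System.Runtime.InteropServices;

    static class QpcTimer
    {
        // Win32 high-resolution counter, via P/Invoke.
        [DllImport("kernel32.dll")]
        static extern bool QueryPerformanceCounter(out long value);

        [DllImport("kernel32.dll")]
        static extern bool QueryPerformanceFrequency(out long value);

        // Elapsed seconds = (end - start) / frequency.
        public static double TimeSeconds(Action apiCall)
        {
            QueryPerformanceFrequency(out long frequency);
            QueryPerformanceCounter(out long start);
            apiCall();
            QueryPerformanceCounter(out long end);
            return (end - start) / (double)frequency;
        }
    }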

Some more useful information:

The behavior is this: run the application, close it, and run it again, and the benchmarking results seem to improve on the second run and remain the same thereafter. This behavior is more prominent when running from the C drive. I should also mention that my benchmark app has an option to rerun/retest a particular API without closing the app. I understand that JIT compilation is involved, but what I don't understand is this: on the first run of the app, when you rerun an API multiple times without closing the app, the performance stabilizes after a couple of runs; yet when you close the app and rerun the same test, the performance improves further.

Also, how do you account for the performance change when run from different drives?

[INFORMATION UPDATE]

I ran NGen, and now the performance difference between runs from the same location is gone. That is, if I open the benchmark app, run it once, close it, and rerun it from the same location, it shows the same values.
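For anyone following along, the NGen step is just a one-off command run from a Visual Studio / .NET Framework command prompt, which installs native images for the executable and its dependencies (the executable name below is a placeholder):

    ngen install MyBenchmark.exe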

But I have now encountered another problem. When I launch the app from the D drive and run a couple of iterations of the APIs within the same launch of the benchmark program, from the third iteration onwards the performance of all APIs falls by around 20%. If I close and relaunch the app, the first two iterations again give correct values (the same values obtained from the C drive), and then performance falls off again. This behavior is not seen when running from the C drive: no matter how many runs I take there, the results are consistent.

I am using large double arrays to test my API performance. I was worried that the GC would kick in between tests, so I call GC.Collect() and GC.WaitForPendingFinalizers() explicitly before and after each test. So I don't think this has anything to do with the GC.
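A sketch of the bracketing I mean, assuming each API test is wrapped in a delegate (Stopwatch wraps QueryPerformanceCounter on Windows, so it is equivalent to the QPC arithmetic above; the TestRunner and QuiesceGC names are purely illustrative):

    using System;
    using System.Diagnostics;

    static class TestRunner
    {
        // Push the heap into a quiet state so a collection is unlikely
        // to land in the middle of the timed region.
        static void QuiesceGC()
        {
            GC.Collect();
            GC.WaitForPendingFinalizers();
            GC.Collect();
        }

        public static double RunTestSeconds(Action apiTest)
        {
            QuiesceGC();
            var sw = Stopwatch.StartNew();
            apiTest();
            sw.Stop();
            QuiesceGC();
            return sw.Elapsed.TotalSeconds;
        }
    }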

I tried using AQtime to find out what happens from the third iteration onwards, but oddly, when I run the application under the AQtime profiler, the performance does not fall at all.

The performance counters do not suggest any unusual I/O activity either.

Thanks, Niranjan

+3  A: 

Yes. It's called Just-In-Time (JIT) compilation. Basically, your app is deployed as MSIL (Microsoft Intermediate Language), and the first time it is run, that MSIL gets compiled to native code.

You can always run NGen (see the article above), or include a warm-up period in your performance-testing scripts that runs through the scenario a couple of times before you actually benchmark.
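For example, a rough sketch of such a warm-up-then-measure loop (the Measure helper, the runScenario delegate, and the iteration counts are purely illustrative):

    using System;
    using System.Diagnostics;

    static class Benchmark
    {
        public static double MeasureSecondsPerRun(Action runScenario, int warmupRuns = 3, int measuredRuns = 10)
        {
            // Warm-up: lets the JIT compile the code paths and pulls the
            // relevant pages into the disk/CPU caches before timing starts.
            for (int i = 0; i < warmupRuns; i++)
                runScenario();

            var sw = Stopwatch.StartNew();
            for (int i = 0; i < measuredRuns; i++)
                runScenario();
            sw.Stop();

            // Average seconds per run over the measured iterations.
            return sw.Elapsed.TotalSeconds / measuredRuns;
        }
    }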

Cory Foy
Really? Does it actually store the JIT'ed code between runs? From that web page, it doesn't seem so.
paxdiablo
I would think it's more likely to be that the DLLs are already loaded into memory.
paxdiablo
I would like to mention here that my benchmark app has an option to rerun/retest a particular API without having to close the app.
Niranjan U
I do understand about the JIT part, but what I don't understand is that on the 1st run of the app, when you rerun an API multiple times without closing the app, the performance stabilizes after a couple of runs; then when you close and rerun the same test, the performance seems to improve.
Niranjan U
Also, how do you account for the performance change when run from different drives?
Niranjan U
+4  A: 

Running an application brings its executable and other files from the hard drive into the OS's disk cache (in RAM). If it is run again soon afterwards, many of these files are likely to still be in cache. RAM is much faster than disk.

And of course one disk may be faster than another.

Crashworks
+1  A: 

I think there are a combination of effects here:

Firstly, running the same function within the test harness multiple times, with the same data each time, will likely improve performance because:

  • JIT compilation will optimise the code that is run most frequently to improve performance (as mentioned already by Cory Foy)
  • The program code will be in the disk cache (as mentioned already by Crashworks)
  • Some program code will be in the CPU cache if it is small enough and executed frequently enough

If the data is different for each run of the function within the test harness, this could explain why closing and running the test harness again improves results: the data will now also be in the disk cache, where it wasn't the first time.

And finally, yes, even if two 'drives' are on the same physical disk, they will have different performance: data can be read faster from the outside of the disk platter than the inside. If they are different physical disks, then the performance difference would seem quite likely. Also, one disk may be more fragmented than the other, causing longer seek times and slower data transfer rates.

Mike Houston
+1  A: 

Also, other factors are probably coming into play: filesystem caching on the machine, buffering of recently used data, and so on.

It's best to run several tests (or several hundred!) and average across the set, unless you're specifically measuring cold-boot times.

Nathanator