views: 467
answers: 8

Quite often on SO I find myself benchmarking small chunks of code to see which implementation is fastest.

Quite often I see comments that benchmarking code does not take into account jitting or the garbage collector.

I have the following simple benchmarking function which I have slowly evolved:

    static void Profile(string description, int iterations, Action func) {
        // warm up
        func();
        // clean up
        GC.Collect();

        var watch = new Stopwatch();
        watch.Start();
        for (int i = 0; i < iterations; i++) {
            func();
        }
        watch.Stop();
        Console.Write(description);
        Console.WriteLine(" Time Elapsed {0} ms", watch.ElapsedMilliseconds);
    }

Usage:

Profile("a description", how_many_iterations_to_run, () =>
{
   // ... code being profiled
});

Does this implementation have any flaws? Is it good enough to show that implementation X is faster than implementation Y over Z iterations? Can you think of any ways you would improve this?

EDIT: It's pretty clear that a time-based approach (as opposed to a fixed number of iterations) is preferred. Does anyone have any implementations where the time checks do not impact performance?
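One possible shape for such a time-based profiler is sketched below (the names `TimedProfiler` and `batchSize` are illustrative, not from any answer here). It reads the clock only once per batch of calls, so the Stopwatch checks are amortized over many iterations and barely affect the measured loop:

```csharp
using System;
using System.Diagnostics;

static class TimedProfiler
{
    // Sketch of a time-based profiler: runs func repeatedly until roughly
    // targetMilliseconds have elapsed, checking the clock once per batch.
    public static void Profile(string description, int targetMilliseconds, Action func)
    {
        // clean up and warm up, as in the original implementation
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        func();

        const int batchSize = 1000;
        long iterations = 0;
        var watch = Stopwatch.StartNew();
        while (watch.ElapsedMilliseconds < targetMilliseconds)
        {
            for (int i = 0; i < batchSize; i++)
                func();
            iterations += batchSize;
        }
        watch.Stop();

        double seconds = watch.Elapsed.TotalSeconds;
        Console.WriteLine("{0}: {1} iterations in {2:F1} ms ({3:F0} ops/sec)",
            description, iterations, seconds * 1000,
            seconds > 0 ? iterations / seconds : 0);
    }
}
```

The batch size trades clock-check overhead against overshoot past the target time; 1000 is an arbitrary starting point.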

+4  A: 

If you want to take GC interactions out of the equation, you may want to run your 'warm up' call after the GC.Collect call, not before. That way you know .NET will already have enough memory allocated from the OS for the working set of your function.

Keep in mind that you're making a non-inlined method call for each iteration, so make sure you compare the things you're testing to an empty body. You'll also have to accept that you can only reliably time things that are several times longer than a method call.
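The empty-body comparison could look like the following sketch (`BaselineProfiler` and `TimeIt` are my names, not from the answer): time an empty delegate over the same iteration count, then subtract that baseline from the measurement of the real work.

```csharp
using System;
using System.Diagnostics;

static class BaselineProfiler
{
    // Times `iterations` calls of func and returns elapsed Stopwatch ticks.
    public static long TimeIt(int iterations, Action func)
    {
        func(); // warm up / JIT
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) func();
        watch.Stop();
        return watch.ElapsedTicks;
    }
}

// Usage: subtract the empty-delegate baseline from the real measurement.
// long baseline = BaselineProfiler.TimeIt(1000000, () => { });
// long measured = BaselineProfiler.TimeIt(1000000, () => Math.Sqrt(2.0));
// Console.WriteLine("net ticks: {0}", Math.Max(0, measured - baseline));
```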

Also, depending on what kind of stuff you're profiling, you may want to base your timing on running for a certain amount of time rather than for a fixed number of iterations -- that tends to lead to more easily comparable numbers without needing a very short run for the best implementation and/or a very long one for the worst.

Jonathan
good points, would you have a time based implementation in mind?
Sam Saffron
+2  A: 

I'd avoid passing the delegate at all:

  1. A delegate call is roughly as expensive as a virtual method call. Not cheap: ~25% of the cost of the smallest memory allocation in .NET. If you're interested in details, see e.g. this link.
  2. Anonymous delegates may lead to usage of closures, which you won't even notice. Again, accessing closure fields is noticeably slower than e.g. accessing a variable on the stack.

An example code leading to closure usage:

public void Test()
{
  int someNumber = 1;
  Profiler.Profile("Closure access", 1000000, 
    () => someNumber + someNumber);
}

If you're not aware of closures, take a look at this method in .NET Reflector.

Alex Yakunin
Interesting points, but how would you create a re-usable Profile() method if you don't pass a delegate? Are there other ways to pass arbitrary code to a method?
Ash
We use "using (new Measurement(...)) { ... measured code ... }". So we get Measurement object implementing IDisposable instead of passing the delegate. See http://code.google.com/p/dataobjectsdotnet/source/browse/Xtensive.Core/Xtensive.Core/Diagnostics/Measurement.cs
Alex Yakunin
This won't lead to any issues with closures.
Alex Yakunin
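A minimal sketch of the `using`-based pattern Alex describes follows; the real `Measurement` class in DataObjects.Net (linked above) is more elaborate, and this just shows the shape. The measured code runs inline inside the `using` block, so no delegate or closure is involved:

```csharp
using System;
using System.Diagnostics;

// Simplified stand-in for the Measurement class: starts timing on
// construction and reports elapsed time when disposed.
public sealed class Measurement : IDisposable
{
    private readonly string description;
    private readonly Stopwatch watch;

    public Measurement(string description)
    {
        this.description = description;
        watch = Stopwatch.StartNew();
    }

    public void Dispose()
    {
        watch.Stop();
        Console.WriteLine("{0}: {1} ms", description, watch.ElapsedMilliseconds);
    }
}

// Usage:
// using (new Measurement("Sum loop")) {
//     long sum = 0;
//     for (int i = 0; i < 1000000; i++) sum += i;
// }
```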
A: 

Skeet's got a small benchmarking framework that is pretty good. Might not hurt to check that out.

JP Alioto
Just checked it out: it suffers from all the issues listed above.
Alex Yakunin
A: 

You must also run a "warm up" pass prior to actual measurement to exclude the time the JIT compiler spends jitting your code.

Alex Yakunin
it is performed prior to measurement
Sam Saffron
+1  A: 

I think the most difficult problem to overcome with benchmarking methods like this is accounting for edge cases and the unexpected. For example - "How do the two code snippets work under high CPU load/network usage/disk thrashing/etc." They're great for basic logic checks to see if a particular algorithm works significantly faster than another. But to properly test most code performance you'd have to create a test that measures the specific bottlenecks of that particular code.

I'd still say that testing small blocks of code often has little return on investment and can encourage using overly complex code instead of simple maintainable code. Writing clear code that other developers, or myself 6 months down the line, can understand quickly will have more performance benefits than highly optimized code.

Paul Alexander
"significant" is one of those terms that is really loaded. Sometimes having an implementation that is 20% faster is significant; sometimes it has to be 100 times faster to be significant. Agree with you on clarity; see: http://stackoverflow.com/questions/1018407/what-is-the-most-elegant-way-to-get-a-set-of-items-by-index-from-a-collection/1025137#1025137
Sam Saffron
+1  A: 

Finalisation won't necessarily be completed before GC.Collect returns. The finalisation is queued and then run on a separate thread. This thread could still be active during your tests, affecting the results.

If you want to ensure that finalisation has completed before starting your tests then you might want to call GC.WaitForPendingFinalizers, which will block until the finalisation queue is cleared:

GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
LukeH
+1 good point !
Sam Saffron
+3  A: 

Here is the modified function, as recommended by the community. Feel free to amend it; it's a community wiki.

static void Profile(string description, int iterations, Action func) {

    // clean up
    GC.Collect();
    GC.WaitForPendingFinalizers();
    GC.Collect();

    // warm up 
    func();

    var watch = Stopwatch.StartNew();
    for (int i = 0; i < iterations; i++) {
        func();
    }
    watch.Stop();
    Console.Write(description);
    Console.WriteLine(" Time Elapsed {0} ms", watch.ElapsedMilliseconds);
}
Sam Saffron
You might want to unroll the loop by some number of times, like 10, to minimize the loop overhead.
Mike Dunlavey
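Mike's suggestion could look like the sketch below (a variant of the community-wiki function above, with my name `UnrolledProfiler`): each pass through the loop calls `func()` ten times, so the counter increment, comparison, and branch are amortized over ten calls.

```csharp
using System;
using System.Diagnostics;

static class UnrolledProfiler
{
    // Unrolled variant: iterations should be a multiple of 10.
    public static void Profile(string description, int iterations, Action func)
    {
        // clean up
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        // warm up
        func();

        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations / 10; i++)
        {
            func(); func(); func(); func(); func();
            func(); func(); func(); func(); func();
        }
        watch.Stop();
        Console.Write(description);
        Console.WriteLine(" Time Elapsed {0} ms", watch.ElapsedMilliseconds);
    }
}
```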
I just updated to use Stopwatch.StartNew. Not a functional change, but saves one line of code.
LukeH
@Luke, great change (I wish I could +1 it). @Mike I'm not sure; I suspect the virtual-call overhead will be much higher than the comparison and assignment, so the performance difference will be negligible.
Sam Saffron
I'd propose passing the iteration count to the Action and creating the loop there (possibly even unrolled). If you're measuring a relatively short operation, this is the only option. And I'd prefer seeing an inverse metric - e.g. count of passes/sec.
Alex Yakunin
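Alex's suggestion might be sketched like this (`InnerLoopProfiler` is my name for it): the caller owns the loop via an `Action<int>`, so only one delegate call happens per measurement rather than one per iteration, and the result is reported as the inverse metric, passes/sec.

```csharp
using System;
using System.Diagnostics;

static class InnerLoopProfiler
{
    // The body receives the iteration count and runs its own loop,
    // so short operations aren't dwarfed by per-iteration delegate calls.
    public static void Profile(string description, int iterations, Action<int> body)
    {
        body(1); // warm up / JIT
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        var watch = Stopwatch.StartNew();
        body(iterations); // single delegate call; the loop lives inside
        watch.Stop();

        double seconds = watch.Elapsed.TotalSeconds;
        Console.WriteLine("{0}: {1:F0} passes/sec", description,
            seconds > 0 ? iterations / seconds : 0);
    }
}

// Usage: the measured code runs its own (possibly unrolled) loop.
// InnerLoopProfiler.Profile("increment", 10000000, n =>
// {
//     int x = 0;
//     for (int i = 0; i < n; i++) x++;
// });
```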
+1  A: 

I'd call func() several times for the warm-up, not just one.

Alexey Romanov
The intention was to ensure JIT compilation is performed; what advantage do you get from calling func multiple times prior to measurement?
Sam Saffron
To give the JIT a chance to improve its first results.
Alexey Romanov