I was curious to see what the performance difference is between returning a value from a method and returning it through an Action parameter.

There is a somewhat related question here: http://stackoverflow.com/questions/2082735/performance-of-calling-delegates-vs-methods

But for the life of me I can't explain why returning a value would be ~30% slower than calling a delegate to return the value. Is the .NET JIT compiler (not the C# compiler) inlining my simple delegate? I didn't think it did that.

using System;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        Stopwatch sw = new Stopwatch();
        sw.Start();

        A aa = new A();

        long l = 0;
        for( int i = 0; i < 100000000; i++ )
        {
            aa.DoSomething( i - 1, i, r => l += r );
        }

        sw.Stop();
        Trace.WriteLine( sw.ElapsedMilliseconds + " : " + l );

        sw.Reset();
        sw.Start();

        l = 0;
        for( int i = 0; i < 100000000; i++ )
        {
            l += aa.DoSomething2( i - 1, i );
        }

        sw.Stop();
        Trace.WriteLine( sw.ElapsedMilliseconds + " : " + l );
    }
}
class A
{
    private B bb = new B();

    public void DoSomething( int a, int b, Action<long> result )
    {
        bb.Add( a, b, result );
    }

    public long DoSomething2( int a, int b )
    {
        return bb.Add2( a, b );
    }
}
class B
{
    public void Add( int a, int b, Action<long> result )
    {
        result( a + b );
    }

    public long Add2( int i, int i1 )
    {
        return i + i1;
    }
}
A: 

Why not use Reflector to find out?

Kell
+1  A: 

Strangely, I'm not seeing the behavior you're describing when running a Release build in VS. I am seeing it when running a Debug build. The only thing I can figure is that there's added overhead with the return-based approach when running the Debug build, though I'm not clever enough to see why.

Here's something else that's interesting: this discrepancy disappears when I switch to an x64 build (Release or Debug).

If I were to venture a guess (completely unsubstantiated), it might be that the cost of passing the 64-bit long as a return value in both B.Add2 and A.DoSomething2 outweighs that of passing the Action<long> in a 32-bit environment. In a 64-bit environment, this savings would vanish as the Action<long> would require 64 bits as well. In a Release build in either configuration, the cost of passing the long probably disappears as both B.Add2 and A.DoSomething2 seem like prime candidates for inlining.

Somebody who knows way more about this than I do: feel free to totally refute everything I just said. We're all here to learn, after all ;)

Dan Tao
The reason that you get that result in the Debug build is actually not that the method call is slower than calling the delegate. The result is simply flawed because the overall overhead for loading and jitting is much higher in the Debug version. To fix this, you would have to add some warm-up code so that everything is already loaded and jitted *before* you start measuring.
0xA3
You could verify your guess by simply replacing `long` with `int`. What I'd rather assume, though, is that the 64-bit jitter is able to perform some optimization, such as inlining, which the 32-bit jitter does not apply. As far as I know, the 64-bit and 32-bit runtimes have been developed separately.
0xA3
@0xA3: Yeah, my hypothesis didn't seem to hold up, sadly (for me). Changing `long` to `int` did not reverse the strange discrepancy noted by the OP (nor did adding warm-up code, actually, as you can also see from @Brian's answer).
Dan Tao
The oddest thing that stuck out for me was that the traditional return method was only marginally faster with a release build when run under vshost.exe, but it is dramatically faster when run standalone. I wonder if vshost runs the CLR with a flag that disables method inlining and perhaps other optimizations?
Brian Gideon
As I mentioned above, warm-up code didn't make that large a difference, so I left it out.
headsling
+1  A: 

Well, for starters, your call to new A() is being timed the way you currently have your code set up. You also need to make sure you're running in release mode with optimizations on. And you need to take the JIT into account: prime all your code paths so you can guarantee they are compiled before you time them (unless you are concerned about start-up time).
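To make that concrete, here is a rough sketch of how the timing could be set up so that construction and JIT compilation fall outside the measured region. The names Bench, SumViaCallback, SumViaReturn and TimeMs are made up for illustration and are not from the question; A and B are copied from it. Treat this as one possible arrangement, not the definitive way to benchmark.

```csharp
using System;
using System.Diagnostics;

class B
{
    public void Add( int a, int b, Action<long> result ) { result( a + b ); }
    public long Add2( int i, int i1 ) { return i + i1; }
}

class A
{
    private B bb = new B();
    public void DoSomething( int a, int b, Action<long> result ) { bb.Add( a, b, result ); }
    public long DoSomething2( int a, int b ) { return bb.Add2( a, b ); }
}

// Hypothetical helper class, not part of the original question.
static class Bench
{
    // Delegate-based variant: the result comes back through the callback.
    public static long SumViaCallback( A a, int iterations )
    {
        long l = 0;
        for( int i = 0; i < iterations; i++ )
        {
            a.DoSomething( i - 1, i, r => l += r );
        }
        return l;
    }

    // Return-based variant: the result comes back as the return value.
    public static long SumViaReturn( A a, int iterations )
    {
        long l = 0;
        for( int i = 0; i < iterations; i++ )
        {
            l += a.DoSomething2( i - 1, i );
        }
        return l;
    }

    // Runs one untimed iteration first so the code path is already jitted,
    // then times the real run.
    public static long TimeMs( Func<A, int, long> run, A a, int iterations, out long result )
    {
        run( a, 1 );                          // warm-up, not measured
        Stopwatch sw = Stopwatch.StartNew();  // timing starts after the warm-up
        result = run( a, iterations );
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}

class Program
{
    static void Main()
    {
        A aa = new A();  // constructed before anything is timed
        long sum;

        long ms = Bench.TimeMs( Bench.SumViaCallback, aa, 100000000, out sum );
        Console.WriteLine( ms + " : " + sum );

        ms = Bench.TimeMs( Bench.SumViaReturn, aa, 100000000, out sum );
        Console.WriteLine( ms + " : " + sum );
    }
}
```

Both loops now live in their own methods, so each one is compiled during its warm-up call rather than inside the measurement.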

I also see an issue with trying to time a large number of primitive operations (the simple addition). In this case you can't draw any definitive conclusions, since any overhead will completely dominate your measurements.

edit: In release mode targeting .NET 3.5 in VS2008 I get:

1719 : 9999999800000000
1337 : 9999999800000000

Which seems to be consistent with many of the other answers. Using ILDasm gives the following IL for B.Add:

  IL_0000:  ldarg.3
  IL_0001:  ldarg.1
  IL_0002:  ldarg.2
  IL_0003:  add
  IL_0004:  conv.i8
  IL_0005:  callvirt   instance void class [mscorlib]System.Action`1<int64>::Invoke(!0)
  IL_000a:  ret

Where B.Add2 is:

  IL_0000:  ldarg.1
  IL_0001:  ldarg.2
  IL_0002:  add
  IL_0003:  conv.i8
  IL_0004:  ret

So it looks as though you're pretty much just timing a load and callvirt.

Ron Warholic
Thanks for the comments. Clearly release mode (which I didn't test) makes a difference in the results. I'm still confused as to why debug mode makes a difference in this case. My test times are not materially affected by warm-up (which I omitted for brevity) or by the new A(), given the order and the magnitude of the iterations.
headsling
+1  A: 

I made a couple of changes to your code.

  • Moved new A() before the timed section.
  • Added warmup code before the timed section to get the methods JIT'ed.
  • Created an Action<long> reference before the timed section and loop so that it does not have to be created on each iteration. This one seemed to have a big impact on execution time.
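The third change is the least obvious one, so here is a minimal sketch of what I mean. The class and method names (DelegateHoisting, PerIteration, Hoisted) are invented for this illustration; A and B are copied from the question. As far as I understand it, a lambda that captures a local variable is turned into a fresh delegate object every time the lambda expression is evaluated, so writing it inside the loop allocates once per iteration.

```csharp
using System;

class B
{
    public void Add( int a, int b, Action<long> result ) { result( a + b ); }
}

class A
{
    private B bb = new B();
    public void DoSomething( int a, int b, Action<long> result ) { bb.Add( a, b, result ); }
}

// Hypothetical names, used only for this illustration.
static class DelegateHoisting
{
    // Lambda written inside the loop, as in the question: because it
    // captures 'total', a delegate is created on each pass through the loop.
    public static long PerIteration( A a, int iterations )
    {
        long total = 0;
        for( int i = 0; i < iterations; i++ )
        {
            a.DoSomething( i - 1, i, r => total += r );
        }
        return total;
    }

    // Delegate created once before the loop and reused on every call.
    public static long Hoisted( A a, int iterations )
    {
        long total = 0;
        Action<long> accumulate = r => total += r;
        for( int i = 0; i < iterations; i++ )
        {
            a.DoSomething( i - 1, i, accumulate );
        }
        return total;
    }
}

class Program
{
    static void Main()
    {
        A aa = new A();
        // Both variants compute the same sum; only the allocation pattern differs.
        Console.WriteLine( DelegateHoisting.PerIteration( aa, 1000 ) );
        Console.WriteLine( DelegateHoisting.Hoisted( aa, 1000 ) );
    }
}
```

Whether the hoisted form is a fair comparison depends on what you want to measure: if the cost of creating the lambda is part of the question, leave it in the loop.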

Here are my results after making the above changes. The vshost column indicates whether the code was executing inside the vshost.exe process (by running directly from Visual Studio). I was using Visual Studio 2008 and targeted .NET 3.5 SP1.

vshost?   Debug   Release
-------------------------
 YES       6405     3827
          11059     3092

 NO        4214     1691
           4607      811

Notice how you get different results depending on the build configuration and the execution environment. The results are interesting if nothing else. If I get time I might edit my answer to provide a theory.

Brian Gideon
Warm-up code wasn't making much difference in my numbers, so I removed it for brevity. I specifically left the action lambdas in, as I wanted to see what the cost would be in both time and memory usage (very little difference!). I really have to remember to test outside of VS! Cheers
headsling
@headsling: The warmup code made no difference for me either. I sort of expected that. But, I did see a significant difference by lifting the action delegate outside of the loop. That now makes me wonder...could that be one of the optimizations performed anyway in a release build? Lifting instructions outside of loops is not new so it is reasonable.
Brian Gideon
Interesting. I agree that it seems likely that the release build might lift my action delegate out. I'll have a play with that.
headsling