ansaurus

Question

Is MulticastDelegate.CombineImpl inefficient?

Answer 1

+1 A:

"...which seems extremely inefficient for such a fundamental performance critical feature"

Just how often do you think delegates are attached to events (or combined at other times)?

For example, in a Windows Forms app this is likely to happen pretty rarely - basically when setting up a form, for the most part... at which point there are far heavier things going on than what's in MulticastDelegate.CombineImpl.

What does happen very often is that delegates are invoked... for example, for every item in every projection or predicate (etc) in a LINQ query. That's the really performance critical bit, IMO.

I'm also not convinced that this code is as inefficient as you think it is. It takes the same approach as ArrayList: it creates a larger array than is currently needed and fills it as required. Would a linked list be more efficient? Possibly in some respects, but it would also be less efficient in terms of locality and levels of indirection. (Each node would need to be a new object which itself contained a reference to the delegate, so navigating the list could end up bringing more pages into memory than an array of references would.)
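To make the "larger array than needed" idea concrete, here is a minimal sketch of an over-allocating array. GrowableList is a hypothetical type invented for illustration, and the doubling growth factor is an assumption; this is not the code inside CombineImpl or ArrayList.

```csharp
using System;

// Hypothetical illustration only: not the framework's implementation.
sealed class GrowableList
{
    private object[] items = new object[1];
    private int count;

    public int Count { get { return count; } }
    public int Capacity { get { return items.Length; } }

    public void Add(object item)
    {
        if (count == items.Length)
        {
            // Assumed policy: double the capacity so that most Add calls
            // just write into spare space without allocating.
            object[] bigger = new object[items.Length * 2];
            Array.Copy(items, bigger, count);
            items = bigger;
        }
        items[count++] = item;
    }
}

static class GrowableListDemo
{
    static void Main()
    {
        GrowableList list = new GrowableList();
        for (int i = 0; i < 5; i++)
        {
            list.Add(i);
        }
        Console.WriteLine(list.Count);    // 5
        Console.WriteLine(list.Capacity); // 8
    }
}
```

All the references sit next to each other in one array, whereas a linked list would need a separate node object per entry.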

EDIT: Just as a quick microbenchmark (with all the usual caveats) here's some code to perform a given number of iterations of combining a given number of delegates:

using System;
using System.Diagnostics;

class Test
{
    const int Iterations = 10000000;
    const int Combinations = 3;

    static void Main()
    {
        // Make sure all paths are JITted
        Stopwatch sw = Stopwatch.StartNew();
        sw.Stop();
        Action tmp = null;
        for (int j = 0; j < Combinations; j++)
        {
            tmp += Foo;
        }

        sw = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            Action action = null;
            for (int j = 0; j < Combinations; j++)
            {
                action += Foo;
            }
        }
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);
    }

    static void Foo()
    {
    }
}

Some results on my machine, all with 10,000,000 iterations:

5 delegates: about 5.8 seconds
4 delegates: about 4.3 seconds
3 delegates: about 3.2 seconds
2 delegates: about 1.4 seconds
1 delegate: about 160ms

(All tests run multiple times; the above are just samples which seemed reasonably representative. I haven't taken the average or anything.)

Given the above results, I suspect that any paths, even in combination-heavy WPF, which only attach a single delegate will be blazingly fast. They'll slow down significantly going from 1 to 2 delegates, then degrade gradually (with a much smaller proportional difference than in the 1-to-2 case).

Jon Skeet 2010-10-24 15:59:59

Great answer Jon. When looking at technologies like WPF data binding, I'm unconvinced that event attach/detach is something which happens *rarely*. If you profile a grid, you'll see WPF data binding goes ballistic with attaching to every object down every property path... a common scenario where the attach/detach is more performance critical than the actual invoke.

Mark 2010-10-25 06:50:08

@Mark: Your comment wasn't exactly clear to me in terms of what you mean by "every property path" - could you give more details? For example, in a 20x30 grid, how many attach/detach occurrences will occur - and will they be one-off, or happening continuously over program execution? Have you timed how long it takes to execute that many combine/remove operations? (I'm just going to time some now, and will edit my answer with the results.)

Jon Skeet 2010-10-25 06:54:55

Mark 2010-10-25 17:36:22

Answer 2

+1 A:

It's the exact opposite: this code was optimized by not using List<> or a similar collection object. Everything that List does is inlined here. An added advantage is that the locking is cheap (TrySetSlot uses Interlocked.CompareExchange), which saves the cost of dragging around a locking object. Explicitly inlining code like this rather than leaving it up to the JIT compiler isn't that common in the .NET framework, but exceptions are made for low-level primitives like this.
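As a rough illustration of why a compare-and-swap can stand in for a lock here, the following sketch claims an empty array slot with Interlocked.CompareExchange. TryClaimSlot is a hypothetical helper written for this example; it is not the framework's TrySetSlot, whose exact logic is not reproduced here.

```csharp
using System;
using System.Threading;

static class SlotDemo
{
    // Hypothetical helper: atomically stores value in slots[index] only if
    // the slot is currently null. Returns true if this call claimed the slot.
    public static bool TryClaimSlot(object[] slots, int index, object value)
    {
        // CompareExchange returns the value that was in the slot before the
        // call, so a null result means our write went through.
        return Interlocked.CompareExchange(ref slots[index], value, null) == null;
    }

    static void Main()
    {
        object[] slots = new object[4];
        Console.WriteLine(TryClaimSlot(slots, 0, "first"));  // True
        Console.WriteLine(TryClaimSlot(slots, 0, "second")); // False
        Console.WriteLine(slots[0]);                         // first
    }
}
```

No separate lock object is needed: if two threads race for the same slot, exactly one of them sees null come back and wins.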

Hans Passant 2010-10-24 17:27:40

Good feedback - thanks. Do you know why the internals are handled using arrays instead of linked lists? Given how frequently technologies like WPF data grids attach and detach events, I'd have thought that a linked list would be more performance friendly, no?

Mark 2010-10-25 06:45:56

@Mark - Linked lists perform *very* poorly on modern CPU cores due to their poor cache locality.

Hans Passant 2010-10-25 08:10:03

@Hans: +10000 (if I could) for that comment... I had myopically overlooked the fact that CPU cache and locality play a part in performance. Please strike my short-sightedness from the record :)

Mark 2010-10-25 17:30:47

@Hans: Jon's little benchmark shows an interesting jump as more delegates are added to an event... combining 5 delegates takes about 36 times longer than attaching just 1. Would linked lists have performed worse than this?

Mark 2010-10-25 17:39:03

@Mark - no, it is roughly linear. The case of *one* target is treated specially. Check the _methodPtr and _target fields in the Delegate class, the base class for MulticastDelegate: they store the single target, so _invocationList isn't used. Yet another optimization.
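The private fields mentioned above are implementation details and can't be observed directly without reflection, but the public API shows the single-target and multi-target cases side by side. This is a small sketch using only GetInvocationList; the class and method names are made up for the example.

```csharp
using System;

static class InvocationListDemo
{
    static void Foo() { }

    // Combines Foo with itself the given number of times and reports how
    // many targets the resulting delegate holds.
    public static int TargetCount(int combinations)
    {
        Action action = null;
        for (int i = 0; i < combinations; i++)
        {
            action += Foo;
        }
        return action == null ? 0 : action.GetInvocationList().Length;
    }

    static void Main()
    {
        Action single = Foo;
        Action combined = single;
        combined += Foo; // creates a new delegate; 'single' is unchanged

        Console.WriteLine(single.GetInvocationList().Length);   // 1
        Console.WriteLine(combined.GetInvocationList().Length); // 2
        Console.WriteLine(ReferenceEquals(single, combined));   // False
    }
}
```

Delegates are immutable, so every += that adds a second or later target has to build a new delegate instance, which is the work the benchmark above is measuring.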

Hans Passant 2010-10-25 18:03:12
