A few days back, while writing an answer for this question here on Stack Overflow, I was a bit surprised by the C# compiler, which wasn't doing what I expected it to do. Look at the following two code snippets:

First:

object[] array = new object[1];

for (int i = 0; i < 100000; i++)
{
    ICollection<object> col = (ICollection<object>)array;
    col.Contains(null);
}

Second:

object[] array = new object[1];

for (int i = 0; i < 100000; i++)
{
    ICollection<object> col = array;
    col.Contains(null);
}

The only difference in code between the two snippets is the cast to ICollection<object>. Because object[] implements the ICollection<object> interface explicitly, I expected the two snippets to compile down to the same IL and therefore be identical. However, when running performance tests on them, I noticed that the second is about 6 times as fast as the first.
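A timing comparison like the one described above could be sketched roughly as follows. This is a minimal sketch, not the harness originally used: the class name CastBenchmark and both helper methods are hypothetical, and the measured ratio will depend on the compiler, runtime version, and build configuration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class CastBenchmark
{
    // Hypothetical helper: explicit cast inside the loop (first snippet).
    public static long WithExplicitCast(object[] array, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            ICollection<object> col = (ICollection<object>)array;
            col.Contains(null);
        }
        watch.Stop();
        return watch.ElapsedTicks;
    }

    // Hypothetical helper: implicit conversion inside the loop (second snippet).
    public static long WithImplicitConversion(object[] array, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            ICollection<object> col = array;
            col.Contains(null);
        }
        watch.Stop();
        return watch.ElapsedTicks;
    }

    public static void Main()
    {
        object[] array = new object[1];
        const int iterations = 100000;

        // Warm up both paths so JIT compilation is not part of the measurement.
        WithExplicitCast(array, 1000);
        WithImplicitConversion(array, 1000);

        Console.WriteLine("explicit cast ticks:       " + WithExplicitCast(array, iterations));
        Console.WriteLine("implicit conversion ticks: " + WithImplicitConversion(array, iterations));
    }
}
```

Build in release mode and run several times; a single run of 100000 iterations is short enough that timer noise can dominate.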

After comparing the IL of both snippets, I noticed that the two methods were identical, except for a castclass IL instruction in the first snippet.

Surprised by this, I now wonder why the C# compiler isn’t ‘smart’ here. Things are never as simple as they seem, so why is the C# compiler a bit naïve here?

+4  A: 

This is a rough guess, but I think it's about the Array's relationship to its generic IEnumerable.

In the .NET Framework version 2.0, the Array class implements the System.Collections.Generic.IList<T>, System.Collections.Generic.ICollection<T>, and System.Collections.Generic.IEnumerable<T> generic interfaces. The implementations are provided to arrays at run time, and therefore are not visible to the documentation build tools. As a result, the generic interfaces do not appear in the declaration syntax for the Array class, and there are no reference topics for interface members that are accessible only by casting an array to the generic interface type (explicit interface implementations). The key thing to be aware of when you cast an array to one of these interfaces is that members which add, insert, or remove elements throw NotSupportedException.

See MSDN Article.
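The behaviour described in the quoted passage can be checked with a small sketch like the one below. The class ArrayInterfaceDemo and its helper method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ArrayInterfaceDemo
{
    // Hypothetical helper: reports whether Add, called through the generic
    // interface on an array, throws NotSupportedException.
    public static bool AddThrowsNotSupported(object[] array)
    {
        ICollection<object> col = array;
        try
        {
            col.Add(new object());
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }

    public static void Main()
    {
        object[] array = new object[1];
        ICollection<object> col = array;

        // Members that only read, such as Contains and Count, work as usual.
        Console.WriteLine(col.Contains(null));
        Console.WriteLine(col.Count);

        // Members that would change the size of the array are rejected.
        Console.WriteLine(AddThrowsNotSupported(array));
    }
}
```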

It's not clear whether this relates to .NET 2.0+, but in this special case it would make perfect sense why the compiler cannot optimize your expression if it only becomes valid at run time.

Marcel J.
Your guess isn't that bad at all, because this only seems to happen with arrays. When you change the lines "object[] array = new object[1];" to "Collection<object> array = new Collection<object>();" the compiler succeeds in optimizing the cast away. However, I'm not completely satisfied, because while the article states “implementations are provided to arrays at run time”, the C# compiler actually knows that T[] implements ICollection<T>. Otherwise the statement "ICollection<object> col = array;" would not compile. We still don’t know what exactly is going on here.
Steven
+21  A: 

My guess is that you have discovered a minor bug in the optimizer. There are all kinds of special-case code in there for arrays. Thanks for bringing it to my attention.

Eric Lippert
Only Eric can post an answer that says "oops, we made a boo-boo" and receive 10 upvotes. Excellent! ;)
Aaronaught
+2  A: 

This looks like nothing more than a missed opportunity in the compiler to suppress the cast. The cast is removed if you write it like this:

    ICollection<object> col = array as ICollection<object>;

which suggests that the compiler is being too conservative because casts can throw exceptions. However, the cast is removed when you cast to the non-generic ICollection. I'd conclude that they simply overlooked it.
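For reference, the three ways of obtaining the interface reference discussed in this thread are interchangeable at the language level for an object[]: each yields the same object. The sketch below only demonstrates that equivalence; it says nothing about which IL a given compiler version emits for each form. The class ConversionForms and its method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ConversionForms
{
    // Hypothetical helper: checks that all three conversion forms
    // produce a reference to the very same array instance.
    public static bool AllFormsReferToSameObject(object[] array)
    {
        ICollection<object> viaImplicit = array;                     // implicit reference conversion
        ICollection<object> viaCast = (ICollection<object>)array;    // explicit cast
        var viaAs = array as ICollection<object>;                    // 'as' operator

        return ReferenceEquals(viaImplicit, array)
            && ReferenceEquals(viaCast, array)
            && ReferenceEquals(viaAs, array);
    }

    public static void Main()
    {
        Console.WriteLine(AllFormsReferToSameObject(new object[1]));
    }
}
```

To see what the compiler actually generates for each form, inspect the compiled assembly with a disassembler such as ildasm.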

There's a bigger optimization issue at work here: the JIT compiler doesn't apply the loop-invariant hoisting optimization. It should have rewritten the code like this:

object[] array = new object[1];
ICollection<object> col = (ICollection<object>)array;
for (int i = 0; i < 100000; i++)
{
    col.Contains(null);
}

This is a standard optimization in the C/C++ code generator, for example. Still, the JIT optimizer can't burn a lot of cycles on the kind of analysis required to discover such possible optimizations. The happy angle on this is that optimized managed code is still quite debuggable, and that there is still a role for the C# programmer in writing performant code.

Hans Passant
Speaking of the JIT: it didn't optimize the cast away either. That the JIT doesn't do this is easy to explain, as you already said: "the JIT optimizer can't burn a lot of cycles".
Steven