When you have code like the following:

static T GenericConstruct<T>() where T : new()
{
    return new T();
}

The C# compiler insists on emitting a call to Activator.CreateInstance, which is considerably slower than a native constructor.

I have the following workaround:

public static class ParameterlessConstructor<T>
    where T : new()
{
    public static T Create()
    {
        return _func();
    }

    private static Func<T> CreateFunc()
    {
        return Expression.Lambda<Func<T>>( Expression.New( typeof( T ) ) ).Compile();
    }

    private static Func<T> _func = CreateFunc();
}

// Example:
// Foo foo = ParameterlessConstructor<Foo>.Create();

But it doesn't make sense to me why this workaround should be necessary.
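For reference, here is a minimal sketch of how the workaround above can be exercised end to end. The `Foo` class and its `Value` field are hypothetical, made up for the illustration; the static field is written as `readonly` with an inline initializer, which should behave the same as the original.

```csharp
using System;
using System.Linq.Expressions;

public static class ParameterlessConstructor<T>
    where T : new()
{
    // Compiled once per closed type T, when the static field is initialized.
    private static readonly Func<T> _func =
        Expression.Lambda<Func<T>>(Expression.New(typeof(T))).Compile();

    public static T Create()
    {
        return _func();
    }
}

// Hypothetical type used only for this illustration.
public class Foo
{
    public int Value = 42;
}

public static class Demo
{
    public static void Main()
    {
        Foo a = ParameterlessConstructor<Foo>.Create();
        Foo b = ParameterlessConstructor<Foo>.Create();

        Console.WriteLine(a.Value);               // 42: the field initializer ran
        Console.WriteLine(ReferenceEquals(a, b)); // False: each call builds a new object
    }
}
```

The delegate is built once per `T`, so only the first call for a given type pays the cost of compiling the expression.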

+4  A: 

I suspect it's a JITting problem. Currently, the JIT reuses the same generated code for all reference type arguments - so a List<string>'s vtable points to the same machine code as that of List<Stream>. That wouldn't work if each new T() call had to be resolved in the JITted code.

Just a guess, but it makes a certain amount of sense.

One interesting little point: in neither case does the parameterless constructor of a value type get called, if there is one (which is vanishingly rare). See my recent blog post for details. I don't know whether there's any way of forcing it in expression trees.
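As a rough check of the expression-tree side for the common case (a value type with no explicit parameterless constructor), the sketch below builds the same `Expression.New(typeof(T))` factory for `int` and `DateTime`. The `MakeFactory` helper name is my own, not something from the question.

```csharp
using System;
using System.Linq.Expressions;

public static class ValueTypeDemo
{
    // Same construction as the question's workaround, written as a generic method.
    public static Func<T> MakeFactory<T>() where T : new()
    {
        return Expression.Lambda<Func<T>>(Expression.New(typeof(T))).Compile();
    }

    public static void Main()
    {
        Func<int> makeInt = MakeFactory<int>();
        Func<DateTime> makeDate = MakeFactory<DateTime>();

        // Both factories yield the zero-initialized value of the type.
        Console.WriteLine(makeInt() == 0);                  // True
        Console.WriteLine(makeDate() == default(DateTime)); // True
    }
}
```

This only shows that the factory accepts value types and returns their default value; it says nothing about the rare case where a value type does define its own parameterless constructor.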

Jon Skeet
+3  A: 

This is likely because it is not clear whether T is a value type or a reference type. Creating these two kinds of types in a non-generic scenario produces very different IL. In the face of this ambiguity, C# is forced to use a universal method of type creation. Activator.CreateInstance fits the bill.

Quick experimentation appears to support this idea. If you type in the following code and examine the IL, it will use initobj instead of CreateInstance because there is no ambiguity on the type.

static void Create<T>()
    where T : struct
{
    var x = new T();
    Console.WriteLine(x.ToString());
}

Switching it to a class and new() constraint, though, still forces a call to Activator.CreateInstance.

JaredPar
I guess the immediate followup question would be "why isn't there an appropriate IL instruction for creating an instance of a generic type with an appropriate constraint?" It's not like they couldn't have built that in from the start :)
Jon Skeet
Agreed, it really seems like they implemented an API instead of an IL instruction. The comment on the MSDN doc page for Activator.CreateInstance specifically says that it should be called for this scenario. Odd choice, but I'm sure there's a good reason.
JaredPar
I suspect the reason is to increase JIT'd code sharing. If you had a direct call to a type's constructor in the JIT'd code, then you couldn't share that JIT'd code with another instantiation for a different type, e.g. 'T Create<T>() where T : new() {return new T();}' would share machine code for Create<string>() and Create<ArrayList>().
jonp
A: 

Interesting observation :)

Here is a simpler variation on your solution:

static T Create<T>() where T : new()
{
  Expression<Func<T>> e = () => new T();
  return e.Compile()();
}

Obviously naive (and possibly slow) :)

leppie
I don't think that will work, because it's specifically "new T()" that his workaround is trying to avoid.
Joel Mueller
+1  A: 

Why is this workaround necessary?

Because the new() generic constraint was added to C# 2.0 in .NET 2.0.

Expression<T> and friends, meanwhile, were added to .NET 3.5.

So your workaround is necessary because it wasn't possible in .NET 2.0. Meanwhile, (1) using Activator.CreateInstance() was possible, and (2) IL lacks a way to implement 'new T()', so Activator.CreateInstance() was used to implement that behavior.

jonp
+1  A: 

This is a little bit faster, since the expression is only compiled once:

    public class Foo<T> 
        where T : new()
    {
        static Expression<Func<T>> x = () => new T();
        static Func<T> f = x.Compile();

        public static T build()
        {
            return f();
        }
    }

Analyzing the performance, this method is just as fast as the more verbose compiled expression and much, much faster than {return new T();}, which was 160 times slower on my test PC.
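One way to see why the lambda form can avoid the slow path is to look at what the compiler puts into the expression tree. The sketch below is an illustration, not a proof: the `GetLambda` helper is hypothetical, and `StringBuilder` is just a convenient type with a public parameterless constructor. It prints the node type of the lambda body, which shows whether the tree holds a constructor node or a method call.

```csharp
using System;
using System.Linq.Expressions;
using System.Text;

public static class LambdaInspection
{
    // Returns the expression tree the compiler builds for "() => new T()".
    public static Expression<Func<T>> GetLambda<T>() where T : new()
    {
        Expression<Func<T>> x = () => new T();
        return x;
    }

    public static void Main()
    {
        Expression<Func<StringBuilder>> e = GetLambda<StringBuilder>();

        // A body of kind New means the tree describes a constructor call,
        // not a method call into Activator.CreateInstance.
        Console.WriteLine(e.Body.NodeType);
        Console.WriteLine(e.Body.Type);
    }
}
```

If the body is a New node, compiling it is equivalent to the hand-built Expression.New( typeof( T ) ) version from the question.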

For slightly better performance, the build method call can be eliminated and the functor returned instead, which the client can cache and call directly:

        public static Func<T> BuildFn { get { return f; } }
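If you want to repeat the comparison yourself, a rough timing sketch along these lines should do. The `Widget` class, the `Timing` helper, and the iteration count are all assumptions of mine, and a Stopwatch loop like this is only a crude measurement; the ratio will depend heavily on the runtime version and the machine.

```csharp
using System;
using System.Diagnostics;
using System.Linq.Expressions;

public class Foo<T> where T : new()
{
    static Expression<Func<T>> x = () => new T();
    static Func<T> f = x.Compile();

    // Hand the compiled delegate to the caller so it can be cached.
    public static Func<T> BuildFn { get { return f; } }
}

// Hypothetical type used only for this illustration.
public class Widget { }

public static class Timing
{
    // The plain generic construction from the question.
    public static T Direct<T>() where T : new() { return new T(); }

    public static long TimeMs(Func<object> make, int iterations)
    {
        make(); // warm up so JIT compilation is not part of the measurement
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) make();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int N = 1000000; // arbitrary iteration count
        Func<Widget> cached = Foo<Widget>.BuildFn; // fetched once, reused below

        Console.WriteLine("new T():         {0} ms", TimeMs(() => Direct<Widget>(), N));
        Console.WriteLine("cached delegate: {0} ms", TimeMs(() => cached(), N));
    }
}
```

Both paths go through the same wrapper lambda, so the extra delegate call affects the two timings equally.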