tags:
views: 132
answers: 2

Okay, to build on my previous question: http://stackoverflow.com/questions/3999761/generically-checking-for-null-that-wont-box-nullables-on-a-non-constrained-type

One user suggested adding overloads constrained to `class` and to `struct`. I also implemented the UnboxT pattern of specializing for the three kinds of type and caching that logic in delegates. I was also told to try using OfType&lt;T&gt;.

Here are my methods:

public static class Check<T>
{
    public static readonly Func<T, bool> IfNull = CreateIfNullDelegate();

    private static bool AlwaysFalse(T obj)
    {
        return false;
    }

    private static bool ForRefType(T obj)
    {
        return object.ReferenceEquals(obj, null);
    }

    private static bool ForNullable<Tu>(Tu? obj) where Tu : struct
    {
        return !obj.HasValue;
    }

    private static Func<T, bool> CreateIfNullDelegate()
    {
        if (!typeof(T).IsValueType)
            return ForRefType;

        Type underlying = Nullable.GetUnderlyingType(typeof(T));
        if (underlying != null)
        {
            return Delegate.CreateDelegate(
                typeof(Func<T, bool>),
                typeof(Check<T>)
                    .GetMethod("ForNullable", BindingFlags.NonPublic | BindingFlags.Static)
                    .MakeGenericMethod(underlying)
            ) as Func<T, bool>;
        }

        return AlwaysFalse;
    }
}
public static int CountNonNull<T>(this IEnumerable<T> enumerable) where T : class
{
    return enumerable.Count(x => object.ReferenceEquals(x, null));
}

public static int CountNonNull<T>(this IEnumerable<T?> enumerable) where T : struct
{
    return enumerable.Count(x => x != null);
}

public static int CountNonNull3<T>(this IEnumerable<T> enumerable)
{
    return enumerable.OfType<T>().Count();
}

public static int CountNonNull2<T>(this IEnumerable<T> enumerable)
{
    return enumerable.Count(Check<T>.IfNull);
}
public static void Profile(this Action action, string name, int times = 1 * 1000 * 1000, bool display = true)
{
    // Warm up so the timed loop measures steady-state, JIT-compiled code.
    for (int i = 0; i < 100; i++)
    {
        action();
    }

    var stopwatch = Stopwatch.StartNew();
    for (int i = 0; i < times; i++)
    {
        action();
    }
    stopwatch.Stop();

    if (display)
    {
        Console.WriteLine("{0}: did {1} iterations in {2} ms", name, times, stopwatch.ElapsedMilliseconds);
    }
}
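As a sanity check on the `Check<T>` specialization, here is a minimal usage sketch (it assumes the `Check<T>` class above is in scope; each closed generic type picks its delegate once, in the static initializer):

```csharp
// Reference types get ForRefType, Nullable<U> gets ForNullable<U>
// via CreateDelegate, and other value types get AlwaysFalse.
Console.WriteLine(Check<string>.IfNull(null)); // True  (ForRefType)
Console.WriteLine(Check<int?>.IfNull(null));   // True  (ForNullable<int>)
Console.WriteLine(Check<int>.IfNull(0));       // False (AlwaysFalse)
```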

And here are my test sets:

var ints = Enumerable.Range(0,10).ToArray();
var nullableInts = Array.ConvertAll(ints,x=>x as int?);
var strings = Array.ConvertAll(ints,x => x.ToString());
Profile(() => nullableInts.CountNonNull(), "int?");
Profile(() => strings.CountNonNull(), "strings");
Profile(() => nullableInts.CountNonNull2(), "int? 2");
Profile(() => strings.CountNonNull2(), "strings 2");
Profile(() => nullableInts.CountNonNull3(), "int? 3");
Profile(() => strings.CountNonNull3(), "strings 3");

And here are my results:

int?: did 1000000 iterations in 188 ms
strings: did 1000000 iterations in 2346 ms
int? 2: did 1000000 iterations in 180 ms
strings 2: did 1000000 iterations in 147 ms
int? 3: did 1000000 iterations in 4120 ms
strings 3: did 1000000 iterations in 859 ms

OfType&lt;T&gt; being slow makes sense, as it has to do an `is` check and then a cast, so the elements effectively pass through an extra iterator layer before Count can determine its result (although the int? time is pretty hard to believe).
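For context, OfType&lt;T&gt; behaves roughly like the sketch below (an approximation of the documented behavior, not the actual BCL source; `OfTypeLike` is a hypothetical name). Note that OfType&lt;T&gt; operates on the non-generic `IEnumerable`, so enumerating an `IEnumerable<int?>` through it boxes every element first, which likely accounts for much of the int? cost:

```csharp
using System.Collections;
using System.Collections.Generic;

static class OfTypeSketch
{
    // Rough equivalent of Enumerable.OfType<T>: a lazy, single-pass
    // 'is' test followed by a cast on each element.
    public static IEnumerable<T> OfTypeLike<T>(IEnumerable source)
    {
        foreach (object item in source)
        {
            if (item is T)
                yield return (T)item;
        }
    }
}
```

Counting then happens in a second iterator layer (`Count()` walking the `OfTypeLike` iterator), which is the layering the timings reflect.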

But the first and the second approaches both execute the same predicate. Why does the first one perform so slowly on strings, whereas the second one performs like a champ?

Edit: for extra craziness, I added another method to the trial:

public static int CountNonNull4(this System.Collections.IEnumerable enumerable)
{
    return enumerable.Cast<object>().Count(x => object.ReferenceEquals(x, null));
}

This version yields:

strings 4: did 1000000 iterations in 677 ms

This makes almost no sense. What is going on?

A: 

You realize that Stopwatch measures elapsed wall-clock time and doesn't take actual thread activity into account, right? It's the equivalent of timing your commute to work in the morning; there are a lot of things that could impede your progress by varying amounts from day to day (a light you catch one day that stops you the next, traffic jams, etc.).

The analogy holds pretty well in the computer; the OS could have interrupted your thread to do something else, your thread could have had to wait for page file operations (expansion, swapping), etc. Try running each algorithm 2 or 3 times and average the times. Also, make sure your application is running in FullTrust, which bypasses all security (but not runtime integrity) permission checks. Lastly, if you can somehow multithread this profiler, you can obtain metrics about the actual number of cycles the algorithm needed from the CPU, which will be independent of thread scheduling delays.
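The run-and-average suggestion can be sketched like this (`AverageMs` is a hypothetical helper, not part of the question's code):

```csharp
using System;
using System.Diagnostics;

static class Timing
{
    // Run the timed loop several times and report the mean, so a single
    // OS-induced hiccup doesn't dominate the measurement.
    public static double AverageMs(Action action, int runs = 3, int iterations = 1000000)
    {
        double totalMs = 0;
        for (int r = 0; r < runs; r++)
        {
            var sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
                action();
            sw.Stop();
            totalMs += sw.Elapsed.TotalMilliseconds;
        }
        return totalMs / runs;
    }
}
```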

KeithS
I've run the test several times and the results are always within an acceptable range. I can't imagine any reason why a machine running nothing but VS2010, a web browser, and this little application would create a swapping or threading issue. Swapping the execution order doesn't change anything either. Object.ReferenceEquals called from the anonymous method is always the most expensive part of the operation, next to the `is` check on the nullable int.
Michael B
A: 

It's the call to enumerable.Count. When I increase the size of the array by 1000 and decrease the iterations of the performance test by 1000, they perform almost the same and I get the following results:

int?: did 1000 iterations in 488 ms
strings: did 1000 iterations in 437 ms

With this test, the string version is actually faster.

The reason for this, it seems, is that for the struct version, the compiler can inline the call to enumerable.Count. However, for the class version, it generates the IL below. Explanations are in the code.

.method public hidebysig static int32  CountNonNull<class T>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!T> enumerable) cil managed
{
    .custom instance void [System.Core]System.Runtime.CompilerServices.ExtensionAttribute::.ctor() = ( 01 00 00 00 ) 
    .maxstack  4
    .locals init ([0] int32 CS$1$0000)
    IL_0000:  nop
    IL_0001:  ldarg.0
    IL_0002:  ldnull
    // Push the lambda for ReferenceEquals onto the stack.
    IL_0003:  ldftn      bool ConsoleApplication7.Program::'<CountNonNull>b__8'<!!0>(!!0)
    // Create a new delegate for the lambda.
    IL_0009:  newobj     instance void class [mscorlib]System.Func`2<!!T,bool>::.ctor(object,
                                                                                    native int)
    // Call the Count Linq method.
    IL_000e:  call       int32 [System.Core]System.Linq.Enumerable::Count<!!0>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>,
                                                                                class [mscorlib]System.Func`2<!!0,bool>)
    IL_0013:  stloc.0
    IL_0014:  br.s       IL_0016
    IL_0016:  ldloc.0
    IL_0017:  ret
}

For the struct version, it doesn't have to do any of this and it just inlines something like this:

var enumerator = enumerable.GetEnumerator();
int result = 0;

try
{
    while (enumerator.MoveNext())
    {
        if (enumerator.Current.HasValue)
            result++;
    }
}
finally
{
    enumerator.Dispose();
}

This is of course much faster than the IL for the class version.
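In other words, the difference the IL points at is a delegate construction on each call versus a cached delegate simply being passed along. A sketch of the two shapes (hypothetical method names, illustrating the debug-mode IL above):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DelegateShapes
{
    // Lambda version: in the debug-mode IL above, the lambda is turned
    // into a Func<T, bool> with ldftn + newobj on every call before
    // Enumerable.Count runs.
    public static int ViaLambda<T>(IEnumerable<T> xs) where T : class
    {
        return xs.Count(x => object.ReferenceEquals(x, null));
    }

    // Cached version: the delegate was built once (e.g. in Check<T>'s
    // static initializer) and is simply passed along.
    public static int ViaCachedDelegate<T>(IEnumerable<T> xs, Func<T, bool> cached)
    {
        return xs.Count(cached);
    }
}
```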

Pieter
Very interesting! Why would calling the anonymous method delegate, which is also cached in a static class, be slower than calling the named delegate found on my generic class?
Michael B
Reflector seems to suggest that it's the method group conversion, which is normally very fast, but I guess after a million iterations the conversion seriously adds up. Now why the hell is this only a problem for reference types and not struct types?
Michael B
Uhm, I'm looking a little bit further, and it seems that it's not CountNonNull that is slow, but enumerable.Count. Updated the answer and am looking further.
Pieter
Well, there you go. The complete explanation including some IL for good measure.
Pieter
Reflector isn't showing me that. Reflector shows me that the IL is pretty much the same, except the word `class` gets replaced with `valuetype`. Unless you're saying that the JIT will do that.
Michael B
Well, the thing is, the results above are from debug mode. You are correct that in release mode, the functions look quite the same. I do however expect the JIT compiler to do something similar for release mode.
Pieter