I have seen that sometimes the performance of LINQ to Objects queries can be improved significantly if they forced to execute immediately by using .ToArray()
, but can't quite understand why. For example, in the sample below, the execution of the function Deferred()
is much slower than the function Immediate()
, and grows exponentially with the value of limit
(perhaps it was exponential in both functions, but execution time with Immediate()
was too little for me to say definitively).
public void Deferred()
{
var all = Range(limit);
var even = from e in EvenRange(limit) where all.Contains(e) select e;
var odd = from o in OddRange(limit) where !even.Contains(o) select o;
var query = from q in odd select q;
foreach(var i in query) { var j = i+1; }
}
public void Immediate()
{
var all = Range(limit);
var even = (from e in EvenRange(limit) where all.Contains(e) select e)
.ToArray();
var odd = (from o in OddRange(limit) where !even.Contains(o) select o)
.ToArray();
var query = (from q in odd select q).ToArray();
foreach(var i in query) { var j = i+1; }
}
public static IEnumerable<int> OddRange(int stop)
{
for (int i = 1; i < stop; i+=2) yield return i;
}
public static IEnumerable<int> EvenRange(int stop)
{
for (int i = 2; i < stop; i+=2) yield return i;
}
public static IEnumerable<int> Range(int stop)
{
for (int i = 0; i < stop; ++i) yield return i;
}
What causes this slowdown? Is it just the extra code that the compiler generates and has to run every time I access the enumerable, or does Deferred()
also go into some extra loops, changing the "Big O" complexity of the query?