ansaurus

Question

What guarantees are there on the run-time complexity (Big-O) of LINQ methods?

Answer 1

A:

All you can really bank on is that the Enumerable methods are well-written for the general case and won't use naive algorithms. There is probably third-party material (blogs, etc.) that describes the algorithms actually in use, but it is not official or guaranteed in the sense that STL algorithms are.

Marcelo Cantos 2010-05-09 22:46:38

Answer 2

+2 A:

I just broke out Reflector, and they do check the underlying type when Contains is called:

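public static bool Contains<TSource>(this IEnumerable<TSource> source, TSource value)
{
    // Fast path: if the source is a collection, defer to its own Contains.
    ICollection<TSource> is2 = source as ICollection<TSource>;
    if (is2 != null)
    {
        return is2.Contains(value);
    }
    // Otherwise fall back to the overload that takes a comparer (null = default).
    return source.Contains<TSource>(value, null);
}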

ChaosPandion 2010-05-09 22:46:59

Thanks, good to know.

tzaman 2010-05-10 05:36:51

Answer 3

+8 A:

There are very, very few guarantees, but there are a few optimizations:

Extension methods that use indexed access, such as ElementAt, Skip, Last or LastOrDefault, will check to see whether or not the underlying type implements IList<T>, so that you get O(1) access instead of O(N).
The Count method checks for an ICollection implementation, so that this operation is O(1) instead of O(N).
Distinct, GroupBy, Join, and I believe also the set-aggregation methods (Union, Intersect and Except) use hashing, so they should be close to O(N) instead of O(N²).
Contains checks for an ICollection implementation, so it may be O(1) if the underlying collection's own Contains is also O(1), as with a HashSet<T>, but this depends on the actual data structure and is not guaranteed. HashSet<T> supplies its own hash-based Contains, which is why it is O(1).
OrderBy methods use a stable quicksort, so they're O(N log N) average case.
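The ICollection check on Count can be observed directly rather than taken on trust. The sketch below is only an illustration: ProbeCollection, CountDemo and Hide are hypothetical names made up for this example, and the probe simply records how often it is enumerated. If Count takes the shortcut described above, counting the probe itself should not enumerate it, while counting it through a plain iterator should.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical probe: a read-only collection that records how many
// times an enumerator has been requested from it.
public sealed class ProbeCollection : ICollection<int>
{
    private readonly List<int> items;
    public int Enumerations { get; private set; }

    public ProbeCollection(IEnumerable<int> source) { items = new List<int>(source); }

    public int Count { get { return items.Count; } }
    public bool IsReadOnly { get { return true; } }
    public bool Contains(int item) { return items.Contains(item); }
    public void CopyTo(int[] array, int arrayIndex) { items.CopyTo(array, arrayIndex); }
    public void Add(int item) { throw new NotSupportedException(); }
    public void Clear() { throw new NotSupportedException(); }
    public bool Remove(int item) { throw new NotSupportedException(); }

    public IEnumerator<int> GetEnumerator()
    {
        Enumerations++;                 // every full walk starts here
        return items.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class CountDemo
{
    // Hides the ICollection<int> interface behind a plain iterator,
    // so LINQ can no longer see the Count property.
    public static IEnumerable<int> Hide(IEnumerable<int> source)
    {
        foreach (int x in source) yield return x;
    }

    public static void Main()
    {
        var probe = new ProbeCollection(Enumerable.Range(0, 1000));

        // Calls the LINQ extension method, not the Count property.
        Enumerable.Count(probe);
        Console.WriteLine("Enumerations after Count(collection): " + probe.Enumerations);

        Enumerable.Count(Hide(probe));
        Console.WriteLine("Enumerations after Count(iterator):   " + probe.Enumerations);
    }
}
```

The same probe idea should work for the other shortcuts in the list (for example, implementing IList<int> and calling ElementAt), but which methods take which shortcut is an implementation detail that has varied between framework versions, so it is worth measuring on the runtime you actually target.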

I think that covers most if not all of the built-in extension methods. There really are very few performance guarantees; LINQ itself will try to take advantage of efficient data structures, but it isn't a free pass to write potentially inefficient code.

Aaronaught 2010-05-09 23:16:36

How about the `IEqualityComparer` overloads?

tzaman 2010-05-10 05:35:58

@tzaman: What about them? Unless you use a really inefficient custom `IEqualityComparer`, I can't see a reason for it to affect the asymptotic complexity.

Aaronaught 2010-05-10 13:37:00

@Aaronaught: Oh, right. I hadn't realized `EqualityComparer` implements `GetHashCode` as well as `Equals`; but of course that makes perfect sense.

tzaman 2010-05-10 15:52:58
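To make the "really inefficient custom `IEqualityComparer`" case from this exchange concrete, here is a rough sketch. CountingComparer and ComparerDemo are hypothetical names invented for the illustration; the comparer counts how often Equals is called, and a flag switches GetHashCode to a constant so that every value collides. It assumes, as the answer above says, that Distinct is hash-based.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer that counts Equals calls. With constantHash set,
// GetHashCode is deliberately useless and every value lands in one bucket.
public sealed class CountingComparer : IEqualityComparer<int>
{
    private readonly bool constantHash;
    public long EqualsCalls { get; private set; }

    public CountingComparer(bool constantHash) { this.constantHash = constantHash; }

    public bool Equals(int x, int y) { EqualsCalls++; return x == y; }
    public int GetHashCode(int value) { return constantHash ? 0 : value; }
}

public static class ComparerDemo
{
    // Runs Distinct over n distinct integers and reports how many
    // Equals calls the comparer saw while doing so.
    public static long EqualsCallsForDistinct(int n, bool constantHash)
    {
        var comparer = new CountingComparer(constantHash);
        int distinct = Enumerable.Range(0, n).Distinct(comparer).Count();
        if (distinct != n) throw new InvalidOperationException("unexpected result");
        return comparer.EqualsCalls;
    }

    public static void Main()
    {
        Console.WriteLine("Equals calls, sensible hash: " + EqualsCallsForDistinct(2000, false));
        Console.WriteLine("Equals calls, constant hash: " + EqualsCallsForDistinct(2000, true));
    }
}
```

With the constant hash, each new element has to be compared against the elements already seen, so the number of Equals calls grows roughly quadratically with n. With a sensible hash it stays far smaller. The result of Distinct is the same either way; only the cost changes.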

Answer 4

A:

The correct answer is "it depends": it depends on what type the underlying IEnumerable is. I know that for some collections (such as those that implement ICollection or IList) special code paths are used; however, the actual implementation is not guaranteed to do anything special. For example, I know that ElementAt() has a special case for indexable collections, and so does Count(). But in general you should probably assume the worst-case O(n) performance.

In general, I don't think you are going to find the kind of performance guarantees you want, though if you do run into a particular performance problem with a LINQ operator you can always just reimplement it for your particular collection. Also, there are many blogs and extensibility projects that extend LINQ to Objects to add these kinds of performance guarantees. Check out Indexed LINQ, which extends and adds to the operator set for more performance benefits.
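As one possible illustration of reimplementing an operator for a particular collection: LinkedList<T> does not implement IList<T>, so a generic last-element lookup that relies on an IList<T> check has no indexer to fall back on and presumably has to walk the list. The list itself tracks its tail node, though. The extension method below (LastValue, a hypothetical name chosen to avoid clashing with the existing Last property) is a minimal sketch of a collection-specific replacement.

```csharp
using System;
using System.Collections.Generic;

public static class LinkedListExtensions
{
    // Hypothetical helper: returns the last element in O(1) by reading the
    // tail node that LinkedList<T> already maintains.
    public static T LastValue<T>(this LinkedList<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (source.Last == null) throw new InvalidOperationException("Sequence contains no elements");
        return source.Last.Value;
    }
}

public static class LastDemo
{
    public static void Main()
    {
        var list = new LinkedList<int>(new[] { 1, 2, 3 });
        Console.WriteLine(list.LastValue());
    }
}
```

Because the parameter is typed as LinkedList<T>, the helper only applies when the compiler knows the static type; a variable declared as IEnumerable<T> would still go through the generic LINQ method.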

luke 2010-05-09 23:17:04
