What is the quickest way to find out which .NET Framework LINQ methods (e.g. the IEnumerable LINQ methods) are implemented using deferred execution and which are not?

While coding, I often wonder whether a given method will be executed right away. The only way to find out is to go to the MSDN documentation to make sure. Is there any quicker way: a directory, a list somewhere on the web, a cheat sheet, or any other trick up your sleeve that you can share? If so, please do. This will help many LINQ noobs (like me) make fewer mistakes. The only other option is to check the documentation until one has used the methods enough to remember (which is hard for me; I tend not to remember "anything" which is documented somewhere and can be looked up :D).

A: 

If you cast the collection to an IQueryable using .AsQueryable(), your LINQ calls will use deferred execution.

See here: http://stackoverflow.com/questions/1578778/using-iqueryable-with-linq

Babak Naffas
Calling AsQueryable won't change that at all. It needs to be an appropriate data source for the content on that link to help; LINQ to Objects is still LINQ to Objects, even when hiding behind IQueryable.
Marc Gravell
@Marc: Not true. IQueryable methods do not evaluate until they have to, regardless of the original source. Anything chained to AsQueryable() that just spits out another IQueryable has basically done nothing more than add a node to the expression tree. Now, if you tack a ToList() on the end, the tree is evaluated immediately after it's built; it has to be, to give you the result you said you need right now. That's all you; depending on the situation, you could actually make use of the deferred execution by waiting until you need a List to call IQueryable.ToList().
KeithS
@Keith everything you say there applies to "iterator blocks" too. Most of LINQ-to-objects uses iterator blocks (yield return).
Marc Gravell
The key phrase in the linked answer is "with a provider that supports it correctly". AsQueryable doesn't do anything special to parse/interpret expression trees - it just passes them (as delegates) to Enumerable. They can still be deferred, but AsQueryable actually *adds* work here.
Marc Gravell
Looks like I was wrong... learn something new every day.
Babak Naffas
+4  A: 

Generally methods that return a sequence use deferred execution:

IEnumerable<X> ---> Select ---> IEnumerable<Y>

and methods that return a single object don't:

IEnumerable<X> ---> First ---> Y

So, methods like Where, Select, Take, Skip, GroupBy and OrderBy use deferred execution because they can, while methods like First, Single, ToList and ToArray don't because they can't.

There are also two types of deferred execution. For example the Select method will only get one item at a time when it's asked to produce an item, while the OrderBy method will have to consume the entire source when asked to return the first item. So, if you chain an OrderBy after a Select, the execution will be deferred until you get the first item, but then the OrderBy will ask the Select for all the items.
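A minimal sketch of both points, assuming a small in-memory array and a counter (`projected`, a name made up for this illustration) that records how often the Select lambda actually runs:

```csharp
using System;
using System.Linq;

int projected = 0;                       // hypothetical counter, for illustration only
var source = new[] { 3, 1, 2 };

// Building the query runs nothing.
var doubled = source.Select(n => { projected++; return n * 2; });
int afterSelect = projected;             // 0

// First() returns a single value, so it executes now, but pulls only one item.
int first = doubled.First();             // 6
int afterFirst = projected;              // 1

// OrderBy is deferred too, but the first item requires the whole source.
projected = 0;
var sorted = source.Select(n => { projected++; return n * 2; }).OrderBy(n => n);
int afterOrderBy = projected;            // 0
int smallest = sorted.First();           // 2
int afterSmallest = projected;           // 3

Console.WriteLine($"{afterSelect} {afterFirst} {afterOrderBy} {afterSmallest}");
```

The counter stays at zero until something asks for a result, and asking the sorted query for one item forces the Select to project all three.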

Guffa
Substitute "collection" => "sequence" and I'd agree, but *collections* (List<T> etc) are typically *not* deferred.
Marc Gravell
@Marc Gravell: Yes, that is more correct.
Guffa
I think this comment gives a pretty good guideline for deciding between deferred vs not. Apart from that, it seems we have to think about what the function is doing and whether it needs the entire set of data to fulfill its function (e.g. you need to access all elements to get Count, but you can defer execution till access for Select). Also a very good point about the 2 types of deferred execution. Thank you.
Tejas
+3  A: 

Actually, there's more; in addition you need to consider buffered vs non-buffered. OrderBy can be deferred, but when iterated must consume the entire stream.

In general, anything in LINQ that returns IEnumerable tends to be deferred - while Min etc (which return values) are not deferred. The buffering (vs not) can usually be reasoned out, but frankly Reflector is a pretty quick way of finding out for sure. But note that often this is an implementation detail anyway.
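If a decompiler isn't at hand, another option is to probe an operator empirically. The sketch below is only one way to do it: `Probe` and its counting source are hypothetical helpers written for this illustration, not part of the framework. It reports how many source elements were pulled when the operator was called, and how many had been pulled once a single result item was requested.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns (elements pulled on call, elements pulled after one result item).
(int onCall, int onFirstItem) Probe(Func<IEnumerable<int>, IEnumerable<int>> op)
{
    int pulled = 0;
    IEnumerable<int> Source()
    {
        for (int i = 5; i > 0; i--) { pulled++; yield return i; }
    }

    var result = op(Source());           // call the operator, do not iterate yet
    int onCall = pulled;
    using (var e = result.GetEnumerator()) { e.MoveNext(); }   // ask for one item
    return (onCall, pulled);
}

var whereResult   = Probe(s => s.Where(n => n > 0));   // (0, 1): deferred, streaming
var orderByResult = Probe(s => s.OrderBy(n => n));     // (0, 5): deferred, buffered
var toListResult  = Probe(s => s.ToList());            // (5, 5): immediate

Console.WriteLine($"Where {whereResult}, OrderBy {orderByResult}, ToList {toListResult}");
```

As the answer says, the buffering behaviour is often an implementation detail, so treat the numbers as an observation about the runtime you ran it on rather than a guarantee.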

Marc Gravell
+1, nice point about the buffering.
Kirk Woll
nice point about buffered vs non-buffered. will go look it up now.
Tejas
A: 

For actual "deferred execution", you want methods that work on an IQueryable. Method chains based on an IQueryable build an expression tree representing your query. Only when you call a method that takes the IQueryable and produces a concrete result (ToList() and similar), or when you enumerate it, is the tree evaluated by the Linq provider and the actual result returned. Linq2Objects is built into the Framework, as is Linq2SQL and now the MSEF; other ORMs and persistence-layer frameworks also offer Linq providers. Any IEnumerable class in the framework can be cast to an IQueryable using the AsQueryable() extension method, and Linq providers that will translate the expression tree, like ORMs, will provide an AsQueryable() as a jump-off point for a linq query against their data.
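A rough sketch of that build-then-evaluate behaviour against an in-memory array. The `calls` counter and the `isBig` delegate are made up for this example; the delegate is there only so the side effect can be observed from inside an expression tree.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int calls = 0;                                           // hypothetical counter
Func<int, bool> isBig = n => { calls++; return n > 1; }; // hypothetical predicate

// Chaining onto AsQueryable() only builds up the expression tree.
IQueryable<int> query = new[] { 1, 2, 3 }.AsQueryable().Where(n => isBig(n));
int callsAfterBuild = calls;             // 0

// ToList() demands the results, so the query is evaluated here.
List<int> list = query.ToList();         // 2, 3
int callsAfterToList = calls;            // 3

Console.WriteLine($"{callsAfterBuild} {callsAfterToList} [{string.Join(", ", list)}]");
```

For what it's worth, dropping the AsQueryable() call and using the plain IEnumerable Where gives the same two counts for this in-memory source.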

Even against an IEnumerable, some of the Linq methods are "lazy". Because the beauty of an IEnumerable is that you don't have to know about all of it, only the current element and whether there's another, Linq methods that act on an IEnumerable often return an iterator class that spits out an object from its source whenever methods later in the chain ask for one. Any operation that doesn't require knowledge of the entire set can be lazily evaluated (Select and Where are two big ones; there are others). Ones that do require knowing the entire collection (sorting via OrderBy, grouping with GroupBy) will slurp their entire source enumerable into a buffer and work on it, and aggregates like Min and Max likewise have to read every element, forcing evaluation of all elements through all higher nodes. Generally, you want these to come late in a method chain if you can help it.
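To make the "iterator class" idea concrete, here is a minimal sketch of a lazy operator written as an iterator block. `MyWhere` is a hypothetical stand-in for illustration, not the framework's actual Where implementation, and the `log` list exists only to show when the body starts running.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var log = new List<string>();

// Hypothetical stand-in for Where: nothing in this body runs until the result is iterated.
IEnumerable<T> MyWhere<T>(IEnumerable<T> source, Func<T, bool> predicate)
{
    log.Add("started");
    foreach (var item in source)
    {
        if (predicate(item))
            yield return item;           // hand out one item, then pause until asked again
    }
}

var evens = MyWhere(new[] { 1, 2, 3, 4 }, n => n % 2 == 0);
int entriesBeforeIteration = log.Count;  // 0

List<int> result = evens.ToList();       // 2, 4
int entriesAfterIteration = log.Count;   // 1

Console.WriteLine($"{entriesBeforeIteration} {entriesAfterIteration} [{string.Join(", ", result)}]");
```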

KeithS
I disagree with the suggestion that AsQueryable changes anything here; in essence this just *adds* a layer of work while the expression-tree gets compiled and then gets passed to Enumerable. Expression trees are only inspected for ORM (etc) sources. LINQ-to-objects just compiles them and invokes the delegate. It doesn't change *anything* re deferred **or** buffered.
Marc Gravell
Seriously; iterator blocks are trivial to write, and fully deferred. LINQ-to-objects is mainly iterator blocks.
Marc Gravell
+1  A: 

The guidelines I use:

  • Always assume any API that returns IEnumerable<T> or IQueryable<T> can and probably will use deferred execution. If you're consuming such an API, and need to iterate through the results more than once (e.g. to get a Count), then convert to a collection before doing so (usually by calling the .ToList() extension method).

  • If you're exposing an enumeration, always expose it as a collection (ICollection<T> or IList<T>) if that is what your clients will normally use. For example, a data access layer will often return a collection of domain objects. Only expose IEnumerable<T> if deferred execution is a reasonable option for the API you're exposing.
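A small sketch of why the first guideline matters, assuming a deferred query whose predicate bumps a counter (`evaluations`, a name made up for this example) each time it runs:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int evaluations = 0;                     // hypothetical counter
IEnumerable<int> query = new[] { 1, 2, 3 }
    .Where(n => { evaluations++; return n > 1; });

// Each pass over the deferred query re-runs the predicate.
int count = query.Count();               // 2
int sum = query.Sum();                   // 5
int evaluationsWhenDeferred = evaluations;   // 6

// Converting to a collection first runs the query once; later passes hit the list.
evaluations = 0;
List<int> snapshot = query.ToList();
int snapshotCount = snapshot.Count;      // 2
int snapshotSum = snapshot.Sum();        // 5
int evaluationsWhenMaterialized = evaluations;   // 3

Console.WriteLine($"{evaluationsWhenDeferred} vs {evaluationsWhenMaterialized}");
```

With an expensive source such as a database call behind the query, that difference is a second round trip rather than three extra lambda calls.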

Joe