I'm looking for rules of thumb for calling ToList/ToArray/MemoizeAll (Rx) on an IEnumerable, as opposed to returning the query itself, when a method returns an IEnumerable of something.

Often I find that it is better to just return the query and let the caller decide whether a list is needed, but sometimes that can come back to bite you because of the lazy nature of LINQ.

I want to collect guidelines such as:

Call ToList if:

  • you create new objects (e.g. in a Select)
  • you have side effects in your query

Otherwise, return the query
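
To illustrate the "new objects" case, here is a minimal sketch, assuming LINQ to Objects. The Widget class and the method names are hypothetical, invented only for this example. It shows why a deferred Select can surprise a caller: each enumeration runs the projection again and builds fresh objects, so changes made during one pass are not visible on the next.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type used only for this illustration.
public sealed class Widget
{
    public int Id { get; set; }
    public bool Processed { get; set; }
}

public static class NewObjectsDemo
{
    // Returns the deferred query: every enumeration runs the Select again
    // and therefore constructs fresh Widget instances.
    public static IEnumerable<Widget> LazyWidgets(IEnumerable<int> ids) =>
        ids.Select(id => new Widget { Id = id });

    // Materializes once: every enumeration sees the same Widget instances.
    public static IEnumerable<Widget> MaterializedWidgets(IEnumerable<int> ids) =>
        ids.Select(id => new Widget { Id = id }).ToList();

    // Marks every widget on a first pass, then counts the marked ones on a second pass.
    public static int CountProcessedAfterMarking(IEnumerable<Widget> widgets)
    {
        foreach (var w in widgets) w.Processed = true;
        return widgets.Count(w => w.Processed);
    }

    public static void Main()
    {
        var ids = new[] { 1, 2, 3 };
        Console.WriteLine(CountProcessedAfterMarking(LazyWidgets(ids)));         // 0
        Console.WriteLine(CountProcessedAfterMarking(MaterializedWidgets(ids))); // 3
    }
}
```

With the lazy version the marks land on objects that are thrown away, which is the kind of bite the question describes.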

+3  A: 

Return ToList if:

  • You don't want or care for lazy query evaluation.

Edit:

Also, return ToList if:

  • You are using a LINQ provider that translates queries to SQL (LINQ to SQL, LLBLGen, EF, etc.), and you need to perform an operation on the results that the provider cannot translate into SQL.
kbrimington
+1 for the edit. Especially if subsequent operations would cause the database to be hit multiple times by the same query. Use SQL Server Profiler to determine where to use `ToList()`.
Necros
+1  A: 

Call ToList() when you want a list of objects as your result.

Muad'Dib
+1: I don't think it can be any better than this as a general guideline.
Lasse V. Karlsen
@Lasse: I do. This answer does not explain *when* we would want a `List<T>`.
Steven Sudit
Because there are a million use cases where you want to use a `List<T>` and a million use cases (that look oddly the same) where you don't. It's entirely specific to the scenario at hand. What is the guideline for when to use a variable? What is the guideline for when to use a string? You can't really write a guideline for that. Now, I agree, you can describe the features of a list and then say "when you need any of these, you want to use a list", but he's asking for guidelines on *returning data*. You can't give one, because whether the data ends up in a list is not up to you; it's up to the caller.
Lasse V. Karlsen
+1  A: 

Use ToList if you need to run custom functions on data returned by LINQ to SQL.

Jens
+7  A: 

First off, you should NEVER have side effects in a query. That is a worst practice. Queries should answer a question, not produce an effect.

The answer to your question is: return a query when the caller expects a query; return a list when the caller expects a list. When you design your method, decide what the caller is more likely to want, implement that, and then document it.

When considering whether the caller wants a query or a list, think about the differences between queries and lists:

  • Queries are always up to date. If the objects, databases, or whatever else the query runs against change their contents, then the query results will change the next time you run the query. Lists don't change their contents, so they go out of date. If your caller requires the latest data, give them a query. If they require a snapshot of the data that they can inspect at leisure, give them a list.

  • Queries are potentially expensive to execute. Reading results back out of a list is cheap. If the caller is likely to want to interrogate the result many times and expects to get the same results each time, give them a list.

  • Constructing a query is fast. Executing a query to construct a list is slow. A list always obtains all the results of a query. The caller might want to further restrict the query, by, say, taking only the first ten elements. If the caller does not want or need to take on the expense of fully iterating over the entire query then give them a query; don't make that decision on their behalf and give them a list.

  • Queries are tiny. Lists are big. Many queries can iterate over n items in O(1) space; a list with n items takes up O(n) space. If the result set is enormous, putting it in a list is probably inefficient.

  • and so on.
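
The differences above can be sketched with LINQ to Objects. This is only an illustration under stated assumptions: QueryVersusListDemo, Numbers, and the ItemsProduced counter are hypothetical names, and the counter is demo instrumentation placed in the data source (not in a query) so that the work done becomes visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryVersusListDemo
{
    // Demo instrumentation only: counts items the source actually produced.
    public static int ItemsProduced;

    // Hypothetical data source that yields 1..count lazily.
    public static IEnumerable<int> Numbers(int count)
    {
        for (int i = 1; i <= count; i++)
        {
            ItemsProduced++;
            yield return i;
        }
    }

    // Up to date vs. snapshot: the query sees the added item, the list does not.
    public static (int QueryCount, int ListCount) FreshnessAfterAdd()
    {
        var source = new List<int> { 1, 2, 3, 4 };
        IEnumerable<int> query = source.Where(n => n % 2 == 0);
        List<int> snapshot = query.ToList();
        source.Add(6);
        return (query.Count(), snapshot.Count); // (3, 2)
    }

    // Further restriction: a caller taking ten items pays for ten items,
    // unless the whole result was already forced into a list.
    public static int ProducedWhenTakingTen(bool materializeFirst)
    {
        ItemsProduced = 0;
        IEnumerable<int> result = Numbers(1000);
        if (materializeFirst) result = result.ToList();
        foreach (var item in result.Take(10)) { /* consume */ }
        return ItemsProduced; // 10 when lazy, 1000 when materialized
    }

    // Repeated interrogation: a query re-executes, a list does not.
    public static int ProducedWhenEnumeratedTwice(bool materializeFirst)
    {
        ItemsProduced = 0;
        IEnumerable<int> result = Numbers(100);
        if (materializeFirst) result = result.ToList();
        _ = result.Sum();
        _ = result.Sum();
        return ItemsProduced; // 200 when lazy, 100 when materialized
    }
}
```

Neither column wins every row, which is the point: which cost matters depends on what the caller does with the result.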

There is no easy answer. The answer is the same as the answer to any other design problem: Consider all the pros and cons of each possible solution in the context of what is most likely wanted by the user of the feature, and then pick a reasonable compromise solution.

Eric Lippert
+2  A: 

Use ToList before you exit the using block that holds your DataContext.

Return a query when the caller is likely or obligated to supply additional filtering criteria that the database can apply through indexes to reduce the number of result rows and/or database I/O.
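
A rough sketch of the using-block point, with a stand-in for the real thing: FakeContext below is a hypothetical class that only mimics the general failure mode of a disposed context. It is not the actual DataContext API, and the exact exception a real provider throws may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a data context; not the real DataContext class.
public sealed class FakeContext : IDisposable
{
    private bool _disposed;
    private readonly int[] _rows = { 1, 2, 3 };

    public IEnumerable<int> Rows
    {
        get
        {
            // Iterator body: this check runs when enumeration starts,
            // not when the property is read.
            if (_disposed) throw new ObjectDisposedException(nameof(FakeContext));
            foreach (var row in _rows) yield return row;
        }
    }

    public void Dispose() => _disposed = true;
}

public static class UsingBlockDemo
{
    // The query leaves the using block unexecuted; the context is already
    // disposed by the time the caller enumerates it.
    public static IEnumerable<int> BrokenGetRows()
    {
        using (var ctx = new FakeContext())
        {
            return ctx.Rows.Where(r => r > 1);
        }
    }

    // ToList executes the query while the context is still alive.
    public static List<int> SafeGetRows()
    {
        using (var ctx = new FakeContext())
        {
            return ctx.Rows.Where(r => r > 1).ToList();
        }
    }
}
```

Enumerating the result of BrokenGetRows throws ObjectDisposedException in this sketch, while SafeGetRows returns the two matching rows.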

David B