+11  Q: 

Do you ToList()?

Do you have a default type that you prefer to use in your dealings with the results of LINQ queries?

By default LINQ will return an IEnumerable<> or maybe an IOrderedEnumerable<>. We have found that a List<> is generally more useful to us, so have adopted a habit of ToList()ing our queries most of the time, and certainly using List<> in our function arguments and return values.

The only exception to this has been in LINQ to SQL where calling .ToList() would enumerate the IEnumerable prematurely.

We are also using WCF extensively, the default collection type of which is System.Array. We always change this to System.Collections.Generic.List in the Service Reference Settings dialog in VS2008 for consistency with the rest of our codebase.

What do you do?

+1  A: 

It depends on whether you need to modify the collection. I like to use an Array when I know that no one is going to add/delete items. I use a List when I need to sort/add/delete items. But usually I just leave it as IEnumerable for as long as I can.

joegtp
+8  A: 

ToList always evaluates the sequence immediately - not just in LINQ to SQL. If you want that, that's fine - but it's not always appropriate.
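
To illustrate the difference, here is a minimal LINQ to Objects sketch (the class and method names are made up for this example, and it uses newer C# syntax than VS2008 offers). The deferred query is re-evaluated each time it is enumerated, so it sees a later change to the source; the ToList() result is a snapshot taken at the moment of the call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredVsToList
{
    // Returns how many items each approach sees after the source is modified.
    public static (int Deferred, int Snapshot) Run()
    {
        var source = new List<int> { 1, 2, 3 };

        // Deferred: nothing is evaluated yet; the query only remembers its source.
        IEnumerable<int> deferred = source.Where(n => n > 1);

        // Immediate: ToList walks the sequence now and copies the results.
        List<int> snapshot = source.Where(n => n > 1).ToList();

        // Change the underlying data after both were created.
        source.Add(4);

        // The deferred query runs here, so it includes the new item.
        return (deferred.Count(), snapshot.Count);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine($"deferred: {result.Deferred}, snapshot: {result.Snapshot}");
    }
}
```

Whether the "live" or the "snapshot" behaviour is the right one depends on what the caller expects, which is why ToList() is a decision rather than a default.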

Personally I would try to avoid declaring that you return List<T> directly - usually IList<T> is more appropriate, and allows you to change to a different implementation later on. Of course, there are some operations which are only specified on List<T> itself... this sort of decision is always tricky.

EDIT: (I would have put this in a comment, but it would be too bulky.) Deferred execution allows you to deal with data sources which are too big to fit in memory. For instance, if you're processing log files - transforming them from one format to another, uploading them into a database, working out some stats, or something like that - you may very well be able to handle arbitrary amounts of data by streaming it, but you really don't want to suck everything into memory. This may not be a concern for your particular application, but it's something to bear in mind.
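
As a rough sketch of that streaming idea: ReadLogLines below is a hypothetical stand-in for reading a very large log file (it just generates lines), and the LinesProduced counter exists only to show how much of the source was actually pulled. Nothing here comes from a real logging API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingDemo
{
    // Counts how many lines the stand-in source has produced so far.
    public static int LinesProduced;

    // Hypothetical log source: lines are generated one at a time, on demand.
    public static IEnumerable<string> ReadLogLines(int total)
    {
        for (int i = 0; i < total; i++)
        {
            LinesProduced++;
            yield return (i % 2 == 0 ? "ERROR" : "INFO") + " entry " + i;
        }
    }

    // Filters and transforms lazily; only the small final result is materialised.
    public static List<string> FirstErrors(int total, int wanted)
    {
        LinesProduced = 0;
        return ReadLogLines(total)
            .Where(line => line.StartsWith("ERROR", StringComparison.Ordinal))
            .Select(line => line.ToUpperInvariant())
            .Take(wanted)
            .ToList();
    }

    public static void Main()
    {
        List<string> errors = FirstErrors(1000000, 3);
        Console.WriteLine(string.Join(", ", errors));
        Console.WriteLine("lines pulled from the source: " + LinesProduced);
    }
}
```

Putting a ToList() directly after ReadLogLines would instead build the full million-entry list before any filtering happened.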

Jon Skeet
Agreed that ToList evaluates immediately. Our thinking is that in LINQtoSQL this is likely to have a performance impact (especially if we're chaining together a few LINQ expressions), but when we're in-memory any performance hit will be negligible - and consistency for the humans is more important.
Richard Ev
There are significant differences beyond performance. In particular, if anything about the query (e.g. the data in the source) changes, then deferred execution will give you a different answer. Sometimes that's what you want, sometimes it isn't.
Jon Skeet
Good point about deferred execution - it wasn't until ReSharper highlighted that in my code and I read up on it that I understood it fully.
Richard Ev
+7  A: 

We have the same scenario - WCF communications to a server, the server uses LINQtoSQL.

We use .ToArray() when requesting objects from the server, because it's "illegal" for the client to change the list. (Meaning, there is no purpose to support ".Add", ".Remove", etc).

While still on the server, however, I would recommend that you leave it as its default (which is not IEnumerable, but rather IQueryable). This way, if you want to filter even more based on some criteria, the filtering is STILL on the SQL side until evaluated.

This is a very important point as it means incredible performance gains or losses depending on what you do.

EXAMPLE:

// This is just an example... imagine this is on the server only. It's the
// basic method that gets the list of clients.
private IEnumerable<Client> GetClients()
{
    var result = MyDataContext.Clients;  

    return result.AsEnumerable();
}

// This method here is actually called by the user...
public Client[] GetClientsForLoggedInUser()
{
    var clients = GetClients().Where(client => client.Owner == currentUser);

    return clients.ToArray();
}

Do you see what's happening there? Because "GetClients" hands back a plain IEnumerable, the Where clause in the GetClientsForLoggedInUser method runs in memory: ALL 'clients' get downloaded from the database... and only THEN are they filtered down.

Now, notice the slight change:

private IQueryable<Client> GetClients()
{
    var result = MyDataContext.Clients;  

    return result.AsQueryable();
}

Now, the actual evaluation won't happen until ".ToArray" is called... and SQL will do the filtering. MUCH better!
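
One way to see why this works, without a database: an IQueryable records each operator in an expression tree that the provider can inspect (and, in LINQ to SQL, translate into SQL). The sketch below uses an in-memory list with AsQueryable() as a stand-in for MyDataContext.Clients, so it only shows that the Where is captured, not any SQL generation. The Client class, the sample owners and the OutermostOperator helper are assumptions made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Client
{
    public string Owner { get; set; }
}

public static class QueryableDemo
{
    // In-memory stand-in for MyDataContext.Clients (no database involved).
    private static readonly List<Client> Clients = new List<Client>
    {
        new Client { Owner = "alice" },
        new Client { Owner = "bob" },
    };

    public static IQueryable<Client> GetClients()
    {
        return Clients.AsQueryable();
    }

    // Name of the outermost operator recorded in the query's expression tree.
    public static string OutermostOperator(IQueryable<Client> query)
    {
        var call = query.Expression as MethodCallExpression;
        return call == null ? "(none)" : call.Method.Name;
    }

    public static void Main()
    {
        // The filter becomes part of the query description; nothing runs yet.
        IQueryable<Client> filtered = GetClients().Where(c => c.Owner == "alice");
        Console.WriteLine(OutermostOperator(filtered));

        // Evaluation happens only here.
        Console.WriteLine(filtered.ToArray().Length);
    }
}
```

Once the sequence is typed as IEnumerable, a later Where binds to the LINQ to Objects operator instead, and the provider never gets to see it.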

Timothy Khouri
Your point is a very distinct one. Most people seem to miss it though. I see really bad examples like this used all the time. People don't stop to think about what that cast is going to do at execution.
Jason Short
+4  A: 

In the Linq-to-Objects case, returning List<T> from a function isn't as nice as returning IList<T>, as THE VENERABLE SKEET points out. But often you can still do better than that. If the thing you are returning ought to be immutable, IList is a bad choice because it invites the caller to add or remove things.

For example, sometimes you have a method or property that returns the result of a Linq query or uses yield return to lazily generate a list, and then you realise that it would be better to do that the first time you're called, cache the result in a List<T> and return the cached version thereafter. That's when returning IList may be a bad idea, because the caller may modify the list for their own purposes, which will then corrupt your cache, making their changes visible to all other callers.

Better to return IEnumerable<T>, so all they have is forward iteration. And if the caller wants rapid random access, i.e. they wish they could use [] to access by index, they can use ElementAt, which Linq defines so that it quietly sniffs for IList and uses that if available, and if not it does the dumb linear lookup.
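
A minimal sketch of that caching pattern, assuming a made-up PrimeCache class whose Compute method stands in for an expensive lazy query:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class PrimeCache
{
    private List<int> _cache;

    // Callers only see IEnumerable<int>, so Add/Remove are not offered to them.
    public IEnumerable<int> Primes
    {
        get
        {
            if (_cache == null)
            {
                // Evaluate the lazy sequence once, then reuse the list.
                _cache = Compute().ToList();
            }
            return _cache;
        }
    }

    // Stand-in for an expensive lazily generated sequence.
    private static IEnumerable<int> Compute()
    {
        yield return 2;
        yield return 3;
        yield return 5;
        yield return 7;
    }
}

public static class PrimeCacheDemo
{
    public static void Main()
    {
        var cache = new PrimeCache();

        // ElementAt finds the IList<int> behind the IEnumerable and indexes into it.
        Console.WriteLine(cache.Primes.ElementAt(2));
    }
}
```

A determined caller could still cast the result back to List<int>. If that matters, returning _cache.AsReadOnly() hands out a read-only wrapper that still implements IList<int>, so ElementAt keeps its fast path.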

One thing I've used ToList for is when I've got a complex system of Linq expressions mixed with custom operators that use yield return to filter or transform lists. Stepping through in the debugger can get mighty confusing as it jumps around doing lazy evaluation, so I sometimes temporarily add a ToList() to a few places so that I can more easily follow the execution path. (Although if the things you are executing have side-effects, this can change the meaning of the program.)

Daniel Earwicker
A: 

If you don't need the added features of List<>, why not just stick with IQueryable<> ?!?!?! Lowest common denominator is the best solution (especially when you see Timothy's answer).

TheSoftwareJedi