ansaurus

Question

Why is this linq extension method hit the database twice?

Answer 1

A:

You could stick the .ToList() call inside the extension, the effect is slightly different, but does this still work in the cases you have?

public static IList<T> ToListIfNotNullOrEmpty<T>(this IEnumerable<T> value)
{
    if(value == null)
    {
        return null;
    }
    var result = value.ToList();        
    return result.IsNullOrEmpty() ? null : result;
}

Nick Craver 2010-02-13 14:00:44

Answer 2

+1 A:

How about something like this?

public static IList<T> ToListIfNotNullOrEmpty<T>(this IEnumerable<T> value)
{
    if (value == null)
        return null;

    var list = value.ToList();
    return (list.Count > 0) ? list : null;
}

LukeH 2010-02-13 14:06:22

I'd rename it ToListOrNull() in keeping with the existing FirstOrDefault(), SingleOrDefault() conventions.

tvanfosson 2010-02-13 14:08:21

@tvanfosson: I probably wouldn't create an extension to do this in the first place. The `IEnumerable<T>` coming out of a LINQ query should never be null, so I'd just stick with a plain `ToList` call. Then you only need to test for the empty list rather than `null`.

LukeH 2010-02-13 14:20:39

@Luke: personally, i disagree with your opinion in this case *blush*. For me if it's doesn't exist, it's null. i just like to go the next step and make sure that we have no collections that are empty. less things to sanity check. a collection either has some content or it's null .. not null/empty/has something. but like i said, this is just a religious war. :)

Pure.Krome 2010-02-13 21:56:18

Answer 3

+3 A:

I confess that I see little point in this method. Surely if you simply do a ToList(), a check to see if the list is empty suffices as well. It's arguably harder to handle the null result when you expect a list because then you always have to check for null before you iterate over it.

I think that:

 var query = (from ...).ToList();
 if (query.Count == 0) {
     ...
 }

works as well and is less burdensome than

var query = (from ...).ToListIfNotNullOrEmpty();
if (query == null) {
   ...
}

and you don't have to implement (and maintain) any code.

tvanfosson 2010-02-13 14:11:42

I was just in the middle of editing my answer to say exactly this when your answer appeared, so +1. (Perhaps there's a reason why the OP needs to pass nulls around rather than empty lists, but in that case I think I'd prefer to make the conversion explicit and obvious rather than calling an extension method with a not-particularly-descriptive name.)

LukeH 2010-02-13 14:28:40

yep guys, i just prefer to pass around null collections instead of empty collections.

Pure.Krome 2010-02-13 21:57:51

I guess I prefer it the other way. Any method that I have that returns a collection typically returns a empty collection, never null. That way I don't have to check if it's null before I use it -- that would truly be an exceptional condition. I think the .NET folks would agree with me since all of the IEnumerable extensions return empty collections as well when there are no matches -- that's what allows them to be chained. I think, especially, if you're building extension methods on IEnumerable, you ought to follow that convention as well.

tvanfosson 2010-02-13 22:23:40

Very valid point ... but I was always under the impression (maybe from using R# so much) that i should also always be checking for nulls .. even from IEnumerbale results... ???

Pure.Krome 2010-02-13 23:32:34

@tvanfosson - is it fair to say that you assume all ienumerable objects are never null, then? (i'm really wracking my brain over this .. null or empty vs empty).

Pure.Krome 2010-02-14 02:03:34

My expectation is that anything that returns an IEnumerable is going to return a non-null object, which may be empty. This is especially true of an extension method that does some transformation on an existing IEnumerable. I can't think of a single case of a method I've written that returns some sort of collection where I've chosen to return null rather than an empty collection if there was nothing in the collection. Any case that I can think of, like not being able to connect to the data source, where returning null might be appropriate would be better served by throwing an exception.

tvanfosson 2010-02-14 12:50:49

+1, but query.Any() would be more efficient :)

Andras Zoltan 2010-05-10 16:46:00

@Andras - Actually, since I've used ToList, I should simply be checking the property not invoking the IEnumerable extension `Count()`. That would surely be more efficient than calling `Any()` on the enumeration though, for a List, it should amount to the same thing. I've updated it to use the property -- I think it was just a typo before, but it's been long enough that I don't really remember.

tvanfosson 2010-05-10 17:06:38

@tvanfosson - you're absolutely right; I completely missed the ToList() call!

Andras Zoltan 2010-05-10 20:25:35

Answer 4

+1 A:

To actually answer your question:

This method hits the database twice because the extension methods provided by the System.Linq.Enumerable class exhibit what is called deferred execution. Essentially, this is to eliminate the need for constructing a string of temporarily cached collections for every part of a query. To understand this, consider the following example:

var firstMaleTom = people
    .Where(p => p.Gender = Gender.Male)
    .Where(p => p.FirstName == "Tom")
    .FirstOrDefault();

Without deferred execution, the above code might require that the entire collection people be enumerated over, populating a temporary buffer array with all the individuals whose Gender is Male. Then it would need to be enumerated over again, populating another buffer array with all of the individuals from the first buffer whose first name is Tom. After all that work, the last part would return the first item from the resulting array.

That's a lot of pointless work. The idea with deferred execution is that the above code really just sets up the firstMaleTom variable with the information it needs to return what's being requested with the minimal amount of work.

Now, there's a flip side to this: in the case of querying a database, deferred execution means that the database gets queried when the return value is evaluated. So, in your IsNullOrEmpty method, when you call Any, the value parameter is actually being evaluated right then and there -- hence a database query. After this, in your ToListIfNotNullOrEmpty method, the line return new List<T>(value) also evaluates the value parameter -- because it's enumerating over the values and adding them to the newly created List<T>.

Dan Tao 2010-05-10 17:22:36

ansaurus

tags:

views:

answers:

Why is this linq extension method hit the database twice?

related questions