tags:

views:

108

answers:

4

Hi folks,

I have an extension method called ToListIfNotNullOrEmpty(), which is hitting the DB twice, instead of once. The first time it returns one result, the second time it returns all the correct results.

I'm pretty sure the first time it hits the database, is when the .Any() method is getting called.

here's the code.

public static IList<T> ToListIfNotNullOrEmpty<T>(this IEnumerable<T> value)
{
    if (value.IsNullOrEmpty())
    {
        return null;
    }
    if (value is IList<T>)
    {
        return (value as IList<T>);
    }
    return new List<T>(value);
}

public static bool IsNullOrEmpty<T>(this IEnumerable<T> value)
{
    if (value != null)
    {
        return !value.Any();
    }
    return true;
}

I'm hoping to refactor it so that, before the .Any() method is called, it actually enumerates through the entire list.

If i do the following, only one DB call is made, because the list is already enumerated.

var pewPew = (from x in whatever
              select x)
             .ToList()   // This enumerates.
             .ToListIsNotNullOrEmpty();  // This checks the enumerated result.

I sorta don't really want to call ToList() then my extension method. Any ideas, folks?

A: 

You could stick the .ToList() call inside the extension, the effect is slightly different, but does this still work in the cases you have?

public static IList<T> ToListIfNotNullOrEmpty<T>(this IEnumerable<T> value)
{
    if(value == null)
    {
        return null;
    }
    var result = value.ToList();        
    return result.IsNullOrEmpty() ? null : result;
}
Nick Craver
+1  A: 

How about something like this?

public static IList<T> ToListIfNotNullOrEmpty<T>(this IEnumerable<T> value)
{
    if (value == null)
        return null;

    var list = value.ToList();
    return (list.Count > 0) ? list : null;
}
LukeH
I'd rename it ToListOrNull() in keeping with the existing FirstOrDefault(), SingleOrDefault() conventions.
tvanfosson
@tvanfosson: I probably wouldn't create an extension to do this in the first place. The `IEnumerable<T>` coming out of a LINQ query should never be null, so I'd just stick with a plain `ToList` call. Then you only need to test for the empty list rather than `null`.
LukeH
@Luke: personally, i disagree with your opinion in this case *blush*. For me if it's doesn't exist, it's null. i just like to go the next step and make sure that we have no collections that are empty. less things to sanity check. a collection either has some content or it's null .. not null/empty/has something. but like i said, this is just a religious war. :)
Pure.Krome
+3  A: 

I confess that I see little point in this method. Surely if you simply do a ToList(), a check to see if the list is empty suffices as well. It's arguably harder to handle the null result when you expect a list because then you always have to check for null before you iterate over it.

I think that:

 var query = (from ...).ToList();
 if (query.Count == 0) {
     ...
 }

works as well and is less burdensome than

var query = (from ...).ToListIfNotNullOrEmpty();
if (query == null) {
   ...
}

and you don't have to implement (and maintain) any code.

tvanfosson
I was just in the middle of editing my answer to say exactly this when your answer appeared, so +1. (Perhaps there's a reason why the OP needs to pass nulls around rather than empty lists, but in that case I think I'd prefer to make the conversion explicit and obvious rather than calling an extension method with a not-particularly-descriptive name.)
LukeH
yep guys, i just prefer to pass around null collections instead of empty collections.
Pure.Krome
I guess I prefer it the other way. Any method that I have that returns a collection typically returns a empty collection, never null. That way I don't have to check if it's null before I use it -- that would truly be an exceptional condition. I think the .NET folks would agree with me since all of the IEnumerable extensions return empty collections as well when there are no matches -- that's what allows them to be chained. I think, especially, if you're building extension methods on IEnumerable, you ought to follow that convention as well.
tvanfosson
Very valid point ... but I was always under the impression (maybe from using R# so much) that i should also always be checking for nulls .. even from IEnumerbale results... ???
Pure.Krome
@tvanfosson - is it fair to say that you assume all ienumerable objects are never null, then? (i'm really wracking my brain over this .. null or empty vs empty).
Pure.Krome
My expectation is that anything that returns an IEnumerable is going to return a non-null object, which may be empty. This is especially true of an extension method that does some transformation on an existing IEnumerable. I can't think of a single case of a method I've written that returns some sort of collection where I've chosen to return null rather than an empty collection if there was nothing in the collection. Any case that I can think of, like not being able to connect to the data source, where returning null might be appropriate would be better served by throwing an exception.
tvanfosson
+1, but query.Any() would be more efficient :)
Andras Zoltan
@Andras - Actually, since I've used ToList, I should simply be checking the property not invoking the IEnumerable extension `Count()`. That would surely be more efficient than calling `Any()` on the enumeration though, for a List, it should amount to the same thing. I've updated it to use the property -- I think it was just a typo before, but it's been long enough that I don't really remember.
tvanfosson
@tvanfosson - you're absolutely right; I completely missed the ToList() call!
Andras Zoltan
+1  A: 

To actually answer your question:

This method hits the database twice because the extension methods provided by the System.Linq.Enumerable class exhibit what is called deferred execution. Essentially, this is to eliminate the need for constructing a string of temporarily cached collections for every part of a query. To understand this, consider the following example:

var firstMaleTom = people
    .Where(p => p.Gender = Gender.Male)
    .Where(p => p.FirstName == "Tom")
    .FirstOrDefault();

Without deferred execution, the above code might require that the entire collection people be enumerated over, populating a temporary buffer array with all the individuals whose Gender is Male. Then it would need to be enumerated over again, populating another buffer array with all of the individuals from the first buffer whose first name is Tom. After all that work, the last part would return the first item from the resulting array.

That's a lot of pointless work. The idea with deferred execution is that the above code really just sets up the firstMaleTom variable with the information it needs to return what's being requested with the minimal amount of work.

Now, there's a flip side to this: in the case of querying a database, deferred execution means that the database gets queried when the return value is evaluated. So, in your IsNullOrEmpty method, when you call Any, the value parameter is actually being evaluated right then and there -- hence a database query. After this, in your ToListIfNotNullOrEmpty method, the line return new List<T>(value) also evaluates the value parameter -- because it's enumerating over the values and adding them to the newly created List<T>.

Dan Tao