ansaurus

Question

Best practices for dealing with LINQ statements that result in empty sequences and the like?

Answer 1

+19 A:

Use FirstOrDefault and then check for null.
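A minimal sketch of that check, assuming a hypothetical `Person` class and helper names that are not part of the original question. For reference types, null unambiguously means "nothing found". For value types, one option is to project to a nullable type first, so that an empty sequence yields null rather than 0.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type, for illustration only.
public class Person
{
    public string Name { get; set; }
}

public static class FirstOrDefaultSketch
{
    // Reference type: FirstOrDefault returns null when nothing matches.
    public static string FindName(IEnumerable<Person> people, string name)
    {
        Person match = people.FirstOrDefault(p => p.Name == name);
        return match != null ? match.Name : "(not found)";
    }

    // Value type: project to int? first, so "empty" (null) differs from a real 0.
    public static int? FirstOrNothing(IEnumerable<int> numbers)
    {
        return numbers.Select(n => (int?)n).FirstOrDefault();
    }
}
```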

RedFilter 2010-09-10 21:23:45

It's (slightly) more difficult than that when your elements are value types.

LukeH 2010-09-10 21:28:36

+1: And for value types (like int), check for 0, not null. But that's just me nitpickin.

jdv 2010-09-10 21:28:48

Yeah except the actual value could be 0, and there is no way to distinguish this.

luksan 2010-09-10 21:35:19

you could also use default(String) or default(int) to solve that problem...

Davy Landman 2010-09-10 21:35:31

I guess you could say I have the opposite approach: I like being able to easily include a unique-value assertion in LINQ statements with Single(). I'd much rather handle an exception than have an assumption broken silently. So it chaps my hide that some LINQ sources (notably EF) don't support Single().

Paul Keister 2010-09-10 21:36:25

Answer 2

+4 A:

Does anyone here understand my predicament?

Not really. If you replace First() with FirstOrDefault(), your try/catch blocks can be replaced with if(...) statements or strategically used && or || operators.

Henk Holterman 2010-09-10 21:25:28

@Henk thank you, I'll look into `FirstOrDefault()`.

Dave 2010-09-10 21:33:47

Answer 3

+31 A:

Use First when you know that there is one or more items in the collection. Use Single when you know that there is exactly one item in the collection. If you don't know those things, then don't use those methods. Use methods that do something else, like FirstOrDefault(), SingleOrDefault() and so on.
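To make the difference concrete, here is a small sketch of how these operators behave on empty, single-item and multi-item lists. The class and helper names are made up for illustration, and the try/catch in `Throws` exists only to observe the behaviour, not as a recommended pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OperatorChoiceSketch
{
    // Returns true if the given action throws InvalidOperationException.
    public static bool Throws(Action action)
    {
        try { action(); return false; }
        catch (InvalidOperationException) { return true; }
    }

    public static void Demo()
    {
        var empty = new List<string>();
        var one = new List<string> { "a" };
        var many = new List<string> { "a", "b" };

        Console.WriteLine(Throws(() => empty.First()));           // empty: First throws
        Console.WriteLine(empty.FirstOrDefault() == null);        // empty: default (null for string)
        Console.WriteLine(one.Single());                          // exactly one item: it is returned
        Console.WriteLine(Throws(() => many.Single()));           // more than one: Single throws
        Console.WriteLine(Throws(() => many.SingleOrDefault()));  // more than one: still throws
        Console.WriteLine(empty.SingleOrDefault() == null);       // empty: default
    }
}
```

Note that SingleOrDefault only relaxes the "at least one" requirement; it still asserts "at most one".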

You could, for example, say:

int? first = sequence.Any() ? (int?) sequence.First() : (int?) null;

which is far less gross than

int? first = null;
try { first = sequence.First(); } catch { }

But still not great because it iterates the first item of the sequence twice. In this case I would say if there aren't sequence operators that do what you want then write your own.

Continuing with our example, suppose you have a sequence of integers and want to get the first item, or, if there isn't one, return null. There isn't a built-in sequence operator that does that, but it's easy to write it:

public static int? FirstOrNull(this IEnumerable<int> sequence)
{
    foreach(int item in sequence)
        return item;
    return null;
}

or even better:

public static T? FirstOrNull<T>(this IEnumerable<T> sequence) where T : struct
{
    foreach(T item in sequence)
        return item;
    return null;
}

or this:

struct Maybe<T>
{
    public T Item { get; private set; }
    public bool Valid { get; private set; }
    public Maybe(T item) : this() 
    { this.Item = item; this.Valid = true; }
}

public static Maybe<T> MyFirst<T>(this IEnumerable<T> sequence) 
{
    foreach(T item in sequence)
        return new Maybe<T>(item);
    return default(Maybe<T>);
}
...
var first = sequence.MyFirst();
if (first.Valid) Console.WriteLine(first.Item);

But whatever you do, do not handle those exceptions you mentioned. Those exceptions are not meant to be handled, they are meant to tell you that you have bugs in your code. You shouldn't be handling them, you should be fixing the bugs. Putting try-catches around them is hiding bugs, not fixing bugs.

UPDATE:

Dave asks how to make a FirstOrNull that takes a predicate. Easy enough. You could do it like this:

public static T? FirstOrNull<T>(this IEnumerable<T> sequence, Func<T, bool> predicate) where T : struct
{
    foreach(T item in sequence)
        if (predicate(item)) return item;
    return null;
}

Or like this:

public static T? FirstOrNull<T>(this IEnumerable<T> sequence, Func<T, bool> predicate) where T : struct
{
    foreach(T item in sequence.Where(predicate))
        return item;
    return null;
}

Or, don't even bother:

var first = sequence.Where(x=>whatever).FirstOrNull();

No reason why the predicate has to go on FirstOrNull. We provide a First() that takes a predicate as a convenience so that you don't have to type the extra "Where".

UPDATE: Dave asks another follow-up question which I think might be "what if I want to say sequence.FirstOrNull().Frob().Blah().Whatever() but any one of those along the line could return null?"

We have considered adding a null-propagating member-access operator to C#, tentatively notated as .? -- that is, you could say

x = a.?b.?c.?d;

and if a, b, or c returned null, then the result would be to assign null to x.

Obviously we did not actually implement it for C# 4. It is a possible work item for hypothetical future versions of the language, but not very high priority, so I wouldn't get my hopes up.

(Remember, all of Eric's musings about hypothetical features of unannounced products that do not exist and might never ship and maybe don't even have anyone working on them at all are for entertainment purposes only.)
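For readers on later compilers: an operator of this kind did ship in C# 6.0, spelled `?.` rather than `.?`. A minimal sketch, assuming C# 6.0 or later and a hypothetical `Node` class invented for this example:

```csharp
using System;

// Hypothetical type, for illustration only.
public class Node
{
    public Node Next { get; set; }
    public string Label { get; set; }
}

public static class NullConditionalSketch
{
    // Evaluates to null as soon as any link in the chain is null,
    // instead of throwing NullReferenceException.
    public static string ThirdLabel(Node a)
    {
        return a?.Next?.Next?.Label;
    }
}
```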

Note that C# does have a null coalescing operator:

(sequence.FirstOrNull() ?? GetDefault()).Frob().Blah().Whatever()

means "If FirstOrNull returns non-null, use it as the receiver of Frob; otherwise call GetDefault and use that as the receiver". An alternative approach would be, again, to write your own:

public static T FirstOrLazy<T>(this IEnumerable<T> sequence, Func<T> lazy) 
{
    foreach(T item in sequence)
        return item;
    return lazy();
}

sequence.FirstOrLazy(()=>GetDefault()).Frob().Blah().Whatever();

Now you get the first item if there is one, or the result of a call to GetDefault() if there is not.

Eric Lippert 2010-09-10 21:27:21

@Eric ok, so I think the take home message for me is that I can't just use LINQ as my first step in getting data out of my shared memory structure. I have two threads that run and look into this shared object and pull out information that they care about. I was just letting them query willy-nilly, since it seemed like a reasonable (read *cleaner*) way to code it up. But now it sounds like I'll have to roll something a little custom, and that's okay with me. Thanks!

Dave 2010-09-10 21:41:05

I'm sorry, but are those supposed to be `yield return`?

Steven Sudit 2010-09-10 21:43:15

@Dave: Alternately, it means you can use LINQ to handle the normal case, but if you want to know why it threw an exception, then your catch block might need to break down the original query into parts that can be executed in sequence and tested in between.

Steven Sudit 2010-09-10 21:44:47

@Steven I don't think so, because Eric doesn't want to return another sequence, he just wants to return the first. But what I don't see here is how this extension method can take the equivalent of a Where clause... I usually use First with a Where clause, something like this: `var temp = MyStuff.First( p => p.Name == MyName);`

Dave 2010-09-10 21:46:36

@Steven: No... why would they be? The first item of a sequence is not a sequence.

Eric Lippert 2010-09-10 21:46:49

@Steven thanks for the tip. I might look into that later. So far, I've been lucky and all of my queries seem to work, except when what I'm looking for isn't there. :) hmm... that sounds kinda wrong.

Dave 2010-09-10 21:47:40

Ah, got it. I misunderstood the purpose of the `foreach`. Thanks.

Steven Sudit 2010-09-10 21:49:33

Eric, I thought your generic FirstOrNull had a bug. Shouldn't it be for(T item in sequence) ??

SolutionYogi 2010-09-10 21:53:34

@Eric: About your error-handling advice, would you disapprove of catching the error for the purpose of logging it and returning a failure? After all, it may well not be a programming error so much as bad data, and as such, it's not necessarily cause to shut down the app.

Steven Sudit 2010-09-10 21:57:43

@Steven: Actions have consequences. Your question is essentially "when something unexpected happens due to a bug caused by a violated assumption is it better to fail fast, fail slow, or keep running and hope for the best?" It depends on what the consequences of each of those actions are. Computer programs often have human life safety implications, financial implications, and so on. Computers run life support equipment, trading floors, and factory robots. Sometimes trying to keep going is very important, sometimes *stopping immediately before you make it worse* is very important.

Eric Lippert 2010-09-10 22:02:32

@SolutionYogi yes that was a typo on Eric's part, but the point he was trying to make was clear enough. @Eric you're very correct. I am controlling machines, and there are cases where errors are okay and you can keep going, but much of the time in doing so you will end up with totally invalid results.

Dave 2010-09-10 22:04:23

@Eric man, this is good stuff. I need to play with this some more. Very good suggestions.

Dave 2010-09-10 22:07:05

@Eric: Thanks for the clarification. I do agree that *just* catching it is never acceptable, but I don't lean as far towards fail-fast as you do.

Steven Sudit 2010-09-10 22:07:43

@Dave: Indeed, you don't want to spill a thousand gallons of acid or pudding or whatever if the controller software throws an exception while the robot arm motor is turned on. The right thing to do is probably to catch the unexpected exception and immediately go into the "fail to the safest possible mode" subroutine that stops the robot safely. *Then* log the exception to disk.

Eric Lippert 2010-09-10 22:07:52
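The ordering described in that comment (make the machine safe first, log second) might be sketched as follows. The method and parameter names are hypothetical; a real controller would supply its own stop routine and logger.

```csharp
using System;

public static class FailSafeSketch
{
    // Runs one unit of work. On any unexpected exception, first puts the
    // machine into its safe state, then logs. Returns true on normal completion.
    public static bool RunGuarded(Action runCycle, Action enterSafeMode, Action<Exception> log)
    {
        try
        {
            runCycle();
            return true;
        }
        catch (Exception ex)
        {
            enterSafeMode();  // first: stop the hardware safely
            log(ex);          // then: record what happened
            return false;
        }
    }
}
```

This is a top-level guard around the whole operation, not a try/catch wrapped around an individual query to hide a failed First().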

@Steven: for example, in the compiler if we get an unexpected exception we know it is unlikely to have human-life safety implications, but we also know that we are very unlikely to be able to generate correct code. So we handle the error by telling the user what went wrong, giving some diagnostics that they can report to the compiler team, and activating the Watson-phone-home system to allow the user to report the error automatically. Failing relatively slowly is the right thing to do for us.

Eric Lippert 2010-09-10 22:09:41

@Eric: That's a good example. I can think of a number of occasions where my own intuitions about what constitutes an error condition, even a fatal one, have turned out not to match business needs, so there's a lot to be said for doing full analysis before committing to any particular approach. Oh, and about `.?`, that would make some people *very* happy, particularly those who spend half of their time doing SQL.

Steven Sudit 2010-09-10 22:24:31

Answer 4

A:

The FirstOrDefault and SingleOrDefault operators solve your problem.

A similar problem I've encountered is when a collection contains a collection; a nested list. In that case I often use the null coalescing operator to allow a single line retrieval through the nested list. The most trivial case looks like this:

var nestedList = new List<List<int>>();
int? first = (nestedList.FirstOrDefault() ?? new List<int>()).FirstOrDefault();

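So if the outer list is empty, a new empty list is substituted, which simply allows the final FirstOrDefault to return a default value instead of throwing on a null source. Note that for a List<int> that default is 0, not null; the inner elements would need to be of a nullable or reference type for the result to be null.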
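If a genuine null is wanted for `int` elements, one possible variation is to project the inner list to `int?` before taking the first element. This is a sketch only; the class and method names are invented for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class NestedFirstSketch
{
    // First element of the first inner list, or null if the outer list is
    // empty or its first inner list is empty.
    public static int? FirstOfFirst(List<List<int>> nested)
    {
        List<int> firstInner = nested.FirstOrDefault() ?? new List<int>();
        return firstInner.Select(n => (int?)n).FirstOrDefault();
    }
}
```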

Kirk Broadhurst 2010-09-12 13:11:53
