I have the following function to get validation errors for a card. My question is about how GetErrors deals with the result of GetMoreErrors. Both methods have the same return type, IEnumerable<ErrorInfo>.

private static IEnumerable<ErrorInfo> GetErrors(Card card)
{
    var errors = GetMoreErrors(card);
    foreach (var e in errors)
        yield return e;

    // further yield returns for more validation errors
}

Is it possible to return all the errors from GetMoreErrors without having to enumerate through them?

Thinking about it, this is probably a stupid question, but I want to make sure I'm not going wrong.

+1  A: 

The yield keyword is specifically designed to return one item of an IEnumerable sequence at a time: the method produces its next item each time the caller asks for one, rather than producing everything at once.

If you want the entire collection at once, you can simply change the return type of the method and return the entire collection directly.

Something like this (this assumes GetMoreErrors() returns a List<ErrorInfo>):

private static List<ErrorInfo> GetErrors(Card card)
{
    return GetMoreErrors(card);
}

Yield is just a convenient way to have a method take care of the iteration for you: one item is retrieved each time the caller moves to the next element of the result of GetErrors(). If you return the entire collection, you will probably end up iterating through the error collection anyway, so it's just a matter of where you want to see the foreach() code.
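
To make the timing concrete, here is a minimal sketch. The Numbers method and the Log list are hypothetical names used only for illustration; they are not part of the question's code. Calling the iterator method runs none of its body; each step of the foreach runs it as far as the next yield return.

```csharp
using System.Collections.Generic;

static class YieldTiming
{
    // Hypothetical log, used only to record the order of events.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical iterator: nothing in here runs until enumeration starts.
    public static IEnumerable<int> Numbers()
    {
        Log.Add("producing 1");
        yield return 1;
        Log.Add("producing 2");
        yield return 2;
    }

    public static void Run()
    {
        Log.Clear();
        var numbers = Numbers();     // no body code has run yet
        Log.Add("after call");
        foreach (var n in numbers)   // each step runs the body to the next yield
            Log.Add("got " + n);
    }
}
```

After YieldTiming.Run(), Log contains, in order: "after call", "producing 1", "got 1", "producing 2", "got 2".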

Robert Cartaino
+1  A: 

I don't see anything wrong with your function; I'd say that it is doing what you want.

Think of yield return as producing one element of the final enumeration each time it is reached, so when you have it in a foreach loop like that, it returns one element per iteration. You can also put conditional statements in your foreach to filter the result set, simply by not yielding the elements that match your exclusion criteria.

If you add subsequent yields later in the method, each one adds one more element to the enumeration, making it possible to do things like this:

public IEnumerable<string> ConcatLists(params IEnumerable<string>[] lists)
{
  foreach (IEnumerable<string> list in lists)
  {
    foreach (string s in list)
    {
      yield return s;
    }
  }
}
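
The filtering idea could look roughly like this. NonEmpty is a hypothetical method made up for this sketch: an element is excluded simply by never yielding it.

```csharp
using System.Collections.Generic;

static class FilterSketch
{
    // Hypothetical filter: yields only strings that are neither null nor empty.
    public static IEnumerable<string> NonEmpty(IEnumerable<string> source)
    {
        foreach (string s in source)
        {
            if (string.IsNullOrEmpty(s))
                continue;        // excluded: this element is never yielded
            yield return s;
        }
    }
}
```

For example, FilterSketch.NonEmpty(new[] { "a", "", null, "b" }) enumerates as "a", "b".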
Tim Jarvis
+14  A: 

It's definitely not a stupid question, and it's something that F# supports with yield! for a whole collection vs yield for a single item. (That can be very useful in terms of tail recursion...)

Unfortunately it's not supported in C#.

However, if you have several methods each returning an IEnumerable<ErrorInfo>, you can use Enumerable.Concat to make your code simpler:

private static IEnumerable<ErrorInfo> GetErrors(Card card)
{
    return GetMoreErrors(card).Concat(GetOtherErrors())
                              .Concat(GetValidationErrors())
                              .Concat(AnyMoreErrors())
                              .Concat(ICantBelieveHowManyErrorsYouHave());
}

There's one very important difference between the two implementations though: this one will call all of the methods immediately, even though it will only use the returned iterators one at a time. Your existing code will wait until it's looped through everything in GetMoreErrors() before it even asks about the next errors.

Usually this isn't important, but it's worth understanding what's going to happen when.
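
Here is a small sketch of that difference. Lazy, Eager and Log are hypothetical names for illustration only: Lazy is an iterator block, so calling it runs none of its body, while Eager is an ordinary method that builds an array as soon as it is called.

```csharp
using System.Collections.Generic;
using System.Linq;

static class ConcatTiming
{
    // Hypothetical log, used only to record the order of events.
    public static readonly List<string> Log = new List<string>();

    // Iterator block: the call returns at once, the body runs on enumeration.
    static IEnumerable<int> Lazy()
    {
        Log.Add("Lazy body");
        yield return 1;
    }

    // Ordinary method: the body runs as soon as the method is called.
    static IEnumerable<int> Eager()
    {
        Log.Add("Eager body");
        return new[] { 2 };
    }

    public static void Run()
    {
        Log.Clear();
        var all = Lazy().Concat(Eager());   // both methods are called here
        Log.Add("Concat built");
        foreach (var n in all)              // results are consumed one at a time
            Log.Add("got " + n);
    }
}
```

After ConcatTiming.Run(), Log contains, in order: "Eager body", "Concat built", "Lazy body", "got 1", "got 2". Both methods were called before the loop started, but only the non-iterator one did its work at that point.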

Jon Skeet
Wes Dyer has an interesting article mentioning this pattern: http://blogs.msdn.com/wesdyer/archive/2007/03/23/all-about-iterators.aspx
JohannesH
Minor correction for passers by - it's System.Linq.Enumeration.Concat<>(first,second). Not IEnumeration.Concat().
locster
@the-locster: I'm not sure what you mean. It's definitely Enumerable rather than Enumeration. Could you clarify your comment?
Jon Skeet
@Jon Skeet - What exactly do you mean that it will call the methods immediately? I ran a test and it looks like it's deferring the method calls completely until something is actually iterated. Code here: http://pastebin.com/0kj5QtfD
Steven Oxley
@Steven: Nope. It's *calling* the methods - but in your case `GetOtherErrors()` (etc) are deferring their *results* (as they're implemented using iterator blocks). Try changing them to return a new array or something like that, and you'll see what I mean.
Jon Skeet
@Jon OK, I get it. I guess I overlooked the fact that an array implements IEnumerable as well, and that calling a method and getting the results of a method are two different things (I'm new to the idea of iterator blocks). Thanks for clarifying.
Steven Oxley
+1  A: 

Yes, it is possible to return all errors at once. Just return a List<T> or ReadOnlyCollection<T>.

By returning an IEnumerable<T> you're returning a sequence of something. On the surface that may seem identical to returning the collection, but there are a number of differences you should keep in mind.

Collections

  • The caller can be sure that both the collection and all the items will exist when the collection is returned. If the collection must be created per call, returning a collection is a really bad idea.
  • Most collections can be modified when returned.
  • The collection is of finite size.

Sequences

  • Can be enumerated - and that is pretty much all we can say for sure.
  • A returned sequence itself cannot be modified.
  • Each element may be created as part of running through the sequence (i.e. returning IEnumerable<T> allows for lazy evaluation, returning List<T> does not).
  • A sequence may be infinite and thus leave it to the caller to decide how many elements should be returned.
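
As a rough illustration of the last two points, here is a sketch with hypothetical names (Evens and FirstEvens are not from the question): the sequence never ends on its own, each element is created only when it is requested, and the caller decides how many to take.

```csharp
using System.Collections.Generic;
using System.Linq;

static class InfiniteSketch
{
    // Hypothetical never-ending sequence of even numbers: 0, 2, 4, ...
    // Each value is computed only when the caller asks for the next element.
    public static IEnumerable<int> Evens()
    {
        for (int i = 0; ; i += 2)
            yield return i;
    }

    // The caller decides how many elements to pull from the sequence.
    public static List<int> FirstEvens(int count)
    {
        return Evens().Take(count).ToList();
    }
}
```

InfiniteSketch.FirstEvens(3) gives a list containing 0, 2, 4. Returning a List<int> from Evens itself would not be possible, because the list would never be finished.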
Brian Rasmussen
Returning a collection can result in unreasonable overhead if all the client really needs is to enumerate through it, since you allocate the data structures for all elements in advance. Also, if you delegate to another method that's returning a sequence, then capturing it as a collection involves extra copying, and you do not know how many items (and thus how much overhead) this may potentially involve. Thus, it is only a good idea to return a collection when it is already there and can be returned directly without copying (or wrapped as read-only). In all other cases, a sequence is a better choice.
Pavel Minaev
I agree, and if you got the impression that I said returning a collection is always a good idea, you missed my point. I was trying to highlight the fact that there are differences between returning a collection and returning a sequence. I will try to make it clearer.
Brian Rasmussen