Working through a tutorial (Professional ASP.NET MVC - Nerd Dinner), I came across this snippet of code:

public IEnumerable<RuleViolation> GetRuleViolations() {
    if (String.IsNullOrEmpty(Title))
        yield return new RuleViolation("Title required", "Title");
    if (String.IsNullOrEmpty(Description))
        yield return new RuleViolation("Description required","Description");
    if (String.IsNullOrEmpty(HostedBy))
        yield return new RuleViolation("HostedBy required", "HostedBy");
    if (String.IsNullOrEmpty(Address))
        yield return new RuleViolation("Address required", "Address");
    if (String.IsNullOrEmpty(Country))
        yield return new RuleViolation("Country required", "Country");
    if (String.IsNullOrEmpty(ContactPhone))
        yield return new RuleViolation("Phone# required", "ContactPhone");
    if (!PhoneValidator.IsValidNumber(ContactPhone, Country))
        yield return new RuleViolation("Phone# does not match country", "ContactPhone");
    yield break;
}

I've read up on yield, but I guess my understanding is still a little bit hazy. What it seems to do is create an object that allows cycling through the items in a collection without actually doing the cycling unless and until it's absolutely necessary.

This example is a little strange to me, though. What I think it's doing is delaying the creation of any RuleViolation instances until the programmer actually requests a specific item in the collection, using either foreach or a LINQ extension method like .ElementAt(2).

Beyond this, though, I have some questions:

  1. When do the conditional parts of the if statements get evaluated? When GetRuleViolations() is called or when the enumerable is actually iterated? In other words, if the value of Title changes from null to "Really Geeky Dinner" between the time that I call GetRuleViolations() and the time I attempt to actually iterate over it, will RuleViolation("Title required", "Title") be created or not?

  2. Why is yield break; necessary? What is it really doing here?

  3. Let's say Title is null or empty. If I call GetRuleViolations() then iterate over the resulting enumerable two times in a row, how many times will new RuleViolation("Title required", "Title") be called?

+3  A: 

1) Take this simpler example:

public void Enumerate()
{
    foreach (var item in EnumerateItems())
    {
        Console.WriteLine(item);
    }
}

public IEnumerable<string> EnumerateItems()
{
    yield return "item1";
    yield return "item2";
    yield break;
}

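Each time MoveNext() is called on the IEnumerator, execution resumes just after the last yield point and runs until it reaches the next yield return (or the end of the method). None of the method body runs until the first MoveNext() call.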
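
To make the stepping visible, here is a minimal sketch that drives the enumerator by hand instead of using foreach. The MoveNextDemo class and its Log list are hypothetical names added only for illustration; they record the order in which the pieces of the method body run.

```csharp
using System;
using System.Collections.Generic;

public static class MoveNextDemo
{
    // Hypothetical helper: records the order in which code executes.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<string> EnumerateItems()
    {
        Log.Add("before item1");
        yield return "item1";
        Log.Add("before item2");
        yield return "item2";
        Log.Add("end of method");
    }

    public static void Run()
    {
        Log.Clear();
        // Calling the method runs none of its body; Log is still empty here.
        IEnumerable<string> items = EnumerateItems();

        using (IEnumerator<string> e = items.GetEnumerator())
        {
            e.MoveNext();                 // runs up to the first yield return
            Log.Add("got " + e.Current);
            e.MoveNext();                 // resumes after the first yield, stops at the second
            Log.Add("got " + e.Current);
            e.MoveNext();                 // runs to the end of the method and returns false
        }
    }
}
```

After MoveNextDemo.Run(), Log contains, in order: "before item1", "got item1", "before item2", "got item2", "end of method". A foreach loop makes the same MoveNext()/Current calls for you.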

2) yield break; will tell the IEnumerator that there is nothing more to enumerate.

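3) Once per enumeration. The method body runs again each time you iterate over the enumerable, so iterating twice creates the RuleViolation twice.

An example where yield break; is actually needed: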

public IEnumerable<string> EnumerateUntilEmpty()
{
    foreach (var name in nameList)
    {
        if (String.IsNullOrEmpty(name)) yield break;
        yield return name;
    }     
}
ChaosPandion
Sure about the "3) once"? As far as I understand this is reevaluated every time (see the other two answers here).
Benjamin Podszun
I clarified it.
ChaosPandion
If you make a list, why not iterate over it anyway? The `yield` way enables you not to explicitly construct a list. In your case for example: public IEnumerable<string> Enumerate() { yield return "item1"; yield return "item2"; }
Aviad P.
Thanks, that accomplishes what I originally wanted to show, a simple version of his example.
ChaosPandion
`yield break` is not necessary in your first example either. In your second example it is of course.
Aviad P.
+10  A: 

A function that contains yield statements is treated differently from a normal function. Behind the scenes, the compiler generates a hidden class that implements the function's return type (here IEnumerable<RuleViolation>) together with the matching IEnumerator; calling the function only creates an object of that class and returns it, without running any of the body. The generated class contains logic that executes the body of the function up to the next yield statement each time IEnumerator.MoveNext is called. This is a bit misleading: the body of the function is not executed in one batch like a normal function, but rather in pieces, and each piece executes when the enumerator moves one step forward.

With regards to your questions:

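  1. As I said, each if gets evaluated when you iterate to the next element, not when GetRuleViolations() is called. If Title is set before you iterate, the "Title required" violation is not created.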
  2. yield break is indeed not necessary in the example above. What it does is it terminates the enumeration.
  3. Each time you iterate over the enumerable, you force the execution of the code again. Put a breakpoint on the relevant line and test for yourself.
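
If you would rather see points 1 and 3 in code than with a breakpoint, here is a minimal sketch. DinnerSketch is a hypothetical, cut-down stand-in for the tutorial's Dinner class (it yields plain strings instead of RuleViolation objects), and the Checks counter exists only to show when the body runs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for the tutorial's Dinner class.
public class DinnerSketch
{
    public string Title { get; set; }

    // Counts how many times the iterator body has started running.
    public int Checks { get; private set; }

    public IEnumerable<string> GetRuleViolations()
    {
        Checks++;
        if (String.IsNullOrEmpty(Title))
            yield return "Title required";
    }
}

public static class DeferredDemo
{
    private static int Count(IEnumerable<string> source)
    {
        int n = 0;
        foreach (var item in source) n++;
        return n;
    }

    // Returns { checks after the call, first count, second count, checks at the end }.
    public static int[] Run()
    {
        var dinner = new DinnerSketch();               // Title starts out null
        var violations = dinner.GetRuleViolations();   // body has not run yet
        int checksAfterCall = dinner.Checks;           // 0

        dinner.Title = "Really Geeky Dinner";
        int first = Count(violations);                 // 0: Title is read during iteration

        dinner.Title = null;
        int second = Count(violations);                // 1: the body runs again from the top

        return new[] { checksAfterCall, first, second, dinner.Checks };
    }
}
```

DeferredDemo.Run() returns { 0, 0, 1, 2 }: nothing is checked at call time, the first pass sees the new Title, and the second pass re-runs the whole body.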
Aviad P.
+1 really clear and helpful answer, thanks. Just one minor followup question if you don't mind: Can you think of an example where `yield break` *is* necessary?
DanM
Minor clarification - you mean `IEnumerator[<T>]` in a few places where you say `IEnumerable[<T>]`.
Marc Gravell
So in the example the `yield break` is necessary, because if no `RuleViolation` is created, it would throw an exception, right?
Ronald
By "empty set" do you mean none of the `if` statements evaluate to true, or do you mean `yield break` is needed as a placeholder if I haven't yet added any `if` statements to my method? (Or both?)
DanM
`yield break` is not necessary here. It is only necessary in case you need to stop the enumeration somewhere inside the function body, before the execution reaches its natural end.
Aviad P.
@Aviad, this is true even if there are no if statements in the body of the method?
DanM
@Aviad, Ok -- looks like I made a mistake. You are correct.
Matt Brunell
Okay, but clearly, `private IEnumerable<string> Enumerate() { }` will get you a "not all code paths return a value" error. So, I think this allows me to conclude the following: you need `yield break` either (1) as a placeholder if you have no other return statements (yield or otherwise) in your method or (2) "in case you need to stop the enumeration somewhere inside the function body, before the execution reaches its natural end."
DanM
Yes, the function body must contain the keyword `yield` for this to work. Otherwise the compiler would treat it as a normal function (read the first sentence in my answer again :-) )
Aviad P.
+1  A: 

Short version:

1: yield is the magic "stop and come back later" keyword, so at any point only the if statements up to the "active" one have been evaluated. The rest are evaluated as you keep iterating.

2: yield break explicitly ends the enumeration (think "break" in a switch case)

3: Every time. You can cache the result, of course, by turning it into a List for example and iterating over that afterwards.
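
A minimal sketch of the caching idea, assuming you only need a snapshot of the results. CachingDemo, GetViolations and the BodyRuns counter are hypothetical names used only to show how often the iterator body runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CachingDemo
{
    // Hypothetical counter: incremented each time the iterator body starts.
    public static int BodyRuns;

    public static IEnumerable<string> GetViolations()
    {
        BodyRuns++;
        yield return "Title required";
    }

    public static int Run()
    {
        BodyRuns = 0;

        // ToList() iterates once and stores the results.
        List<string> cached = GetViolations().ToList();

        int seen = 0;
        foreach (var v in cached) seen++;   // reads the list, not the iterator
        foreach (var v in cached) seen++;   // same again

        return BodyRuns;                    // 1
    }
}
```

The trade-off is that the list is a snapshot: later changes to the underlying object are not reflected until you call the method and ToList() again.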

Benjamin Podszun