tags:

views:

1685

answers:

16

When ever I think I can use the yield keyword, I take a step back and look at how it will impact my project. I always end up returning a collection instead of yeilding because I feel the overhead of maintaining the state of the yeilding method doesn't buy me much. In almost all cases where I am returning a collection I feel that 90% of the time, the calling method will be iterating over all elements in the collection, or will be seeking a series of elements throughout the entire collection.

I do understand its usefulness in linq, but I feel that only the linq team is writing such complex queriable objects that yield is useful.

Has anyone written anything like or not like linq where yield was useful?

+18  A: 

Note that with yield, you are iterating over the collection once, but when you build a list, you'll be iterating over it twice.

Take, for example, a filter iterator:

IEnumerator<T>  Filter(this IEnumerator<T> coll, Func<T, bool> func)
{
     foreach(T t in coll)
        if (func(t))  yield return t;
}

Now, you can chain this:

 MyColl.Filter(x=> x.id > 100).Filter(x => x.val < 200).Filter (etc)

You method would be creating (and tossing) three lists. My method iterates over it just once.

Also, when you return a collection, you are forcing a particular implementation on you users. An iterator is more generic.

James Curran
Shouldn't that read:if (func(t)) yield return t;
Winston Smith
Ya'know... Without Intellisense, I just can't program.....
James Curran
:) I've got a VS2008 project that contains nothing but SO code snippets...
GalacticCowboy
Wouldn't performing that filter be more straight forward with linq though?
Bob
That filter is basically what the LINQ Where extension method is.
Thedric Walker
That is my point though, I think it would be more straight forward to use linq, would you ever write that filtering code instead of using linq? What benifits would you get?
Bob
A: 

I often use yield when generating data for reports.

Aaron Palmer
+1  A: 

Personnally, I haven't found I'm using yield in my normal day-to-day programming. However, I've recently started playing with the Robotics Studio samples and found that yield is used extensively there, so I also see it being used in conjunction with the CCR (Concurrency and Coordination Runtime) where you have async and concurrency issues.

Anyway, still trying to get my head around it as well.

Harrison
+1  A: 

Yield is useful because it saves you space. Most optimizations in programming makes a trade off between space (disk, memory, networking) and processing. Yield as a programming construct allows you to iterate over a collection many times in sequence without needing a separate copy of the collection for each iteration.

consider this example:

static IEnumerable<Person> GetAllPeople()
{
    return new List<Person>()
    {
        new Person() { Name = "George", Surname = "Bush", City = "Washington" },
        new Person() { Name = "Abraham", Surname = "Lincoln", City = "Washington" },
        new Person() { Name = "Joe", Surname = "Average", City = "New York" }
    };
}

static IEnumerable<Person> GetPeopleFrom(this IEnumerable<Person> people,  string where)
{
    foreach (var person in people)
    {
        if (person.City == where) yield return person;
    }
    yield break;
}

static IEnumerable<Person> GetPeopleWithInitial(this IEnumerable<Person> people, string initial)
{
    foreach (var person in people)
    {
        if (person.Name.StartsWith(initial)) yield return person;
    }
    yield break;
}

static void Main(string[] args)
{
    var people = GetAllPeople();
    foreach (var p in people.GetPeopleFrom("Washington"))
    {
        // do something with washingtonites
    }

    foreach (var p in people.GetPeopleWithInitial("G"))
    {
        // do something with people with initial G
    }

    foreach (var p in people.GetPeopleWithInitial("P").GetPeopleFrom("New York"))
    {
        // etc
    }
}

(Obviously you are not required to use yield with extension methods, it just creates a powerful paradigm to think about data.)

As you can see, if you have a lot of these "filter" methods (but it can be any kind of method that does some work on a list of people) you can chain many of them together without requiring extra storage space for each step. This is one way of raising the programming language (C#) up to express your solutions better.

The first side-effect of yield is that it delays execution of the filtering logic until you actually require it. If you therefore create a variable of type IEnumerable<> (with yields) but never iterate through it, you never execute the logic or consume the space which is a powerful and free optimization.

The other side-effect is that yield operates on the lowest common collection interface (IEnumerable<>) which enables the creation of library-like code with wide applicability.

Pieter Breed
All of those are *really* just LINQ though. If you're using .NET 3.5, you'd surely implement GetPeopleWithInitial by returning people.Where(person => person.Name.StartsWith(initial)).
Jon Skeet
well, yes and no. What you are saying is true, but you will have to person => person.Name.Startswith() everywhere. With a library method you get the obvious benefits... yield also comes in .NET 2 whereas not everybody has .NET 3.5 yet...
Pieter Breed
Pieter: I'm not saying you should remove the library methods, but I'd normally implement them with LINQ. And when it's so close to being LINQ, it doesn't really feel like an answer to "when is yield useful outside LINQ" - reimplementing LINQ yourself doesn't count, IMO :)
Jon Skeet
you don't need the yield break since its the last line of the method
Scott Cowan
+2  A: 

Whenever your function returns IEnumerable you should use "yielding". Not in .Net > 3.0 only.

.Net 2.0 example:

  public static class FuncUtils
  {
      public delegate T Func<T>();
      public delegate T Func<A0, T>(A0 arg0);
      public delegate T Func<A0, A1, T>(A0 arg0, A1 arg1);
      ... 

      public static IEnumerable<T> Filter<T>(IEnumerable<T> e, Func<T, bool> filterFunc)
      {
          foreach (T el in e)
              if (filterFunc(el)) 
                  yield return el;
      }


      public static IEnumerable<R> Map<T, R>(IEnumerable<T> e, Func<T, R> mapFunc)
      {
          foreach (T el in e) 
              yield return mapFunc(el);
      }
        ...
macropas
+7  A: 

At a previous company, I found myself writing loops like this:

for (DateTime date = schedule.StartDate; date <= date.EndDate; 
     date = date.AddDays(1))

With a very simple iterator block, I was able to change this to:

foreach (DateTime date in schedule.DateRange)

It made the code a lot easier to read, IMO.

Jon Skeet
Wow - Jon Skeet code I don't agree with! =X From the first example it's obvious that you're iterating over days, but that clarity is missing in the second. I'd use something like 'schedule.DateRange.Days()' to avoid ambiguity.
Erik Forbes
That would require more than just implementing a single property, of course. I'd say that it's obvious that a DateRange is a range of dates, which are days, but it's a subjective thing. It might have been called "Dates" rather than DateRange - not sure. Either way, it's less fluff than the original.
Jon Skeet
Yeah, that's true. *shrugs* I wouldn't personally be satisfied with it, but if it's clear to the author and any future maintainers, then it doesn't really matter.
Erik Forbes
Also, I am just splitting hairs - your example demonstrates the usefulness of iterator blocks, and that's what's important for this question. Sorry to nit-pick. =X
Erik Forbes
Nitpicking is fine, splitting hairs is good, comments and suggestions on coding style are always welcome :)
Jon Skeet
+2  A: 

I'm not sure about C#'s implementation of yield(), but on dynamic languages, it's far more efficient than creating the whole collection. on many cases, it makes it easy to work with datasets much bigger than RAM.

Javier
A: 

I've used yeild in non-linq code things like this (assuming functions do not live in same class):

public IEnumerable<string> GetData()
{
 foreach(String name in _someInternalDataCollection)
 {
  yield return name;
 }
}

...

public void DoSomething()
{
 foreach(String value in GetData())
 {
  //... Do something with value that doesn't modify _someInternalDataCollection
 }
}

You have to be careful not to inadvertently modify the collection that your GetData() function is iterating over though, or it will throw an exception.

Anton
+4  A: 

yield was developed for C#2 (before Linq in C#3).

We used it heavily in a large enterprise C#2 web application when dealing with data access and heavily repeated calculations.

Collections are great any time you have a few elements that you're going to hit multiple times.

However in lots of data access scenarios you have large numbers of elements that you don't necessarily need to pass round in a great big collection.

This is essentially what the SqlDataReader does - it's a forward only custom enumerator.

What yield lets you do is quickly and with minimal code write your own custom enumerators.

Everything yield does could be done in C#1 - it just took reams of code to do it.

Linq really maximises the value of the yield behaviour, but it certainly isn't the only application.

Keith
+5  A: 

I recently had to make a representation of mathematical expressions in the form of an Expression class. When evaluating the expression I have to traverse the tree structure with a post-order treewalk. To achieve this I implemented IEnumerable<T> like this:

public IEnumerator<Expression<T>> GetEnumerator()
{
    if (IsLeaf)
    {
        yield return this;
    }
    else
    {
        foreach (Expression<T> expr in LeftExpression)
        {
            yield return expr;
        }
        foreach (Expression<T> expr in RightExpression)
        {
            yield return expr;
        }
        yield return this;
    }
}

Then I can simply use a foreach to traverse the expression. You can also add a Property to change the traversal algorithm as needed.

Morten Christiansen
C# really needs a yieldcollection keyword to abstract out the foreach(x in collection){ yield x } loops that everyone writes 100x a day these days :-(
Orion Edwards
if you are just doing foreach(x in collection) {yield return x;} ... you can just do .Select(x=>x). if you want to do work against a set of items in a collection you can make an extension method .Foreach<T, TResult>(IEnumerable<T> col, Action<T, TResult> action)
Matthew Whited
A: 

Note that yield allows you to do things in a "lazy" way. By lazy, I mean that the evaluation of the next element in the IEnumberable is not done until the element is actually requested. This allows you the power to do a couple of different things. One is that you could yield an infinitely long list without the need to actually make infinite calculations. Second, you could return an enumeration of function applications. The functions would only be applied when you iterate through the list.

Logicalmind
A: 

Yield is very useful in general. It's in ruby among other languages that support functional style programming, so its like it's tied to linq. It's more the other way around, that linq is functional in style, so it uses yield.

I had a problem where my program was using a lot of cpu in some background tasks. What I really wanted was to still be able to write functions like normal, so that I could easily read them (i.e. the whole threading vs. event based argument). And still be able to break the functions up if they took too much cpu. Yield is perfect for this. I wrote a blog post about this and the source is available for all to grok :)

Anders Rune Jensen
+5  A: 

I do understand its usefulness in linq, but I feel that only the linq team is writing such complex queriable objects that yield is useful.

Yield was useful as soon as it got implemented in .NET 2.0, which was long before anyone ever thought of LINQ.

Why would I write this function:

IList<string> LoadStuff() {
  var ret = new List<string>();
  foreach(var x in SomeExternalResource)
    ret.Add(x);
  return ret;
}

When I can use yield, and save the effort and complexity of creating a temporary list for no good reason:

IEnumerable<string> LoadStuff() {
  foreach(var x in SomeExternalResource)
    yield return x;
}

It can also have huge performance advantages. If your code only happens to use the first 5 elements of the collection, then using yield will often avoid the effort of loading anything past that point. If you build a collection then return it, you waste a ton of time and space loading things you'll never need.

I could go on and on....

Orion Edwards
A: 

I am a huge Yield fan in C#. This is especially true in large homegrown frameworks where often methods or properties return List that is a sub-set of another IEnumerable. The benefits that I see are:

  • the return value of a method that uses yield is immutable
  • you are only iterating over the list once
  • it a late or lazy execution variable, meaning the code to return the values are not executed until needed (though this can bite you if you dont know what your doing)
  • of the source list changes, you dont have to call to get another IEnumerable, you just iterate over IEnumeable again
  • many more

One other HUGE benefit of yield is when your method potentially will return millions of values. So many that there is the potential of running out of memory just building the List before the method can even return it. With yield, the method can just create and return millions of values, and as long the caller also doesnt store every value. So its good for large scale data processing / aggregating operations

Turbo
A: 

very interesting usage of yield's is in Jeffrey Richter's Power Threading Library

Yurec
A: 

The System.Linq IEnumerable extensions are great, but sometime you want more. For example, consider the following extension:

public static class CollectionSampling
{
    public static IEnumerable<T> Sample<T>(this IEnumerable<T> coll, int max)
    {
        var rand = new Random();
        using (var enumerator = coll.GetEnumerator());
        {
            while (enumerator.MoveNext())
            {
                yield return enumerator.Current; 
                int currentSample = rand.Next(max);
                for (int i = 1; i <= currentSample; i++)
                    enumerator.MoveNext();
            }
        }
    }    
}

Another interesting advantage of yielding is that the caller cannot cast the return value to the original collection type and modify your internal collection

ohadsc