I'm still trying to find where I would use the "yield" keyword in a real situation.

I see this thread on the subject

http://stackoverflow.com/questions/39476/what-is-the-yield-keyword-used-for-in-c

but the accepted answer uses this example, where someone iterates over Integers():

public IEnumerable<int> Integers()
{
    yield return 1;
    yield return 2;
    yield return 4;
    yield return 8;
    yield return 16;
    yield return 16777216;
}

But why not just use List<int> here instead? It seems more straightforward.

A: 

You might want to iterate through various collections:

public IEnumerable<ICustomer> Customers()
{
    foreach (ICustomer customer in m_maleCustomers)
    {
        yield return customer;
    }

    foreach (ICustomer customer in m_femaleCustomers)
    {
        yield return customer;
    }

    // or add some constraints...
    foreach (ICustomer customer in m_customers)
    {
        if (customer.Age < 16)
        {
            yield return customer;
        }
    }

    // Or... only on the first day of the month
    if (DateTime.Today.Day == 1)
    {
        yield return m_superCustomer;
    }
}
Trap
If you're interested (and unaware of LINQ), you can write that whole thing as: return m_maleCustomers.Concat(m_femaleCustomers).Concat(m_customers.Where(c => c.Age < 16)).Concat(Enumerable.Repeat(m_superCustomer, 1).Where(c => DateTime.Today.Day == 1));
Daniel Earwicker
+15  A: 

If you build and return a List (say it has 1 million elements), that's a big chunk of memory, and also of work to create it.

Sometimes the caller may only want to know what the first element is. Or they might want to write them to a file as they get them, rather than building the whole list in memory and then writing it to a file.

That's why it makes more sense to use yield return. It doesn't look that different to building the whole list and returning it, but it's very different because the whole list doesn't have to be created in memory before the caller can look at the first item on it.

When the caller says:

foreach (int i in Integers())
{
   // do something with i
}

Each time the loop requires a new i, it runs a bit more of the code in Integers(). The code in that function is "paused" when it hits a yield return statement.
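
To make the "pausing" visible, here is a minimal sketch. The class name LazyDemo and the Log list are made up for illustration; they only record the order in which the producer and the consumer run.

```csharp
using System;
using System.Collections.Generic;

public static class LazyDemo
{
    // Hypothetical log, used only to record the order of events.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<int> Integers()
    {
        Log.Add("producing 1");
        yield return 1;
        Log.Add("producing 2");
        yield return 2;
        Log.Add("producing 4");
        yield return 4;
    }

    public static void Main()
    {
        foreach (int i in Integers())
        {
            Log.Add("consumed " + i);
            if (i == 2) break; // stop early: "producing 4" never runs
        }

        // Entries alternate: producing 1, consumed 1, producing 2, consumed 2
        foreach (string entry in Log) Console.WriteLine(entry);
    }
}
```

Because the caller breaks out after the second item, the rest of Integers() is never executed. A method that built and returned a List<int> would have done all of its work up front.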

Daniel Earwicker
I'd like to have comments from the two people who downvoted this - just curious to know what suggestions they have.
Daniel Earwicker
I was having trouble understanding yield, but your answer was nice! I think using yield instead of a list is more or less like the difference between using a DataReader and a DataSet. With a DataSet we get all the data first and then work on it; with a DataReader you can work with the data while it is arriving from the source. :-)
Click Ok
+3  A: 

You can use yield to build any iterator. That could be a lazily evaluated series (reading lines from a file or database, for example, without reading everything at once, which could be too much to hold in memory), or could be iterating over existing data such as a List<T>.

C# in Depth has a free chapter (6) all about iterator blocks.

I also blogged very recently about using yield for smart brute-force algorithms.

For an example of the lazy file reader:

    static IEnumerable<string> ReadLines(string path) {
        using (StreamReader reader = File.OpenText(path)) {
            string line;
            while ((line = reader.ReadLine()) != null) {
                yield return line;
            }
        }
    }

This is entirely "lazy"; nothing is read until you start enumerating, and only a single line is ever held in memory.

Note that LINQ-to-Objects makes extensive use of iterator blocks (yield). For example, the Where extension is essentially:

    static IEnumerable<T> Where<T>(this IEnumerable<T> data, Func<T, bool> predicate) {
        foreach (T item in data) {
            if (predicate(item)) yield return item;
        }
    }

And again, fully lazy - allowing you to chain together multiple operations without forcing everything to be loaded into memory.
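
One visible consequence of this laziness is deferred execution: a chained query does no work until it is enumerated. The sketch below uses the real LINQ Where and Select operators; the class name DeferredDemo and the sample numbers are only illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static List<int> Run()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Nothing is filtered or projected yet; this only describes the pipeline.
        IEnumerable<int> query = numbers.Where(n => n > 1).Select(n => n * 10);

        // Added after the query was built, but before it is enumerated.
        numbers.Add(4);

        // Enumeration happens here, so the late addition is seen as well.
        return query.ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // 20, 30, 40
    }
}
```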

Marc Gravell
+5  A: 

Yield allows you to build methods that produce data without having to gather everything up before returning. Think of it as returning multiple values along the way.

Here are a couple of methods that illustrate the point:

public IEnumerable<String> LinesFromFile(String fileName)
{
    using (StreamReader reader = new StreamReader(fileName))
    {
        String line;
        while ((line = reader.ReadLine()) != null)
            yield return line;
    }
}

public IEnumerable<String> LinesWithEmails(IEnumerable<String> lines)
{
    foreach (String line in lines)
    {
        if (line.Contains("@"))
            yield return line;
    }
}

Neither of these two methods will read the whole contents of the file into memory, yet you can use them like this:

foreach (String lineWithEmail in LinesWithEmails(LinesFromFile("test.txt")))
    Console.Out.WriteLine(lineWithEmail);
Lasse V. Karlsen
+3  A: 

yield allows you to process collections that are potentially infinite in size, because the entire collection is never loaded into memory in one go, unlike a List-based approach. For instance, an IEnumerable<> of all the prime numbers could be backed by an appropriate algorithm for finding primes, whereas a List approach would always be finite in size and therefore incomplete. In this example, using yield also allows the work for the next element to be deferred until it is required.
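
As a rough sketch of that idea, the iterator below never terminates on its own; the caller decides how many primes to pull. Trial division is just an assumption chosen for brevity, not a recommended algorithm, and the names PrimesDemo and Primes are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrimesDemo
{
    // Open-ended sequence of primes using simple trial division.
    // (In practice it is bounded by the range of int and by memory.)
    public static IEnumerable<int> Primes()
    {
        var found = new List<int>();
        for (int candidate = 2; ; candidate++)
        {
            bool isPrime = true;
            foreach (int p in found)
            {
                if ((long)p * p > candidate) break; // no divisor up to sqrt(candidate)
                if (candidate % p == 0) { isPrime = false; break; }
            }

            if (isPrime)
            {
                found.Add(candidate);
                yield return candidate; // pause here until the caller asks for more
            }
        }
    }

    public static void Main()
    {
        // Take(10) stops the enumeration after ten items.
        Console.WriteLine(string.Join(", ", Primes().Take(10)));
        // 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
    }
}
```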

spender
+1  A: 

A real situation for me is when I want to process, more smoothly, a collection that takes a while to populate.

Imagine something along these lines (pseudocode):

public IEnumerable<VerboseUserInfo> GetAllUsers()
{
    foreach (var userId in userLookupList)
    {
        VerboseUserInfo info = new VerboseUserInfo();

        info.Load(ActiveDirectory.GetLotsOfUserData(userId));
        info.Load(WebService.GetSomeMoreInfo(userId));

        yield return info;
    }
}

Instead of having to wait a minute for the collection to populate before I can start processing items in it, I will be able to start immediately, and then report back to the user interface as it happens.

A: 

I agree with everything everyone has said here about lazy evaluation and memory usage and wanted to add another scenario where I have found the iterators using the yield keyword useful. I have run into some cases where I have to do a sequence of potentially expensive processing on some data where it is extremely useful to use iterators. Rather than processing the entire file immediately, or rolling my own processing pipeline, I can simply use iterators something like this:

IEnumerable<double> GetListFromFile(int idxItem)
{
    // read data from file
    return dataReadFromFile;
}

IEnumerable<double> ConvertUnits(IEnumerable<double> items)
{
    foreach(double item in items)
        yield return convertUnits(item);
}

IEnumerable<double> DoExpensiveProcessing(IEnumerable<double> items)
{
    foreach(double item in items)
        yield return expensiveProcessing(item);
}

IEnumerable<double> GetNextList()
{
    return DoExpensiveProcessing(ConvertUnits(GetListFromFile(curIdx++)));
}

The advantage here is that by keeping the input and output of all the functions as IEnumerable<double>, my processing pipeline is completely composable, easy to read, and lazily evaluated, so I only do the processing I really need to do. This lets me put almost all of my processing in the GUI thread without impacting responsiveness, so I don't have to worry about any threading issues.

Jon Norton
A: 

You may not always want to use yield instead of returning a list, and in your example yield is really just used to return a fixed list of integers. Depending on whether you want a mutable list or an immutable sequence, you could use a list or an iterator (or some other mutable/immutable collection).

But there are benefits to using yield.

  • Yield provides an easy way to build lazily evaluated iterators. (That is, when MoveNext() is called, only the code needed to produce the next element of the sequence runs; the iterator then returns and does no more computation until MoveNext() is called again.)

  • Yield builds a state machine under the covers, which saves you a lot of work because you don't have to code the states of your generator yourself => more concise/simple code.

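  • Yield makes the compiler generate the iterator class for you, sparing you the details of how to build it. (Each enumeration gets its own state, but a single enumerator should not be advanced from several threads at once without synchronization.)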

  • Yield is much more powerful than it seems at first sight and can be used for much more than just building simple iterators. Check out this video to see Jeffrey Richter and his AsyncEnumerator, and how yield is used to make coding with the async pattern easy.
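
To give a feel for the state machine mentioned above, here is a hand-written iterator that behaves like a method containing yield return 1; yield return 2; yield return 4;. This is only an approximation of what the compiler produces; the real generated class differs in detail, and the names ManualIntegers and StateMachineDemo are made up.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written stand-in for: yield return 1; yield return 2; yield return 4;
public sealed class ManualIntegers : IEnumerable<int>
{
    public IEnumerator<int> GetEnumerator() { return new Enumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    private sealed class Enumerator : IEnumerator<int>
    {
        private int _state;   // which "yield return" we are paused at
        private int _current; // the value handed to the caller

        public int Current { get { return _current; } }
        object IEnumerator.Current { get { return _current; } }

        public bool MoveNext()
        {
            switch (_state)
            {
                case 0: _current = 1; _state = 1; return true;
                case 1: _current = 2; _state = 2; return true;
                case 2: _current = 4; _state = 3; return true;
                default: return false; // sequence finished
            }
        }

        public void Reset() { throw new NotSupportedException(); }
        public void Dispose() { }
    }
}

public static class StateMachineDemo
{
    public static void Main()
    {
        foreach (int i in new ManualIntegers())
            Console.WriteLine(i); // 1, 2, 4 on separate lines
    }
}
```

Roughly forty lines of bookkeeping replace three yield return statements, which is the work the keyword saves.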

Pop Catalin
A: 

I came up with this to work around a .NET shortcoming: having to deep-copy a List manually.

I use this:

static public IEnumerable<SpotPlacement> CloneList(List<SpotPlacement> spotPlacements)
{
    foreach (SpotPlacement sp in spotPlacements)
    {
        yield return (SpotPlacement)sp.Clone();
    }
}

And at another place:

public object Clone()
{
    OrderItem newOrderItem = new OrderItem();
    ...
    newOrderItem._exactPlacements.AddRange(SpotPlacement.CloneList(_exactPlacements));
    ...
    return newOrderItem;
}

I tried to come up with a one-liner that does this, but it's not possible, because yield does not work inside anonymous method blocks.

EDIT:

Better still, use generic List cloner:

class Utility<T> where T : ICloneable
{
    static public IEnumerable<T> CloneList(List<T> tl)
    {
        foreach (T t in tl)
        {
            yield return (T)t.Clone();
        }
    }
}
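
If LINQ is available (an assumption; it needs .NET 3.5 or later), a similar lazy clone can be sketched without yield by letting Select do the iteration. The Spot class and the CloneAll extension method below are hypothetical stand-ins, not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for SpotPlacement.
public class Spot : ICloneable
{
    public int Position;
    public object Clone() { return new Spot { Position = Position }; }
}

public static class CloneDemo
{
    // Lazily clones each element; Select plays the role of the yield loop.
    public static IEnumerable<T> CloneAll<T>(this IEnumerable<T> source) where T : ICloneable
    {
        return source.Select(item => (T)item.Clone());
    }

    public static void Main()
    {
        var original = new List<Spot> { new Spot { Position = 1 }, new Spot { Position = 2 } };
        List<Spot> copy = original.CloneAll().ToList();

        copy[0].Position = 99;                    // mutate the copy only
        Console.WriteLine(original[0].Position);  // 1
    }
}
```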
Daniel Mošmondor
A: 

The method used by yield of saving memory by processing items on-the-fly is nice, but really it's just syntactic sugar. It's been around for a long time. In any language that has function or interface pointers (even C and assembly) you can get the same effect using a callback function / interface.

This fancy stuff:

static IEnumerable<string> GetItems()
{
    yield return "apple";
    yield return "orange";
    yield return "pear";
}

foreach(string item in GetItems())
{
    Console.WriteLine(item);
}

is basically equivalent to old-fashioned:

interface ItemProcessor
{
    void ProcessItem(string s);
};

class MyItemProcessor : ItemProcessor
{
    public void ProcessItem(string s)
    {
        Console.WriteLine(s);
    }
};

static void ProcessItems(ItemProcessor processor)
{
    processor.ProcessItem("apple");
    processor.ProcessItem("orange");
    processor.ProcessItem("pear");
}

ProcessItems(new MyItemProcessor());
Matthew