I've got a tree-like structure. Each element in this structure should be able to return an enumerable of all elements it is the root of. Let's call this method IEnumerable<Foo> GetAll(). So if we have

       A      <-- topmost root
     /   \
    B     C
   / \   / \
  D   E F   G

a call to GetAll on element C returns {C, F, G} (fixed order of elements would be nice, but is not needed). I guess everybody knew that already.

The current implementation of GetAll looks like this:

public IEnumerable<Foo> GetAll()
{
    yield return this;

    foreach (Foo foo in MyChildren) {
        foreach (Foo f in foo.GetAll()) {
            yield return f;
        }
    }
}

In an earlier implementation, I returned a List and added the child-foos using List.AddRange().

My question is whether the version using yield is correctly implemented or whether it should be improved (especially in terms of performance). Or is this just bad, and I should stick to Lists (or ReadOnlyCollections) instead?

A: 

No, that looks fine.

Have a look at my blog entry, it may be of some use :)

leppie
A: 

In my experience, using yield is a lot more efficient than creating a List. If you're using .NET 3.5, this implementation should be fine. But don't forget the

yield break;

At the end. :-)

ShdNx
Um, why would you want yield break at the end in this case?
Jon Skeet
Why do you need this at the end? I thought that the enumerator automatically finished when the Enumerable method exited ...
Bevan
Hmm, perhaps I misunderstood something regarding the use of yield. As I remember I got an error if I didn't close the method with yield break;. I'm sorry if I said something stupid! Gonna look into that matter...
ShdNx
Right, IEnumerator.MoveNext() automatically returns false once execution reaches the end of the iterator method. Adding yield break shouldn't hurt, but you're right, it's not required.
ShdNx
+3  A: 

It's certainly not ideal in terms of performance - you end up creating a lot of iterators for large trees, instead of a single iterator which knows how to traverse efficiently.

Some blog entries concerning this:

It's worth noting that F# has the equivalent of the proposed "yield foreach" with "yield!"
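
To make the cost concrete, here is a minimal sketch that counts how often "yield return" executes in the recursive version. The Foo class, the YieldCount counter and the NestedIteratorDemo helper are hypothetical names made up for this illustration, not code from the question. For a chain of n nodes (each node has exactly one child), the element at depth d is handed up through d + 1 iterators, so a full enumeration executes 1 + 2 + ... + n yields.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's Foo class.
public class Foo
{
    public static int YieldCount;   // counts every executed "yield return"
    public readonly string Name;
    public readonly List<Foo> MyChildren = new List<Foo>();

    public Foo(string name) { Name = name; }

    // Same shape as the recursive GetAll from the question.
    public IEnumerable<Foo> GetAll()
    {
        YieldCount++;
        yield return this;

        foreach (Foo child in MyChildren)
        {
            foreach (Foo f in child.GetAll())
            {
                YieldCount++;       // the element is re-yielded at this level
                yield return f;
            }
        }
    }
}

public static class NestedIteratorDemo
{
    // Builds a chain n0 -> n1 -> ... of the given length, enumerates it
    // completely, and returns how many yields were executed in total.
    public static int CountYieldsForChain(int length)
    {
        Foo.YieldCount = 0;
        var root = new Foo("n0");
        Foo current = root;
        for (int i = 1; i < length; i++)
        {
            var next = new Foo("n" + i);
            current.MyChildren.Add(next);
            current = next;
        }
        foreach (Foo f in root.GetAll()) { }   // drain the sequence
        return Foo.YieldCount;
    }

    public static void Main()
    {
        Console.WriteLine(CountYieldsForChain(4));    // 10 = 1 + 2 + 3 + 4
        Console.WriteLine(CountYieldsForChain(100));  // 5050
    }
}
```

A chain is the worst case. For a balanced tree the depth, and with it the number of re-yields per element, grows only logarithmically with the number of nodes.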

Jon Skeet
+1  A: 

A better solution might be to create a visit method that recursively traverses the tree, and use that to collect items up.

Something like this (assuming a binary tree):

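public class Node<T>
{
    // Assumed fields: child links, null when the child is absent.
    private Node<T> left, right;

    public void Visit(Action<Node<T>> action)
    {
        action(this);
        if (left != null) left.Visit(action);
        if (right != null) right.Visit(action);
    }

    public IEnumerable<Node<T>> GetAll()
    {
        var result = new List<Node<T>>();
        Visit(n => result.Add(n));
        return result;
    }
}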

Taking this approach

  • Avoids creating large numbers of nested iterators
  • Avoids creating any more lists than necessary
  • Is relatively efficient
  • Falls down if you only need part of the list regularly
Bevan
It also takes up O(n) space instead of O(1) space - so it's efficient in terms of computation but not memory.
Jon Skeet
You should use a foreach instead of just left and right. He didn't specify it's a btree.
Maghis
Yes, any node contains 0+ children. But that's not too much of a difference anyway.
mafutrct
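
Following Maghis's comment, the visit approach might look roughly like this for nodes with any number of children. This is only a sketch: Node<T>, its Value and Children members and the VisitDemo helper are assumptions made for the example, with Children playing the role of MyChildren from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical n-ary node; Children corresponds to MyChildren in the question.
public class Node<T>
{
    public readonly T Value;
    public readonly List<Node<T>> Children = new List<Node<T>>();

    public Node(T value) { Value = value; }

    // Pre-order: this node first, then every child subtree in order.
    public void Visit(Action<Node<T>> action)
    {
        action(this);
        foreach (Node<T> child in Children)
            child.Visit(action);
    }

    // Eagerly collects the whole subtree into one list (O(n) extra space).
    public IEnumerable<Node<T>> GetAll()
    {
        var result = new List<Node<T>>();
        Visit(result.Add);
        return result;
    }
}

public static class VisitDemo
{
    // Joins the values of the given nodes into a comma-separated string.
    public static string Names(IEnumerable<Node<string>> nodes)
    {
        var names = new List<string>();
        foreach (Node<string> n in nodes) names.Add(n.Value);
        return string.Join(",", names);
    }

    public static void Main()
    {
        var c = new Node<string>("C");
        c.Children.Add(new Node<string>("F"));
        c.Children.Add(new Node<string>("G"));
        Console.WriteLine(Names(c.GetAll()));   // C,F,G
    }
}
```

The trade-off from the answer and Jon Skeet's comment still applies: the whole subtree is collected up front, so a caller who only wants the first few elements pays for all of them.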
+8  A: 

You can improve performance if you unroll the recursion onto an explicit stack, so that you have only one iterator:

public IEnumerable<Foo> GetAll()
{
    // Nodes that have been discovered but not yet returned.
    Stack<Foo> fooStack = new Stack<Foo>();
    fooStack.Push(this);

    while (fooStack.Count > 0)
    {
        Foo result = fooStack.Pop();
        yield return result;
        foreach (Foo nextFoo in result.MyChildren)
            fooStack.Push(nextFoo);
    }
}
arbiter
But you have more than 1....
leppie
Nope, only one yielded iterator plus one MyChildren iterator per node, whereas the original solution has one yielded iterator per node and one MyChildren iterator per node, plus recursion.
arbiter
Thank you, I actually used this design in my code. I'm still marking Jon's answer as accepted since his link contains the same idea and he posted earlier. Hope you don't mind. o,o
mafutrct
I wonder if that should be a Queue instead of a Stack.
dangph
Having had to do similar types of implementation workarounds, I've found that `List<T>` ends up performing a lot better than `Queue<T>` and `Stack<T>`. Also, `List<T>.AddRange(...)` is a lot faster than a `foreach(...) { Queue<T>.Enqueue(...) }`.
McKAMEY
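
On the Stack versus Queue question: the choice changes the order, not the result set. With a Stack, the last child pushed is popped first, so arbiter's version as written returns siblings in reverse order (A, C, G, F, B, E, D for the tree in the question). The sketch below shows two variants under assumed names (GetAllDepthFirst, GetAllBreadthFirst, the Foo constructor and TraversalOrderDemo are hypothetical, made up for this example): pushing the children last-to-first keeps the pre-order of the recursive version, while a Queue gives a level-by-level traversal.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's Foo class.
public class Foo
{
    public readonly string Name;
    public readonly List<Foo> MyChildren = new List<Foo>();

    public Foo(string name, params Foo[] children)
    {
        Name = name;
        MyChildren.AddRange(children);
    }

    // Depth-first, pre-order. Children are pushed last-to-first so that
    // the first child ends up on top of the stack and is returned first.
    public IEnumerable<Foo> GetAllDepthFirst()
    {
        var stack = new Stack<Foo>();
        stack.Push(this);
        while (stack.Count > 0)
        {
            Foo current = stack.Pop();
            yield return current;
            for (int i = current.MyChildren.Count - 1; i >= 0; i--)
                stack.Push(current.MyChildren[i]);
        }
    }

    // Breadth-first: a queue returns nodes level by level.
    public IEnumerable<Foo> GetAllBreadthFirst()
    {
        var queue = new Queue<Foo>();
        queue.Enqueue(this);
        while (queue.Count > 0)
        {
            Foo current = queue.Dequeue();
            yield return current;
            foreach (Foo child in current.MyChildren)
                queue.Enqueue(child);
        }
    }
}

public static class TraversalOrderDemo
{
    // The example tree from the question.
    public static Foo BuildTree()
    {
        return new Foo("A",
            new Foo("B", new Foo("D"), new Foo("E")),
            new Foo("C", new Foo("F"), new Foo("G")));
    }

    // Joins the names of the given elements into a comma-separated string.
    public static string Names(IEnumerable<Foo> items)
    {
        var names = new List<string>();
        foreach (Foo f in items) names.Add(f.Name);
        return string.Join(",", names);
    }

    public static void Main()
    {
        Foo a = BuildTree();
        Console.WriteLine(Names(a.GetAllDepthFirst()));    // A,B,D,E,C,F,G
        Console.WriteLine(Names(a.GetAllBreadthFirst()));  // A,B,C,D,E,F,G
    }
}
```

Both variants stay lazy and use a single iterator. The reverse-order push assumes MyChildren supports indexing, which may not hold for the real class.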