I'd like to create a copy of an IEnumerator<T> so that I can restart the enumeration process from a particular location in the collection. Clearly, there is no benefit to doing so for collections that implement IList, since we can remember the index of interest.

Is there a clever way to accomplish this task using a combination of yield statements and LINQ functions? I could not find a suitable Clone() method to copy the enumerator, and would like to avoid using Enumerable.Skip() to reposition a new enumerator to the desired resumption point.

Also, I'd like to keep the solutions as generic as possible, and not have to depend on state from any concrete collections.

+2  A: 

The best you could do is write something that keeps a buffer (perhaps a Queue<T>) of the data consumed from one and not the other (which would get messy/expensive if you advanced one iterator by 1M positions, but left the other alone). I really think you would be better off rethinking the design, though, and just using GetEnumerator() (i.e. another foreach) to start again - or buffer the data (if short) in a list/array/whatever.

Nothing elegant built in.
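
A minimal sketch of the buffering idea, assuming both readers go through a hypothetical `EnumeratorTee.Split` helper (not a framework method) and that C# 7 tuples are available. Whichever reader is ahead pulls from the source and queues the item for the other one, so the queue grows with the gap between the two readers:

```csharp
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class EnumeratorTee
{
    // Split one enumerator into two that can advance independently.
    public static (IEnumerator<T> Left, IEnumerator<T> Right) Split<T>(IEnumerator<T> source)
    {
        var leftPending = new Queue<T>();
        var rightPending = new Queue<T>();
        return (Read(source, leftPending, rightPending),
                Read(source, rightPending, leftPending));
    }

    private static IEnumerator<T> Read<T>(IEnumerator<T> source, Queue<T> mine, Queue<T> theirs)
    {
        while (true)
        {
            if (mine.Count > 0)
            {
                // The other reader already consumed this item; replay it.
                yield return mine.Dequeue();
            }
            else if (source.MoveNext())
            {
                // We are the reader that is ahead: buffer for the other side.
                theirs.Enqueue(source.Current);
                yield return source.Current;
            }
            else
            {
                yield break;
            }
        }
    }
}
```

This is not thread-safe, and it assumes the source keeps returning false from MoveNext() once it is exhausted.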


Update: perhaps an interesting alternative design here is "PushLINQ"; rather than clone the iterator, it allows multiple "things" to consume the same data-feed at the same time.

In this example (lifted from Jon's page) we calculate multiple aggregates in parallel:

// Create the data source to watch
DataProducer<Voter> voters = new DataProducer<Voter>();

// Add the aggregators
IFuture<int> total = voters.Count();
IFuture<int> adults = voters.Count(voter => voter.Age >= 18);
IFuture<int> children = voters.Where(voter => voter.Age < 18).Count();
IFuture<int> youngest = voters.Min(voter => voter.Age);
IFuture<int> oldest = voters.Select(voter => voter.Age).Max();

// Push all the data through
voters.ProduceAndEnd(Voter.AllVoters());

// Write out the results
Console.WriteLine("Total voters: {0}", total.Value);
Console.WriteLine("Adult voters: {0}", adults.Value);
Console.WriteLine("Child voters: {0}", children.Value);
Console.WriteLine("Youngest vote age: {0}", youngest.Value);
Console.WriteLine("Oldest voter age: {0}", oldest.Value);
Marc Gravell
@Marc, do you ever sleep?
Stan R.
Nope, he covers Jon Skeet's account when he's asleep. Jon Skeet doesn't sleep, so he (Skeet) covers Marc's account when he's supposed to be asleep. ;)
RCIX
pah! Sleep is for the weak! Next you'll be suggesting a lunch break? Madness I tell you, madness!
Marc Gravell
A: 

So what you really want is to be able to resume an iteration later, am I correct? And cloning the enumerator or collection is how you think you'd do such a thing?

You could make a class which wraps an IEnumerable, and exposes a custom enumerator which, internally, clones the inner IEnumerable, and then enumerates over that. Then, using GetEnumerator() would give you an enumerator which could be passed around.

This would create an extra copy of the IEnumerable for each Enumerator "in flight," but I think it would meet your needs.

kyoryu
I'd like to resume iteration later, after the original enumerator has run its course. Cloning the collection to an `IList<T>` makes the solution easy, as I can write my own iterator class that tracks indexes, but I'd like to avoid this if possible. Could you give a short pseudo-code example for your solution? I'm not clear on how you intend to clone the inner `IEnumerable` instance.
Steve Guidi
Ah, so you want to do an iteration from one point, and then resume that iteration back at some previous point. I'd just make a new collection then, as any solution to this problem is going to reduce to that anyway.
kyoryu
+1  A: 

Do you want to be able to save the state, continue the enumeration, then return to the saved state, or do you want to simply be able to enumerate, do some other stuff, then continue the enumeration?

If it's the latter, something like the following might work:

using System;
using System.Collections;
using System.Collections.Generic;

public class SaveableEnumerable<T> : IEnumerable<T>, IDisposable
{
    public class SaveableEnumerator : IEnumerator<T>
    {
        private IEnumerator<T> enumerator;

        internal SaveableEnumerator(IEnumerator<T> enumerator)
        {
            this.enumerator = enumerator;
        }

        public void Dispose() { }

        internal void ActuallyDispose()
        {
            enumerator.Dispose();
        }

        public bool MoveNext()
        {
            return enumerator.MoveNext();
        }

        public void Reset()
        {
            enumerator.Reset();
        }

        public T Current
        {
            get { return enumerator.Current; }
        }

        object IEnumerator.Current
        {
            get { return enumerator.Current; }
        }
    }

    private SaveableEnumerator enumerator;

    public SaveableEnumerable(IEnumerable<T> enumerable)
    {
        this.enumerator = new SaveableEnumerator(enumerable.GetEnumerator());
    }

    public IEnumerator<T> GetEnumerator()
    {
        return enumerator;
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return enumerator;
    }

    public void Dispose()
    {
        enumerator.ActuallyDispose();
    }
}

Now you can do:

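// The variable must be typed as SaveableEnumerable<int> (via var):
// IEnumerable<int> is not IDisposable, so it cannot be used in a using statement.
using (var counter = new SaveableEnumerable<int>(CountableEnumerable()))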
{
    foreach (int i in counter)
    {
        Console.WriteLine(i);
        if (i > 10)
        {
            break;
        }
    }
    DoSomeStuff();
    foreach (int i in counter)
    {
        Console.WriteLine(i);
        if (i > 20)
        {
            break;
        }
    }
}
ICR
Thanks for the idea, but I'm looking for a solution to the former of your examples.
Steve Guidi
+1  A: 

This is not really an answer, but I found the thought experiment interesting. If you've got a yield-based IEnumerable, I suppose you know it's all compiler-generated magic. If you have such a beast, you could do something like this... ;)

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Reflection;

class Program
{
    static void Main(string[] args)
    {
        var bar = new Program().Foo();

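        // Get a hook to the underlying compiler-generated class. This relies on an
        // undocumented implementation detail: a constructor taking the initial state
        // as an int, where -2 means "enumerable created, enumeration not yet started".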
        var barType = bar.GetType().UnderlyingSystemType;
        var barCtor = barType.GetConstructor(new Type[] {typeof (Int32)});
        var res = barCtor.Invoke(new object[] {-2}) as IEnumerable<int>;

        // Get our enumerator
        var resEnum = res.GetEnumerator();
        resEnum.MoveNext();
        resEnum.MoveNext();
        Debug.Assert(resEnum.Current == 1);

        // Extract and save our state
        var nonPublicMap = new Dictionary<FieldInfo, object>();
        var publicMap = new Dictionary<FieldInfo, object>();
        var nonpublicfields = resEnum.GetType().GetFields(BindingFlags.NonPublic | BindingFlags.Instance);
        var publicfields = resEnum.GetType().GetFields(BindingFlags.Public | BindingFlags.Instance);
        foreach(var field in nonpublicfields)
        {
            var value = field.GetValue(resEnum);
            nonPublicMap[field] = value;
        }
        foreach (var field in publicfields)
        {
            var value = field.GetValue(resEnum);
            publicMap[field] = value;                
        }

        // Move about
        resEnum.MoveNext();
        resEnum.MoveNext();
        resEnum.MoveNext();
        resEnum.MoveNext();
        Debug.Assert(resEnum.Current == 5);

        // Restore state            
        foreach (var kvp in nonPublicMap)
        {
            kvp.Key.SetValue(resEnum, kvp.Value);
        }
        foreach (var kvp in publicMap)
        {
            kvp.Key.SetValue(resEnum, kvp.Value);                
        }

        // Move about
        resEnum.MoveNext();
        resEnum.MoveNext();
        Debug.Assert(resEnum.Current == 3);
    }

    public IEnumerable<int> Foo()
    {
        for (int i = 0; i < 10; i++)
        {
            yield return i;
        }
        yield break;
    }

}
JerKimball
This is quite clever; I didn't consider using reflection to clone the object. The key to this working, though, appears to be ensuring that the reflection code that "saves" state makes a copy of the field values, as opposed to holding a reference.
Steve Guidi
Yeah, the "state variables" within the generated enumeration class are the things one would need to bookmark in order to do a continuation-like move like this; of course, since all of the state and value bits and bobs are name-mangled, the only "safe" (and I use the term loosely) way is to copy all of the internal state fields. To be precise, though, you wouldn't be copying the whole enumeration here, probably 3-4 fields on average.
JerKimball
(of course, one could try to implement a proper continuation monad in C#, but that's a whole 'nother animal) ;)
JerKimball
Your answer inspired me to make a full clone tool. You can save the spot and resume later. Since the title said clone, I created a new instance.
BenMaddox
A: 

JerKimball had an interesting approach, and I tried to take it to the next level. This uses reflection to create a new instance and then sets the field values on the new instance. I also found this chapter from C# in Depth very useful: "Iterator block implementation details: auto-generated state machines".

static void Main()
{
    var counter = new CountingClass();
    var firstIterator = counter.CountingEnumerator();
    Console.WriteLine("First list");
    firstIterator.MoveNext();
    Console.WriteLine(firstIterator.Current);

    Console.WriteLine("First list cloned");
    var secondIterator = EnumeratorCloner.Clone(firstIterator);

    Console.WriteLine("Second list");
    secondIterator.MoveNext();
    Console.WriteLine(secondIterator.Current);
    secondIterator.MoveNext();
    Console.WriteLine(secondIterator.Current);
    secondIterator.MoveNext();
    Console.WriteLine(secondIterator.Current);

    Console.WriteLine("First list");
    firstIterator.MoveNext();
    Console.WriteLine(firstIterator.Current);
    firstIterator.MoveNext();
    Console.WriteLine(firstIterator.Current);
}

public class CountingClass
{
    public IEnumerator<int> CountingEnumerator()
    {
        int i = 1;
        while (true)
        {
            yield return i;
            i++;
        }
    }
}

public static class EnumeratorCloner
{
    public static T Clone<T>(T source) where T : class, IEnumerator
    {
        var sourceType = source.GetType().UnderlyingSystemType;
        var sourceTypeConstructor = sourceType.GetConstructor(new Type[] { typeof(Int32) });
        var newInstance = sourceTypeConstructor.Invoke(new object[] { -2 }) as T;

        var nonPublicFields = source.GetType().GetFields(BindingFlags.NonPublic | BindingFlags.Instance);
        var publicFields = source.GetType().GetFields(BindingFlags.Public | BindingFlags.Instance);
        foreach (var field in nonPublicFields)
        {
            var value = field.GetValue(source);
            field.SetValue(newInstance, value);
        }
        foreach (var field in publicFields)
        {
            var value = field.GetValue(source);
            field.SetValue(newInstance, value);
        }
        return newInstance;
    }
}
BenMaddox
A: 

There's no general way to do this, since an IEnumerable may depend upon arbitrary aspects of system state which cannot be detected via Reflection or any other means. For example, a PaperTapeReader class might implement an enumerator which reads characters from the tape until the sensor indicates there's no more tape in the machine. The state of such an enumerator would be the physical location of the tape, which might be impossible to restore programmatically.
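
A small sketch of the same problem in software, using a `StringReader` as a stand-in for the tape (the `ExternalStateDemo` class and its method names are made up for illustration). The iterator's only interesting field is a reference to the reader, so any field-by-field copy would share the reader's position rather than duplicate it:

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical demo class, for illustration only.
public static class ExternalStateDemo
{
    // The real position lives inside the reader, not in the iterator's fields.
    public static IEnumerator<char> ReadChars(TextReader reader)
    {
        int c;
        while ((c = reader.Read()) != -1)
        {
            yield return (char)c;
        }
    }

    public static string Run()
    {
        var reader = new StringReader("abcdef");
        IEnumerator<char> first = ReadChars(reader);
        // Stand-in for a shallow "clone": a second enumerator holding
        // the same reader reference.
        IEnumerator<char> second = ReadChars(reader);

        var result = "";
        first.MoveNext();  result += first.Current;   // reads 'a'
        second.MoveNext(); result += second.Current;  // reads 'b', not 'a'
        first.MoveNext();  result += first.Current;   // reads 'c'
        return result;
    }
}
```

Here `Run()` returns "abc": the "clone" does not replay from where it was created, it just steals the next character from the shared reader.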

Given an IEnumerable, it would be possible to produce two or more IEnumerables, each of which would act like either the original or a clone thereof. MoveNext requests for the one that was 'furthest along' would read new data from the original IEnumerable and buffer it for the others. Unless the original IEnumerable supports such 'hook' functionality, however, I don't think there'd be any way to latch onto its data as it comes in.
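
If every reader can be made to go through a wrapper from the start, something like the following sketch might do. `ReplayBuffer<T>` is a hypothetical name, and for simplicity it keeps everything it has read in a `List<T>` instead of discarding items once all readers have passed them, so memory use grows with the length of the sequence:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public sealed class ReplayBuffer<T> : IDisposable
{
    private readonly IEnumerator<T> source;
    private readonly List<T> seen = new List<T>();
    private bool exhausted;

    public ReplayBuffer(IEnumerable<T> items)
    {
        source = items.GetEnumerator();
    }

    // Enumerate from a zero-based position. Items already buffered are
    // replayed; the source is advanced only when a reader runs past the buffer.
    public IEnumerable<T> From(int position)
    {
        for (int i = position; TryFill(i); i++)
        {
            yield return seen[i];
        }
    }

    // Make sure seen[index] exists, pulling from the source as needed.
    private bool TryFill(int index)
    {
        while (seen.Count <= index && !exhausted)
        {
            if (source.MoveNext()) seen.Add(source.Current);
            else exhausted = true;
        }
        return index < seen.Count;
    }

    public void Dispose()
    {
        source.Dispose();
    }
}
```

A reader that stopped at position n can later resume with `buffer.From(n)`; the sketch does not validate negative positions and is not thread-safe.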

supercat