views:

218

answers:

4

I have been thinking about the IEnumerator.Reset() method. I read in the MSDN documentation that it only there for COM interop. As a C++ programmer it looks to me like a IEnumerator which supports Reset is what I would call a forward iterator, while an IEnumerator which does not support Reset is really an input iterator.

So part one of my question is, is this understanding correct?

The second part of my question is, would it be of any benefit in C# if there was a distinction made between input iterators and forward iterators (or "enumerators" if you prefer)? Would it not help eliminate some confusion among programmers, like the one found in this SO question about cloning iterators?

EDIT: Clarification on forward and input iterators. An input iterator only guarantees that you can enumerate the members of a collection (or from a generator function or an input stream) only once. This is exactly how IEnumerator works in C#. Whether or not you can enumerate a second time, is determined by whether or not Reset is supported. A forward iterator, does not have this restriction. You can enumerate over the members as often as you want.

Some C# programmers don't underestand why an IEnumerator cannot be reliably used in a multipass algorithm. Consider the following case:

void PrintContents(IEnumerator<int> xs)
{
  while (iter.MoveNext())
    Console.WriteLine(iter.Current); 
  iter.Reset();
  while (iter.MoveNext())
    Console.WriteLine(iter.Current); 
}

If we call PrintContents in this context, no problem:

List<int> ys = new List<int>() { 1, 2, 3 }
PrintContents(ys.GetEnumerator());

However look at the following:

IEnumerable<int> GenerateInts() {   
  System.Random rnd = new System.Random();
  for (int i=0; i < 10; ++i)
    yield return Rnd.Next();
}

PrintContents(GenerateInts());

If the IEnumerator supported Reset, in other words supported multi-pass algorithms, then each time you iterated over the collection it would be different. This would be undesirable, because it would be surprising behavior. This example is a bit faked, but it does occur in the real world (e.g. reading from file streams).

+2  A: 

Reset was a big mistake. I call shenanigans on Reset. In my opinion, the correct way to reflect the distinction you are making between "forward iterators" and "input iterators" in the .NET type system is with the distinction between IEnumerable<T> and IEnumerator<T>.

See also this answer, where Microsoft's Eric Lippert (in an unofficial capactiy, no doubt, my point is only that he's someone with more credentials than I have to make the claim that this was a design mistake) makes a similar point in comments. Also see also his awesome blog.

Doug McClean
+1 for the links. However, I disagree with your analogy. IEnumerable is more analagous to a C++ container http://www.sgi.com/tech/stl/Container.html
cdiggins
A: 

I don't think so. I would call IEnumerable a forward iterator, and an input iterator. It does not allow you to go backwards, or modify the underlying collection. With the addition of the foreach keyword, iterators are almost a non-thought most of the time.

Opinion: The difference between input iterators (get each one) vs. output iterators (do something to each one) is too trivial to justify an addition to the framework. Also, in order to do an output iterator, you would need to pass a delegate to the iterator. The input iterator seems more natural to C# programmers.

There's also IList<T> if the programmer wants random access.

Matt Brunell
In C++ a forward iterator supports multi-pass algorithms, this is not something that you can reliably do with an IEnumerable/IEnumerator. For example: if the IEnumerable was generated from a yield function.
cdiggins
I don't understand why IEnumerable doesn't support multi-pass algorithms. A custom iterator (using yield return) creates an instance of IEnumerator. You can have multiple IEnumerator objects iterating over the same collection at once.
Matt Brunell
But given just an IEnumerator, you can't do more than one pass over the collection. You need to have access to the underlying collection to perform multipass iteration, which is a bit of a shortcoming. (Since in C++, iterators are designed to cover all your iterating needs. In C#, some fairly basic algorithms (anything requiring more than one pass) have to break the abstraction and require access to the underlying container)
jalf
Also just fyi (this should probably have been explained in the question, since it's going to be read by a lot of non-C++ programmers): In C++ parlance, an *input iterator* is an iterator which supports single-pass traversal, and provides a immutable values. But it's really the single-pass aspect that's important. You can basically consider it a stream wrapper. It doesn't support the notion of "copying the iterators state", so multipass is impossible. A forward iterator can be copied, so it can perform multiple passes.
jalf
+2  A: 

Interesting question. My take is that of course C# would benefit. However, it wouldn't be easy to add.

The distinction exists in C++ because of its much more flexible type system. In C#, you don't have a robust generic way to clone objects, which is necessary to represent forward iterators (to support multi-pass iteration). And of course, for this to be really useful, you'd also need to support bidirectional and random-access iterators/enumerators. And to get them all working smoothly, you really need some form of duck-typing, like C++ templates have.

Ultimately, the scopes of the two concepts are different.

In C++, iterators are supposed to represent everything you need to know about a range of values. Given a pair of iterators, I don't need the original container. I can sort, I can search, I can manipulate and copy elements as much as I like. The original container is out of the picture.

In C#, enumerators are not meant to do quite as much. Ultimately, they're just designed to let you run through the sequence in a linear manner.

As for Reset(), it is widely accepted that it was a mistake to add it in the first place. If it had worked, and been implemented correctly, then yes, you could say your enumerator was analogous to forward iterators, but in general, it's best to ignore it as a mistake. And then all enumerators are similar only to input iterators.

Unfortunately.

jalf
After one cup of coffee, I can't see why cloning is necessary for a multi-pass iterator (ignoring the full requirements of a forward iterator) for now.
cdiggins
How else would you do multiple passes on a forward iterator? It's not bidirectional, so you can't go back to where you were, and iterate again. You have to create a copy of the iterator, so you have two iterators pointing to the same location in the sequence, and then they can be incremented individually.
jalf
If you're thinking that the `Reset()` method could replace cloning, yes, sort of. Except reset only brings you back to one fixed point in the sequence (the beginning). But a forward iterator should be able to do multiple passes on a subset of the data (say, the last three elements only). Reset doesn't really help there. (or at least, it'd be ridiculously slow if you had to do a full reset, and traverse the entire sequence just to get back to where you wanted to perform multi-pass traversal)
jalf
+1  A: 

Coming from the C# perspective:

You almost never use IEnumerator directly. Usually you do a foreach statement, which expects a IEnumerable.

IEnumerable _myCollection;
...
foreach (var item in _myCollection) { /* Do something */ }

You don't pass around IEnumerator either. If you want to pass an collection which needs iteration, you pass IEnumerable. Since IEnumerable has a single function, which returns an IEnumerator, it can be used to iterate the collection multiple times (multiple passes).

There's no need for a Reset() function on IEnumerator because if you want to start over, you just throw away the old one (garbage collected) and get a new one.

Matt Brunell