views:

283

answers:

4

I found similar topic:
http://stackoverflow.com/questions/56347/iterators-in-c-stl-vs-java-is-there-a-conceptual-difference

Which basically deals with Java iterator (which is similar to C#) being unable to going backward.

So here I would like to focus on limits -- in C++ iterator does not know its limit, you have by yourself compare the given iterator with the limit. In C# iterator knows more -- you can tell without comparing to any external reference, if the iterator is valid or not.

I prefer C++ way, because once having iterator you can set any iterator as a limit. In other words if you would like to get only few elements instead of entire collection, you don't have to alter the iterator (in C++). For me it is more "pure" (clear).

But of course MS knew this and C++ while designing C#. So what are the advantages of C# way? Which approach is more powerful (which leads to more elegant functions based on iterators). What do I miss?

If you have thoughts on C# vs. C++ iterators design other than their limits (boundaries), please also answer.

Note: (just in case) please, keep the discussion strictly technical. No C++/C# flamewar.

EDITS

As Tzaman put it "There's no benefit to having the limit be expressed separately, because there's no way to get there except walking one element at a time." However it is not that difficult to built a C# iterator which does several steps a time, thus the question -- is there a benefit of having explicit limit iterator (as in C++)? If yes -- what?

@Jon,

Edit1: let's say you have a function foo which does something on iterators (the example is very naive!)

void foo(iter_from,iter_end) // C++ style
void foo(iter) // C# style 

And now you would like to call function bar on all elements except last 10.

bar(iter_from,iter_end-10); // C++ style

In C# (if I am not mistaken) you would have to provide extra method for this iterator to change its limit, something like this:

bar(iter.ChangeTheLimit(-10));

Edit2: After rereading of your post, I sense crucial difference. In C++ you work on iterators of collection, in C# you work on collections (iterator is used "internally"). If yes, I still feel a bit uncomfortable with C# -- you iterate the collection and when you find interesting element you would like to pass all elements from "here" to end. In C++ it is very easy and there is no overhead. In C# you either pass an iterator or a collection (if the latter there will be extra computing). I'll wait for your comment :-)

@Hans,

I am not comparing apples and oranges. The comp. theory is common ground here, so you have sort algorithm, partition, etc. You have the notion of collection (or sequence, as Jon likes). Now -- the matter is how you design the access to the elements to have elegant algorithm written in C# or C++ (or any other language). I would like to understand the reasoning "we did this, because...".

I know .NET iterators and collections are separate classes. I know the difference between access to an element and entire collection. However in C++ the most general way to work on a collection is working with iterators -- this way you can work with list and vector despite those collections are completely different. In C# on the other hand you rather write

sort(IEnumerable<T> coll) 

function instead

sort(IEnumerator<T> iter)

correct? So in this sense I guess you cannot take C# iterators as C++ iterators, because C# does not express the same algorithms the same way as C++ does. Or as Jon pointed out -- in C# you rather transform collection (Skip, Take) instead of changing iterators.

+5  A: 

It feels to me like the C# design is more encapsulated: a sequence runs out or does, independent of anything else. Where does it make sense to compare one limit with another?

If you only want to take a few elements, then LINQ provides any number of ways of building one sequence from another, e.g.

foo.Take(10)
foo.Skip(10)
foo.Where(x => x.StartsWith("y"))

etc

I think it's clearer - and more composable - to transform one sequence into another than to specify this with limits. If you want to pass an iterator to another function, but you want to restrict it to the first few elements, why should you have to also pass a limit? Why not just pass the transformed sequence which self-limits?

EDIT: To address your question edit: in C# (at least with LINQ) you wouldn't modify the existing collection. You would create a new sequence from the old one. This is done lazily; it doesn't create a new copy or anything like that. For LINQ to Objects, this is performed using extension methods on IEnumerable<T>, so any sequence gains the same abilities.

Note that this isn't restricted to traditional collections - it could be a sequence of lines read from a log file (again, lazily). You don't need any knowledge of the collection itself, just that it's a sequence from which one can draw items. There's also the difference between an IEnumerable<T> and an IEnumerator<T> where the first represents the sequence and the second represents an iterator over a sequence; IEnumerator<T> is rarely used explicitly in C#, or passed around.

Now, your example of "all elements except the last 10" is a tricky one, because for a general sequence you can't tell that you're 10 elements from the end until you've reached the end. There's nothing in LINQ to Objects to do this explicitly. For anything implementing ICollection or ICollection<T> you could use

Bar(list.Take(list.Count - 10))

but that's not very general. A more general solution would need to maintain a circular buffer of 10 elements, effectively reading 10 ahead of where it's yielding. I've rarely found this to be a requirement, to be honest.

Jon Skeet
Yeah...this definitely looks nicer, but what did they expect us to do *before* LINQ?
Mark
Thank you for your answer. In your example and below you assumed your "input" is a collection. But what if it is already an iterator? Since comments do not format well, I edit my post.
macias
@macias: Answer duly edited.
Jon Skeet
@Mark: In C# 1 it would have been pretty ugly to do this. In C# 2 you could build your own pseudo-LINQ reasonably easily using iterator blocks - it would lack the conciseness of lambda expressions and extension methods though.
Jon Skeet
@Jon, I know it is not about modifying collection elements :-) Now, about the example -- you cannot rely on fact you get Count for free, for example for linked list it is O(n) to get it, so C++ in such case would be much faster. And besides, you again used _collection_ instead of _iterator_. So I am more and more curious -- if really C++ iterators are C# collections (not iterators) counterpart? Or maybe I rephrase my question -- C++ STL general functions (#include <algorithm>) work on iterators, C# STL would work on _collections_ instead, correct?
macias
@macias: You get Count for O(1) in a .NET `LinkedList<T>`... and you're also assuming you can easily get to "10 from the end" in a linked list, which will require it to be a doubly-linked list. But yes, I'll acknowledge that in the case of a doubly-linked list which doesn't maintain a count, getting all but the final X elements is trickier in .NET. I can't say I've come across that exact situation though :) In terms of collections vs iterators: it would be more accurate to say that .NET APIs often work in terms of *sequences* rather than collections. Not every sequence is a normal collection.
Jon Skeet
An expression like list.end()-n is O(n) in a std::list as well. And O(1) for a std::vector, just like List<>. No difference there.
Hans Passant
@Jon, so, C# STL counterpart would be based on sequences, not iterators. Ok, that's a shift for my thinking :-) Thank you.@Hans, good point. The exact steps would differ though.
macias
+2  A: 

So what are the advantages of C# way?

Encapsulation for one. It makes it more difficult to mess up your iteration limits if you do not set the iteration sequence manually.

C++ example:

std::vector<int> a;
std::vector<int> b;
std::vector<int>::iterator ba = a.begin();
std::vector<int>::iterator ea = a.end();
std::vector<int>::iterator bb = b.begin();
std::vector<int>::iterator eb = b.end();
// lots of code here
for(auto itr = ba; itr != eb; itr++) // iterator tries to "cross" vectors
                                     // you get undefined behavior, depending 
                                     // on what you do with itr

Which approach is more powerful (which leads to more elegant functions based on iterators).

That depends on what you mean by most powerful.

The C++ approach is more flexible.

The C# one is more difficult to misuse.

utnapistim
Your example is very convincing, thank you for that! Powerful = you can express most algorithms (sort, partition, find, any, all and so on) in clear fashion.
macias
@macias: You can express partition, find, any and all with C# sequences as well (and incredibly easily, using lambda expressions).
Jon Skeet
For that definition C++ iterators are definitely more powerful. You have i/o iterators, specifying custom ranges for algorithms (` std::sort(c.begin(), c.begin()+3)`), use the same algorithms with iterators _and raw pointers_) and so on.
utnapistim
+5  A: 

I think as long as you're talking only about forward iterators, I like the C# way - the sequence knows its own limit, you can transform it easily via LINQ, and so on. There's no benefit to having the limit be expressed separately, because there's no way to get there except walking one element at a time.

However, C++ iterators are a lot more powerful - you can have bidirectional iterators, or random-access iterators, that let you do things that aren't possible with a one-way sequential iterator. A sequential iterator is in a sense a generalization of a linked list, while a random access iterator is a generalization of an array. That's why you can efficiently express algorithms like quicksort in terms of C++ iterators, while that would be impossible with an IEnumerable.

tzaman
Oh, I like your answer -- however could you please explain a bit the last statement. Or post a link to some ext. reference. About the various kinds of iterators -- one does not deny the other, i.e. you could have C# iterator (with limit), yet bidirectional.
macias
For "random access iterator" you'd just use `IList<T>` of course. The reason for my +1 here was the idea of a bidirectional iterator, which .NET doesn't support. I can't say it's something I'd want terribly often, but it's definitely an interesting omission.
Jon Skeet
@Jon - of course; still an `IList<T>` is (in macias' terms) a representation of the _collection_ itself rather than a _position_ in that collection. I suppose the C++ iterator concept was the best way to expose a generic interface to a collection, without having actual interfaces in the language - definitely much better than forcing an ABC inherit just to be able to use STL `algorithm` / etc. Any guesses as to why there's no .NET `Deque<T>` / `IBidirectional`? They _are_ useful...
tzaman
@tzman: In many cases you can just use a `LinkedList<T>` as a `Deque<T>` although that comes with a memory cost. I don't know why there isn't a circular buffer implementation, other than that implementing *anything* costs time... I suspect the value hasn't been deemed great enough.
Jon Skeet
+2  A: 

You are comparing apples and oranges. The STL collection classes had completely different design goals. They are a one-size-fits-all design to collections, you can only ever get the classes that are provided. There's no way to customize a collection, the STL classes were not designed to be inherited from.

Very different from the .NET ones, their core is the ICollection, IDictionary and IList interfaces. The .NET framework contains hundreds of collection classes, customized to get their specific job done.

That affected the designs of their iterators as well. STL iterators must provide a lot of flexibility since the collections they iterate cannot be customized. There is an inherent cost associated with that. There are 5 different STL iterator types, not all collection classes support all 5 different types. Loss of generality is never a good thing in a library design.

While an argument can be made that C++ iterators are more flexible, I'll make the opposite assertion. The simplicity of IEnumerator was an enabler for very valuable .NET features. The C# yield keyword for example, it completely disassociates a collection with the behavior of an iterator. Linq would not have happened if it wasn't for a simple iterator type. Covariance support would not have been possible.

Regarding your Edit2, no, .NET iterators are separate classes, just like the C++ ones. Make sure to understand the difference between IEnumerable (implemented by the collection) and IEnumerator (implemented by the iterator).

Hans Passant
Ok, a lot will go to edit :-) I think you get me convinced with yield example. I.e. with two iterators for defining the boundaries you have to know in advance the limit (the limit can change, but it has to be there). So yield counterpart would have to transfer two values each time, meaning current iterator would have to know its limit --> which leads us to C# iterator. Nice.
macias