views:

186

answers:

7

I am familiar with the usage of C++ STL iterators, e.g.

for(map<pair<int,int>>::iterator it=m.begin(); it!=m.end(); ++it)
  int  a = it->first;
  int b = it->second;

But I don't know the inner details in it. Could some explain to me? Either in C++, Java, C# or Python.

+1  A: 

Check out the iterator pattern http://www.dofactory.com/patterns/PatternIterator.aspx

Andrew
Thanks. But I am not familiar with UML....
If you are at all interested in patterns, it's a good idea to have a basic understanding of UML, as it is the lingua franca for describing them.
anon
The referenced article is how iterators are in Java and C#, but not in C++ (which the question seems to be about)
David Rodríguez - dribeas
A: 

Aside: the code you've posted won't work - you're missing braces - neither C#, nor C++ nor Java have magic whitespace (and that code really doesn't look like Python).

Also, >> would be interpreted as the right shift operator (thanks Prasoon Saurav). I think you want this:

for(map<pair<int,int> >::iterator it=m.begin(); it!=m.end(); ++it) { // <- here
  int a = it->first;
  int b = it->second;
} // <- and here
Dominic Rodger
In addition `>>` would be interpreted as the right shift operator.
Prasoon Saurav
+1  A: 

One point of using iterators is that you do not need to know about the "inner details".

However, it is always useful to understand the basic principles behind it, and it's the same in all the languages you mention: You have a function (or a language feature) that needs to do something with a range of object. The function can then take an iterator object, and do something like this (using Python):

def DoSomething(start_iter, end_iter):
    while (start_iter != end_iter):
        calculate(start_iter.get_data())
        start_iter = start_iter.next()

The principle is the same in all cases. The generic function accepts iterators and uses them as described above: Get the data associated with the iterator, do whatever with it, increment the iterator, and exit if the iterator has reached the end.

In C++ for instance, an iterator for an STL::vector is very close to being nothing more than a simple integer, and iterating through it is done, under the hood, the same way iterating a pure array is.

+2  A: 

Check the wiki link for Iterator pattern. It is quite explantory (and as other answer and the link mentions) intent of this pattern is to provide access to elements of an aggregate object without exposing internal details.

For example, in Java the iterator for an AbstractList is implemented in an inner class whose instance is created for iterating the List. Check the code here.

sateesh
+2  A: 

iterator is just an abstract notion for which it is defined that dereferenceing it, using the * operator will give you a certain element from the sequence associated with the iterator and incrementing it will give you an iterator associated with the next element in some sequence. This means that the concrete iterator is defined by the sequence and is not a separately defined class, i.e. why you need to use type map<pair<int,int> >::iterator and not just iterator. The iterator type for each STL sequence has it's own implementation for which the ++ operator is overloaded to provide an iterator pointing to the next element inthe sequence.

This means that a simple pointer into a character array is also an iterator, since dereferencing the pointer will give you the object associated with the iterator (pointer) and incrementing it will give you a new iterator (pointer) associated with the next element in the sequence.

A partial example for a doubly linked list, (note: untested, just wrote this up, may need to add some friend clauses and stuff):

class doubly_linked_list {
  class node {
    node* prev;
    node* next;
    int data;
    node(const int data, node* prev, node* next) : data(data), prev(prev), next(next) {}
  };
  static const node nil; // = node(0, 0, 0)
  node* head;
  node* tail;
public:
  class iterator {
    node* current;
    iterator(node* n) : current(n) {}
  public:
    int& operator*() const {
      return current->obj;
    }
    iterator& operator++() {
      current = current->next;
      return *this;
    }
    iterator& operator--() {
      current = current->prev;
      return *this;
    }
  };
  double_linked_list() : head(nil), tail(nil) {}
  iterator begin() const {
    return iterator(head);
  }
  iterator end() const {
    return iterator(tail);
  }
};

And to illustrate how different data structures will have different iterators a vector example (just as untested)

class vector {
  int* data;
  size_t len;
public:
  typedef int* iterator;
  vector(const size_t len) : len(len) {
    data = new int[len];
  }
  int& operator[](int i) const {
    return data[i];
  }
  iterator begin() const {
    return data;
  }
  iterator end() const {
    return data + len;
  }
};
wich
+6  A: 

In C++ iterators into a container resemble pointers into an array (I am assuming that you are familiar with pointers). There are different flavors of iterators, but at the end they are just a way of referring to elements within the container (through the dereference operators * and ->) and transversing the elements in the container.

The important part is not the implementation, but rather the concept. You do not need to know how an iterator into a list or a vector are implemented (or how they differ in many cases), only what operations it provides. Iterators into different containers will have different implementations (for a list it will follow some next pointer in the node, for a map it will follow either the right child or parent pointer of the balanced tree. In fact iterators into the same container can be implemented in different ways (and some compilers do have more than one implementation for any given container) depending on compilation flags or mode. But still, the important part is that you really don't care how they are, just what they allow you to do.

As a simple example, in g++ STL implementation std::vector contains three pointers, something like:

//...
class vector {
   T * _b;  // beginning of allocated memory
   T * _e;  // one past the last inserted element 
   T * _c;  // one past the end of the allocated memory
//...
}

such that size() = (_e - _b) / sizeof(T) and capacity = (_c - _b) / sizeof(T). With this implementation of vector, you can use a raw pointer as an iterator:

//...
class vector {
public:
   typedef T* iterator;
   T* begin() { return _b; }
   T* end() { return _e; }
//...
}

but you can also build more complex (slower but safer) iterator implementations, like checked iterators that will trigger an assertion if the iterator has been invalidated (this code is oversimplified just for sample purposes):

template <typename T>
class checked_iterator {
public:
   checked_iterator( std::vector<T> & v, std::size_t e )
      : _check_begin(v._b), _vector(v), _pos( v._b + e )
   {}
   T& operator*() {
      assert( _check_begin == _vector._b && "Dereferencing an invalidated iterator");
      return *_pos; 
   }
   // ...
private:
   T * _pos;
   T const * const _check_begin;
   std::vector<T>& _vector;
};

This iterator implementation would detect dereferencing an invalid iterator (only in the case of a whole vector relocation, but by storing more data it could do a full check) and will abort the execution of an incorrect program while still in development. From the user point of view it would be a plain RandomAccessIterator (it should be, that is a requirement for vector iterators) but behind the scenes it will provide a mechanism to identify otherwise hard to detect errors.

This is the approach in the VS compilers: while in debug mode (and depending on compiler flags) it will use slow safe iterators that will help detecting access through invalidated iterators (for a vector, an iterator is invalidated whenever an element is added into the container). At the same time, changing the compiler flags you can get the plain raw pointer implementation that will be much faster for production systems, but it would be much harder to debug invalid iterator usage.

In Java and C# they are actually objects that implement a couple of simple operations (in Java hasNext(), next() and remove()) that allow for a transversal of a whole container hidding how the container is implemented. They are quite similar in that the intention is encapsulating the operations performed on a particular container from user code.

One important difference is that in both cases you can use them to iterate over the whole container, but in c++ they are comparable and you can iterate over any two iterators into the same container. For example, in a map containing your city phone book you could use operations to get an iterator into the first name that begins with a c, and another search to get the first element whose name begins with 'd' (assuming name ordering) and you could use any STL (or your own) algorithm with those two iterator to perform the operation only on that subset of people.

David Rodríguez - dribeas
+1, great answer!
Alastair Pitts
+1  A: 

A good example in C# from MSDN.

There is a standard interface, IEnumerable. If your class is enumerable, then you just implements this interface's GetEnumerator method.

But as you may know, different class may have different method to enumerate, for an array, you just move the pointer by 1, for a tree, you need a tree traversal method. But all in all, the enumerator has the same interface, e.g. methods in IEnumerator in C#.

So aside the class implementing IEnumerable, you still need to implement a class that implements IEnumerator, specifically, MoveNext, Reset. And GetEnumerator method returns an instance of the enumerator class.

Yin Zhu