tags:

views:

158

answers:

4

In this question I'm not asking how to do it but HOW IS IT DONE.
I'm trying (as an excersise) implement simple map and although I do not have problems with implementing links and they behavior (how to find next place to insert new link etc.) I'm stuck with the problem how to implement iterating over a map. When you think about it and look at std::map this map is able to return begin and end iterator. How? Especially end?
If map is a tree how can you say which branch of this map is an end? I just do not understand it. An how to iterate over a map? Starting from the top of the tree and then what? Go and list everything on the left? But those nodes on the left have also links to the right. I really don't know. I will be really glad if someone could explain it to me or give me a link so I could read about it.
Thank you.

A: 

For sorting purposes, a map behaves like a sorted key/value container (a.k.a. a dictionary); you can think of it as a sorted collection of key/value pairs, and this is exactly what you get when you query for an iterator. Observe:

map<string, int> my_map;
my_map["Hello"] = 1;
my_map["world"] = 2;
for (map<string, int>::const_iterator i = my_map.begin(); i != my_map.end(); ++i)
    cout << i->first << ": " << i->second << endl;

Just like any other iterator type, the map iterator behaves like a pointer to a collection element, and for map, this is a std::pair, where first maps to the key and second maps to the value.

std::map uses a binary search internally when you call its find() method or use operator[], but you shouldn't ever need to access the tree representation directly.

tdammers
std::map is always implemented as a balanced binary tree
anon
@Neil, it's usually implemented as a red-black tree (which is only partially balanced).
avakar
@avakar I think most people would refer to RB trees as balanced - I agree that they are not perfectly balanced.
anon
This I didn't know. I would have thought that it's just a plain old key/value array, and that the binary search would be performed on the array directly. For purposes of iterating, it behaves exactly like a plain list though, so my main point still stands. I edited the text to be more correct regarding internal implementation.
tdammers
@tdammers: insertion into an array would be too slow; it takes linear time to move the following items to make room, while `map::insert` must take logarithmic time or better. A (reasonably well) balanced tree makes both insertion and lookup possible in logarithmic time.
Mike Seymour
+2  A: 

The representation of your map's iterator is totally up to you. I think it should suffice to use a single wrapped pointer to a node. E.g.:

template <typename T>
struct mymapiterator
{
  typename mymap<T>::node * n;
};

Or something similar. Now, mymap::begin() could return such instance of the iterator that n would point to the leftmost node. mymap::end() could return instance with n pointing to root probably or some other special node from which it is still possible to get back to rightmost node so that it could satisfy bidirectional iteration from end iterator.

The operation of moving between the nodes (operators++() and operator--(), etc.) are about traversing the tree from smaller to bigger values or vice versa. Operation that you probably have already implemented during insertion operation implementation.

wilx
He's probably implemented traversals as a recursive function that keeps track of state implicitly in the execution stack and the program counter -- the hard part is how to make this explicit in a language that doesn't have coroutines.
Ken Bloom
+3  A: 

A map is implemented using a binary search tree. To meet the complexity requirements it has to be a self-balancing tree, so a red-black tree is usually used, but that doesn't affect how you iterate over the tree.

To read the elements out of a binary search tree in order from least to greatest, you need to perform an in-order traversal of the tree. The recursive implementation is quite simple but isn't really practical for use in an iterator (the iterator would have to maintain a stack internally, which would make it relatively expensive to copy).

You can implement an iterative in-order traversal. This is an implementation taken from a library of tree containers I wrote a while ago. NodePointerT is a pointer to a node, where the node has left_, right_, and parent_ pointers of type NodePointerT.

// Gets the next node in an in-order traversal of the tree; returns null
// when the in-order traversal has ended
template <typename NodePointerT>
NodePointerT next_inorder_node(NodePointerT n)
{
    if (!n) { return n; }

    // If the node has a right child, we traverse the link to that child
    // then traverse as far to the left as we can:
    if (n->right_)
    {
        n = n->right_;
        while (n->left_) { n = n->left_; }
    }
    // If the node is the left node of its parent, the next node is its
    // parent node:
    else if (n->parent_ && n == n->parent_->left_)
    {
        n = n->parent_;
    }
    // Otherwise, this node is the furthest right in its subtree; we 
    // traverse up through its parents until we find a parent that was a
    // left child of a node.  The next node is that node's parent.  If 
    // we have reached the end, this will set node to null:
    else
    {
        while (n->parent_ && n == n->parent_->right_) { n = n->parent_; }
        n = n->parent_;
    }
    return n;
}

To find the first node for the begin iterator, you need to find the leftmost node in the tree. Starting at the root node, follow the left child pointer until you encounter a node that has no left child: this is the first node.

For an end iterator, you can set the node pointer to point to the root node or to the last node in the tree and then keep a flag in the iterator indicating that it is an end iterator (is_end_ or something like that).

James McNellis
@James Thanks for your answer.
There is nothing we can do
A: 

One big trick you may be missing is that the end() iterator does not need to point to anything. It can be NULL or any other special value.

The ++ operator sets an iterator to the same special value when it goes past the end of the map. Then everything works.

To implement ++ you might need to keep next/prev pointers in each node, or you could walk back up the tree to find the next node by comparing the node you just left to the parent's right-most node to see if you need to walk to that parent's node, etc.

Don't forget that the iterators to a map should stay valid during insert/erase operations (as long as you didn't erase the iterator's current node).

Zan Lynx