tags:

views:

878

answers:

4

Hello everyone,

I have to implement a homemade Trie and I'm stuck on the Iterator part. I can't seem to figure out the increment method for the trie.

I hope someone can help me clear things out.

Here's the code for the Iterator:

template <typename T> class Trie<T>::IteratorPrefixe{
friend class Trie<T>;
public:
    IteratorPrefixe() : tree(NULL), currentNode(NULL), currentKey("") {};
    pair<string, T*> operator*() {return make_pair(currentKey, currentNode -> element);} ;
    IteratorPrefixe operator++()throw(runtime_error);
    void operator=(IteratorPrefixe iter) {tree = iter.tree; currentNode = iter.currentNode; currentKey = iter.currentKey;};
    bool operator==(IteratorPrefixe iter) {return tree == iter.tree && currentNode == iter.currentNode;};
    bool operator!=(IteratorPrefixe iter) {return tree != iter.tree || currentNode != iter.currentNode;};
private:
    Trie<T> * tree;
    Trie<T> * currentNode;
    string currentKey;
};

And here's my Trie:

template <typename T> class Trie {
friend class IteratorPrefixe;
public:
    // Create a Trie<T> from the alphabet of nbletters, where nbletters must be
    // between 1 and NBLETTERSMAX inclusively
    Trie(unsigned nbletters) throw(runtime_error);

 // Add a key element of which is given in the first argument and content second argument
    // The content must be defined (different from NULL pointer)
    // The key is to be composed of valid letters (the letters between A + inclusive and exclusive nbletters
    //   Eg  if nblettres is 3, a, b and c are the only characters permitted;
    //    If nblettres is 15, only the letters between a and o inclusive are allowed.
    // Returns true if the insertion was achieved, returns false otherwise.
    bool addElement(string, T*) throw(runtime_error);
    // Deletes a key element of which is given as an argument and returns the contents of the node removed
    // The key is to be composed of letters valid (see above)
    // Can also delete at the same time the reference of the ancestors, if these ancestors are no longer used.
    // Returns NULL if the item has no delete
    T* removeElement(string cle) throw(runtime_error);
    // Find a key element of which is given as an argument and returns the associated content
    // The key is to be composed of letters valid (see above)
    // Returns NULL if the key does not exist
    T* searchElement(string cle) throw();
    // Iterator class to browse the Trie <T> in preorder mode
    class IteratorPrefixe;
    // Returns an iterator pointing to the first element
    IteratorPrefixe pbegin() throw(runtime_error);
    // Returns an iterator pointing beyond the last item
    IteratorPrefixe pend() throw();

private:
    unsigned nbLetters;
    T* element;
    vector<Trie<T> *> childs;
    Trie<T> * parent;

 // This function removes a node and its ancestors if became unnecessary. It is essentially the same work
    // as deleteElement that is how to designate remove a node that is changing. Moreover, unlike
    // deleteElement, it does not return any information on the node removed.
    void remove(Trie<T> * node) throw();

 // This function is seeking a node based on a given key. It is essentially the same work
    // searchElement but that returns a reference to the node found (or null if the node does not exist)
    // The key is to be composed of letters valid (see above)
    Trie<T>* search(string key) throw(runtime_error);
};
+2  A: 

I'm glad to see Tries are still taught, they're an important data structure that is often neglected.

There may be a design problem in your code since you should probably have a Trie class and a Node class. The way you wrote it it looks like each node in your Trie is it's own trie, which can work, but will make some things complicated.

It's not really clear from your question what it is that you are having the problem with: figuring the order, or figuring the actual code?

From the name of the iterator, it sounds like it would have to work in prefix order. Since your trie stores words and its child nodes are organized by letters, then you are essentially expected to go over all the words in an alphabetic order. Every incrementation will bring you to the next word.

THe invariant about your iterator is that at any point (as long as it is valid), it should be pointing at a node with a "terminator character" for a valid word. Figuring that word merely involves scanning upwards through the parent chain till you find your entire string. Moving to the next word means doing a DFS search: go up once, scan for links in later "brothers", see if you find a word, if not recursively go up, etc.

Uri
Good answer. As an aside, sometimes it is better to store stuff entries in reverse order, like domain names.
Josh
A: 

It is actually has you said. Each node of the Trie is it's own Trie.

I'm currently having issue figuring out the code for the IteratorPrefixe operator++ method. I wrote prefix as I am french, I believe the english words would be the Preorder traverse mode.

This structure is imposed to me, so I have to deal with this.

I can't seem to write down anything to figure out how to navigate through the tree. Can't figure out how the iterator will know which branch to choose from each node.

+1  A: 

You may want to see my modified trie implementations at:

Specifically, you may find the discussion I had on comp.lang.c++.moderated about implementing iterators for trie's in a STL compliant way, which is a problem since all stl containers unfortunately are forced to use std::pair<>, and the iterator therefor must contain the value instead of just a reference to the single node in the trie.

jdkoftinoff
There's nothing to be found on your page?
Enno
Sorry - the server was moved, the url should work now. The source is in mercurial at http://opensource.jdkoftinoff.com/hg-oss/jdktrie and also in svn at http://opensource.jdkoftinoff.com/jdks/svn/trunk/jdktrie/trunk
jdkoftinoff
A: 

For one thing, the code shown does not actually describe a trie. Rather, it appears to be a tree containing a pair of elements in each node (T* and unsigned). You can by discipline use a tree of tuples as a trie, but it's only by convention, not enforcement. This is part of why you're having such a hard time implementing operator++.

What you need to do is have each Trie contain a left-right disjoint ADT, rather than just the raw elements. It's a layer of abstraction which is more commonly found in functional languages (e.g. Scala's Either). Unfortunately, C++'s type system isn't quite powerful enough to do something that elegant. However, there's nothing preventing you from doing this:

template <class L, class R>
class Either
{
public:

    Either(L *l) : left(l), right(0)
    {}

    Either(R *r) : left(0), right(r)
    {}

    L *get_left() const
    {
        return left;
    }

    R *get_right() const
    {
        return right;
    }

    bool is_left() const
    {
        return left != 0;
    }

    bool is_right() const
    {
        return right != 0;
    }

private:
    L *left;
    R *right;
};

Then your Trie's data members would be defined as follows:

private:
    Either<unsigned, T*> disjoint;

    vector<Trie<T> *> children;    // english pluralization
    Trie<T> * parent;

I'm playing fast and loose with your pointers, but you get the gist of what I'm saying. The important bit is that no given node can contain both an unsigned and a T*.

Try this, and see if that helps. I think you'll find that being able to easily determine whether you are on a leaf or a branch will help you tremendously in your attempt to iterate.

Daniel Spiewak