views:

87

answers:

3

So I finished my List exercise and went ahead with Binary Trees. My code thus far:

Tree.h

#include "Node.h"

class Tree
{
private:
    int mCount;
    Node *root;

public:
    Tree();
    ~Tree();

    void insert(int, Node *);
};

Tree.cpp

void Tree::insert(int data, Node *node)
{
    if( root == 0 )
    {
        Node *temp = new Node;
        temp->setData(100);
        temp->setRight(0);
        temp->setLeft(0);
        root = temp;
    }
    else
    {
        if( data > root->getData() )
            return insert( data, root->getRight() );
        else
            return insert( data, root->getLeft() );
    }
}

main.cpp

int main(int argc, char** argv)
{
    Tree *tree = new Tree;
    tree->insert( 100, 0 );

    std::cin.get();
    return 0;
}

I hope this is sufficient code. Node and Tree are two separate classes. I'm having difficulties wrapping my head around the recursion.

I have Node *root defined in my Tree class to have a root node at the top of the tree. However, the way I see it, when I call tree->insert insert in main, I don't have to specify any node. The root from the Tree class will do all the looking. However, when I'm in the code and need to recur, then I am suddenly a parameter short, as shown above.

My solution was to just place the parameter Node *node in the argument list for insert() anyway and call it with a 0 from main. I would also need to call tree->display(0); as parameter for Node *node as well.

This seems hackish. Am I missing something obvious?

+3  A: 

You are currently not using the node parameter at all in Tree::insert, which effectively means that it will recurse infinitely if you already have a root node.

The best solution would be to define a public insert method without node parameter, which calls another, private insert method with the root parameter, which in turn recursively calls itself. This way your API is clean, not allowing its clients to insert elements directly (and incorrectly) into subtrees.

Note that the parameter itself must be changed to Node** Node*&, since at the point of insertion, you want to modify the pointer in the parent node.

[Update] Also it would be recommended to add a constructor to Node which takes the data value and initializes its left and right pointers to 0. This simplifies the caller code noticeably. [/Update]

So the final result would look something like this:

void Tree::insert(int data)
{
    return insert( data, root );
}

void Tree::insert(int data, Node *&node)
{
    if( node == 0 )
    {
        node = new Node(data);
    }
    else
    {
        if( data > node->getData() )
            return insert( data, node->getRight() );
        else
            return insert( data, node->getLeft() );
    }
}
Péter Török
Thanks for the tipp! Will try this out immediately, but before I do, I have to ask: is this the normal way of inserting stuff into a tree?
SoulBeaver
@SoulBeaver, there is no single "normal" way to do it, but this is one fairly well known way.
Péter Török
@SoulBeaver, the normal way would be to do the insertion iteratively.
avakar
+1 Péter's solution is quite elegant since it only has three cases to check (whether node exists, and if not, whether to insert to left or right child). Another approach is to omit the check for whether the node is null and just do it in the code that checks to see which child to traverse (basically checking if the child is null as opposed to whether the current node is null). If you do this, you'll have to construct the root in your constructor. However, Peter's solution has one less case to check and doesn't require special-casing the root node by creating it in the ctor.
@avakar, good point, that gets rid of the `**` variables which I think is an improvement :-)
Péter Török
@avakar I'd have to disagree if we're generally speaking: traversing trees recursively should be at least as common as doing it iteratively. The iterative version would be faster (though not necessarily by much: consider inlining) but people comfortable with recursion would find it easier to write and would probably prefer that initially at least before profiling (perhaps not in this particular case since the iterative version is so simple as there's no need for a stack). Even dinkumware's red-black tree implementation uses recursion for its insertion method over an iterative approach.
David Rodríguez - dribeas
@stinky472, I was referring to this particular algorithm. Recursion is fine, as long as the tree is balanced (O(n) space on the stack is not a good thing) and you would otherwise have to maintain the stack manually. I also wouldn't depend on the compiler to optimize out tail-recursion as it is difficult for the reader to confirm (in C++).
avakar
... then again if you point out potential for stack overflow, it would be a very valid point in this case given that it's an unbalanced binary tree. That said, this seems to be more of an academic study: people generally don't use unbalanced binary trees in practice, at least not to my knowledge.
@avakar Ah yes, very good point.
Péter Török
+2  A: 

A few points:

First, don't use Node**. That err "uglifies" your code. Use Node*& instead (see answers here) if you really need to.

Second, you do not need a recursive call (unless you want to use one).

Non-recursive insert method:

void Tree::insert(int data)
{
    if(!root)
    {
         root = new Node(data);  // Node constructor should receive
                                 // the data value and init internal value from it
                                 // it should also set left and right pointers to 0
         return;
    }

    Node* insertIterator = root;
    Node* parent = 0;

    while(insertIterator)
    {
         parent = insertIterator;
         insertIterator = data < insertIterator->getData() 
             ? insertIterator->getLeft()
             : insertIterator->getRight();
    }

    if(data < parent->getData())
         parent->setLeft( new Node(data) );
    else
         parent->setRight( new Node(data) );
}

If you do use a recursive method, use a recursive method that finds the insertion point, instead of a recursive method that performs the insertion. Basically, replace the while loop in the code above with a separate method (FindInsertionPoint in my code below):

Node* Tree::FindInsertionPoint(int data, Node * parent) // this should be private
{
    Node* insertPoint = data < parent.getData()
        ? parent->getLeft() 
        : parent->getRight();

    return insertPoint
        ? FindInsertionPoint(data, insertPoint)
        : parent;
}

void Tree::Insert(int data)  // this should be public
{
    if(!root)
    {
        root = new Node(data);
        return;
    }

    Node* parent = FindInsertionPoint(data, root);
    if(data < parent.getData())
        parent->setLeft(new Node(data)); // see comment on Node constructor above
    else
        parent->setRight(new Node(data)); // see comment on Node constructor above
}

Edit:

I'm having difficulties wrapping my head around the recursion.

Look at it like this: to find the insertion point, you know that you need to insert as a child of the left or the right subnode. To insert to the left, you need to insert as a child of the left or the right subnode of the left child of the current node. That is, if you insert to the left, call the find the insertion point part for the left child; otherwise, call the find the insertion point for the right subnode.

What you need to do to define a recursive algorithm:

  • identify the algorithm as it applies to a part of your data (in this case, you need to insert as a child of the left or the right subnode).

  • identify the stopping condition (when is the algorithm stopping?). If you do not, you get infinite recursion and a stackoverflow error :).

  • identify the variable part of the algorithm (this should tell you what parameters your recursive function will have).

utnapistim
+1 for "do this iteratively". I've fixed your iterative code, I didn't read the recursive one. I hope it's correct!
avakar
That's odd, I've been reading stuff on trees and I keep seeing the general concesus to be that, "Recursion with trees is the best way to do it." (CS106 from Stanford I think is one of the sources)Iterative is actually a lot easier to understand as well, I'll try that solution first.
SoulBeaver
@avakar, the recursive code just replaces the while with a recursive version. The rest is the same.
utnapistim
@SoulBeaver, Recursion is easy with trees when you identify the main points of the recursive algorithm (see my **Edit** above).
utnapistim
@SoulBeaver, it depends. You often use recursion when you need to traverse the entire tree. Depth-first search is a good candidate for recursion for example (though I'd be a little cautious in this case as the tree is not balanced). In the case of your code, it's just silly, as you use up O(n) space instead of O(1).
avakar
+1  A: 

Péter gave a perfectly nice example. Only thing I see is that temp->setData(100) should be temp->setData(data);

However, I want to focus on this part:

I'm having difficulties wrapping my head around the recursion.

When you're first introduced to recursion, it can be a little hard to get your mind working that way. We tend to want to think of algorithms as a whole with an ordered list of sequential steps. The best way to get your mind thinking about it is to draw this out. Do it enough and it'll become second nature.

Let's consider Péter's example (ignoring the one slight omission of correcting the setData line). We only have two very simple situations:

1) We're calling insert on a node that exists. In this case, we compare the value we are inserting to the value of the node. If it's greater than the node value, we insert to the right child, otherwise to the left. Now you just repeat the process on the child to which you are inserting.

2) We're calling insert on a node that doesn't exist. In this case, we create the node and set its value to the insertion value. The end: recursion is finished. This is called the base case or general solution because it is where we stop.

That's it - it's actually quite simple. The trick is not to think about what's happening to the tree as a whole, but just one node at a time, and consider all the cases with that one given node. When you can get your mind thinking this way, we find there's only two cases (three if you want to be pedantic, but two general situations).