views:

380

answers:

3

Hi,

I'm a having a little trouble thinking how the hell do I fix the appropriate pointers when trying to delete a node from a binary tree where that node has 2 children.

I understand the basic concept and I'm basically just having trouble fixing the pointers...

So, basically, my delete function is mostly done and every case is already working (as far as my extensive testing go, everything worked OK), I'm only missing the case node with 2 children. Let's say we are inside the else if that deals with that case:

I will have 2 pointers there, currPtr which holds the node to be deleted, prevPtr which holds the parent node. I also have a variable named fromLeft which defines if the currPtr parent comes from the left or right pointer on prevPtr. Then, I have a call to another function named extractMin(Tree *t) which extracts the lowest value in the tree. If I call this function with currPtr->right as argument, the successor of currPtr is going to be extracted from the tree (the function will deleted it from tree, fix the pointers and return the node pointer). The successor pointer is going to be saved on tempPtr.

Here's the structure of my tree:

typedef int TreeElement;

typedef struct sTree {
   TreeElement item;

   struct sTree *left;
   struct sTree *right;
} Tree;

And the code for the extract function:

Tree *extractMin(Tree *tree) {
    Tree *prevPtr = NULL;
    Tree *currPtr = tree;

    while(currPtr->left) {
        prevPtr = currPtr;
        currPtr = currPtr->left;
    }

    if(prevPtr) prevPtr->left = currPtr->right;

    // inorder successor
    return currPtr;
}

The code above is missing the case here the tree only has one node, the root node, it won't work properly and it also doesn't check if the tree has any nodes at all but, it works when a tree has a few nodes.

So, how do I fix the necessary pointers on the else if for the node deletion? Also, remember that the left and right pointers on the tree nodes will always be pointing somewhere or be NULL.

By the way, I want to do this iterative.

+2  A: 

Updated: So you want to keep the ordering of the nodes, by replacing the node with its direct inorder successor or predecessor.

Let's suppose the tree below is ordered. The order of the nodes is:

H < D < I < B < J < E < K < A < F < M < C < N < G < O

You want to delete a node (A) which has two children. You pull up either its inorder predecessor (K) or successor (F) child of the node in place of the original. Let's pick the successor. This is how you find it: you traverse the left children of C until you find one which has no left child; this is the direct inorder successor of A.

       A
   B       C
 D   E   F   G
H I J K   M N O

So F is pulled up, and the left subtree of A is not touched. However, now M should be pulled up as well, to become the left child of C (this is done in your extractMin()).

After all rearrangements, you get

       F
   B       C
 D   E   M   G
H I J K     N O

In code, this is my solution. I added a NULL check to extractMin() to simplify the caller code, so I don't need else if.

Updated extractMin() to cover the case when tree has no children.

Tree *extractMin(Tree *parent, Tree *tree) {
    if (!tree) return NULL;

    Tree *prevPtr = NULL;
    Tree *currPtr = tree;

    while(currPtr->left) {
        prevPtr = currPtr;
        currPtr = currPtr->left;
    }

    if (prevPtr) {
        prevPtr->left = currPtr->right;
    } else if (parent) {
        parent->right = currPtr->right;
    }

    // inorder successor
    return currPtr;
}

// prevPtr is the parent, currPtr is the node to be deleted
Tree *successor = extractMin(currPtr, currPtr->right);
successor->left = currPtr->left;
successor->right = currPtr->right;
if (fromLeft) {
    prevPtr->left = successor;
} else {
    prevPtr->right = successor;
}
// Now currPtr can be destroyed

Note that this code is not tested, so I don't guarantee anything :-)

Note that repeated deletes like this may make the tree unbalanced (that is, some paths will get much longer than the others). Binary sort trees do deletions this way, but also use rebalancing after to keep the tree close to the ideal (where each leaf is at the same level, like in my first example above).

Péter Török
I've always read to replace the node with it's inorder successor or predecessor (and it's the way I want to do it) and not what you are suggesting. And I already know how to do it, in theory I know at least, I just can't seem to code it properly. The "side you are leaving to me" is the side I came to ask help for. :(
Nazgulled
Yes, I know about unbalanced trees, I'm just doing one thing at a time :) You're code looks like something I tried before which had a problem, doing `successor->right = currPtr->right` will create an infinite loop (when traversing) if the extracted node happens to be the only node on the right side of `currPtr`.
Nazgulled
@Nazgulled Good point. I should've included the disclaimer that the code is not tested :-) Now I fixed both issues.
Péter Török
About the unbalanced thing and as "quick and dirty" workaround (not trying to replace real functions to balance the tree) would it help if I switched back and forth between the successor and predecessor when deleting nodes?
Nazgulled
@Nazgulled Hmmm... I am not an expert on this, so it's only my feeling that if you always pulled up the successor, in the long run the right subtrees in your hierarchy would become shorter. Thus it certainly wouldn't hurt to alternate between the successor and predecessor, and probably even help a bit. Depends also on how (often) you insert nodes into the tree, though.
Péter Török
+1  A: 

The textbook answer is to replace the node in question with it's left-most right descendant.

                6
         3             8
     2      4      7      9
   1          5             10

If we want to remove 3, we can replace it with 4, and then pull 5 into the hole where 4 went. We can always do this, and it will preserve the in-order traversal.

OK. looking at your code, this is what you want:

//in your function
    else if (/*has 2 nodes*/) {
        currPtr->item = extractMin(currPtr->right, &(currPtr->right))->item;
    }

Tree *extractMin(Tree *tree, Tree ** back) {
    Tree *currPtr = tree;

    while(currPtr->left) {
        back    = &(currPtr->left);
        currPtr = currPtr->left;
    }

    *back = currPtr->right;

    // inorder successor
    return currPtr;
}

The ** argument allows us to handle the case where we use the deleted nodes immediate right child:

     3   <--deleting this node
   /   \   <--back points to this edge.
  2     4   <--extractMin is called on this node and returns this node,
         \
          5   <-- (*back) now points to this node. 

Think about the new ExtractMin in 2 cases.

  1. In the first case, we go through the loop at least once. If we had left prevPtr in the code, we would see that after the loop, back == &(prevPtr->left); (e.g. modifying *back will modify prevPtr->left). So it's the same as your code above.
  2. In the second case we don't go through the loop. In this case, back points to a link that we couldn't get in any other way (unless we modified extractMin to start one level higher).

Another way to think about it is that (*back) always points to currPtr (take a moment and check this), so back points to the edge we need to modify if we're removing currPtr.

Kyle Butt
I already know that, I even described it in my question that I was doing just that. Or trying to... The problem is not understanding what to do, is how to do it, based on my code and tree structure.
Nazgulled
It's not immediately clear from your question that's what you're trying to do. I added a code section.
Kyle Butt
Sorry, I tried my best at explaining my problem. Your solution is a little trickier... I have to think about it to see if I can understand it.
Nazgulled
A: 

If you're serious about this, take a look at AVL trees:

http://en.wikipedia.org/wiki/AVL_tree

They can be tricky to implement, but will stay balanced due to the rotations you perform on insertions and deletions.

tomlogic