When deleting a node with two children, you can either choose its in-order successor node or its in-order predecessor node. In this case it's finding the the largest value in the left sub-tree (meaning the right-most child of its left sub-tree), which means that it is finding the node's in-order predecessor node.
Once you find the replacement node, you don't actually delete the node to be deleted. Instead you take the value from the successor node and store that value in the node you want to delete. Then, you delete the successor node. In doing so you preserve the binary search-tree property since you can be sure that the node you selected will have a value that is lower than the values of all the children in the original node's left sub-tree, and greater that than the values of all the children in the original node's right sub-tree.
EDIT
After reading your question a little more, I think I have found the problem.
Typically what you have in addition to the delete
function is a replace
function that replaces the node in question. I think you need to change this line of code:
FindParent(largestValue).Right <- 0
to:
FindParent(largestValue).Right <- largestValue.Left
If the largestValue
node doesn't have a left child, you simply get null
or 0
. If it does have a left child, that child becomes a replacement for the largestValue
node. So you're right; the code doesn't take into account the scenario that the largestValue
node might have a left child.
Another EDIT
Since you've only posted a snippet, I'm not sure what the context of the code is. But the snippet as posted does seem to have the problem you suggest (replacing the wrong node). Usually, there are three cases, but I notice that the comment in your snippet says //Case 4
(so maybe there is some other context).
Earlier, I alluded to the fact that delete
usually comes with a replace
. So if you find the largestValue
node, you delete it according to the two simple cases (node with no children, and node with one child). So if you're looking at pseudo-code to delete a node with two children, this is what you'll do:
get largestValue from nodeToRemove.Left
nodeToRemove.Value <- largestValue.Value
//now replace largestValue with largestValue.Left
if largestValue = largestValue.Parent.Left then
largestValue.Parent.Left <- largestValue.Left //is largestValue a left child?
else //largestValue must be a right child
largestValue.Parent.Right <- largestValue.Left
if largestValue.Left is not null then
largestValue.Left.Parent <- largestValue.Parent
I find it strange that a Data Structures And Algorithms book would leave out this part, so I am inclined to think that the book has further split up the deletion into a few more cases (since there are three standard cases) to make it easier to understand.
To prove that the above code works, consider the following tree:
8
/ \
7 9
Let's say that you want to delete 8
. You try to find largestValue
from nodeToRemove.Left
. This gives you 7
since the left sub-tree only has one child.
Then you do:
nodeToRemove.Value <- largestValue.Value
Which means:
8.value <- 7.Value
or
8.Value <- 7
So now your tree looks like this:
7
/ \
7 9
You need to get rid of the replacement node and so you're going to replace largestValue
with largestValue.Left
(which is null
). So first you find out what kind of child 7
is:
if largestValue = largestValue.Parent.Left then
Which means:
if 7 = 7.Parent.Left then
or:
if 7 = 8.Left then
Since 7
is 8
's left child, need to replace 8.Left
with 7.Right
(largestValue.Parent.Left <- largestValue.Left
). Since 7
has no children, 7.Left
is null. So largestValue.Parent.Left
gets assigned to null (which effectively removes its left child). So this means that you end up with the following tree:
7
\
9