views:

525

answers:

5

Given a binary tree, I want to find out the largest subtree which is a BST in it.

Naive approach:

I have a naive approach in mind where I visit every node of the tree and pass this node to a isBST function. I will also keep track of the number of nodes in a sub-tree if it is a BST.

Is there a better approach than this ?

A: 

I would think you could avoid checking if every node is the root of a BST by working top down instead of bottom up. If a subtree is a BST it's going to be larger than any subtree in itself so you don't need to check inside a subtree if it has passed the isBST test. Then you just have isBST return the size of a valid tree and store that along with a pointer to the root of the subtree if you need to be able to find it again instead of just knowing how large the largest one is.

UPDATE:

Some of the code posted here to check if something is a BST are going to fail some cases since they're only checking the parent of a node.

Take for example:

       10
     /    \
   4      99
          /
         2

This is not a valid BST, (the 2 is out of position with regards to the 10) but if you don't send a min and max value down through the tree you will incorrectly verify it as valid. This pseudocode takes that into account.

main
{
    Verify(root, MIN_VALUE, MAX_VALUE)
}

boolean Verify(node , min, max)
{

 if(node == null)
   return true;

  if(node.value > min &&
     node.value < max &&
     Verify(node.leftchild, min, node.value) &&
     Verify(node.rightchild,node.value,max)
  {
      return true;
  }
  else
  {
      return false;
  }
}
Colin
thts a nice optimization !!
rakeshr
+2  A: 

The tree is a BST if its in-order traversal gives you its elements in sorted order. You can use this code here if you want an example implementation: http://placementsindia.blogspot.com/2007/12/c-program-to-check-whether-binary-tree.html

The running time is O(N) where N = number of nodes.

Considering the tree a BST if the root's two subtrees are both BST is wrong (and to the person who deleted his answer that proposed this solution: you shouldn't have deleted your answer, personally I wasn't going to downvote you and there is as much to learn from a bad-but-seemingly-good solution as there is from a good one). Counterexample:

    3
   / \
  2   4
 / \
1  5

Now, to get the largest subtree that is a BST, consider this tree:

    3
   / \
  2   4
 / \
1  5

The inorder-traversal is 1 2 5 3 4. I think you can solve your original problem by finding the maximum-length sorted contiguous subsequence in the inorder-traversal. You just have to be careful not to select sequences that don't describe a BST. For example, for:

    10
   / \
  2   14
 / \  |
1  5  20

The inorder-traversal is 1 2 5 10 20 14. Don't select the 20. I think this can be accomplished by making sure you dismiss elements as long as their selection stops making sense. For example, when you reach 14, dismiss the 20. I'm not sure if this can be efficiently done however. I'll edit my post if I find an exact way.

IVlad
Actually for some reason, I had it mixed up in my head (with someone else suggesting "balanced tree") so I was thinking of the balanced tree instead of BST. You're right of course.
Larry
+3  A: 

I guess the problem you're trying to solve is finding the largest (with more nodes) BST in BT. In that case you'll need to traverse all the tree nodes checking if it is a BST, and once you find one you'll have to check if it has more nodes than the largest one found till the moment.

class TreeNode
{
    public int value;
    public TreeNode left;
    public TreeNode right;
}

void LargestBST(TreeNode bt, IDictionary<TreeNode, bool> isBST, IDictionary<TreeNode, int> nodeCount, ref TreeNode largestBST)
{
    if (bt == null)
        return;
    if (IsBST(bt, isBST) && (largestBST == null || NodeCount(bt, nodeCount) > NodeCount(largestBST, nodeCount)) 
        largestBST = bt;
    else
    {
        LargestBST(bt.left, isBST, nodeCount, ref largestBST);
        LargestBST(bt.Right, isBST, nodeCount, ref largestBST);
    }
}

bool IsBST(TreeNode node, IDictionary<TreeNode, bool> isBST)
{
    if (node == null)
        return true;

    bool result;
    if (!isBST.TryGetValue(node, out result))
    {
        TreeNode maxLeft = Max(node.Left);
        TreeNode minRight = Min(node.Right);
        result = (maxLeft == null || maxLeft.value <= node.value) &&
                 (minRight == null || minRight.value >= node.value) &&
                 IsBST(node.left, isBST) && IsBST(node.Right, isBST);
        isBST.Add(node, result);
    }
    return result;
}

TreeNode Max(TreeNode node)
{
    if (node == null)
        return null;
    while (node.right != null)
        node = node.right;
    return node;
}

TreeNode Min(TreeNode node)
{
    if (node == null)
        return null;
    while (node.left != null)
        node = node.left;
    return node;
}

int NodeCount(TreeNode node, IDictionary<TreeNode, int> nodeCount)
{
    if (node == null)
        return 0;
    int result;
    if (!nodeCount.TryGetValue(node, out result))
    {
        result = 1 + NodeCount(node.left, nodeCount) + NodeCount(node.right, nodeCount);
        nodeCount.Add(node, result);
    }
    return result;
}
Fede
Good solution. Note that you can also store the min/max on the nodes for `O(n)` space for an easier analysis and runtime. (A sort of path-compression on the min-max values.)
Larry
+1  A: 
int getMinMaxValue(Node* root, bool isMin)
{
   if (!root)
   {
      // Not real limits...
      return (isMin ? INT_MAX : INT_MIN);
   }
   int leftVal = getMinMaxValue(root->left, isMin);
   int rightVal = getMinMaxValue(root->right, isMin);
   if (isMin)
   {
      return min(root->value, min(leftVal, rightVal));
   }
   else
   {
      return max(root->value, max(leftVal, rightVal));
   }
}

bool isBST(Node* root)
{
   if (!root)
   {
      return true;
   }

   Node* left = root->left;
   Node* right = root->right;

   if (left)
   {
      if (getMinMaxValue(left, false) > root->value)
      {
         return false;
      }
   }

   if (right)
   {
      if (getMinMaxValue(right, true) < root->value)
      {
         return false;
      }
   }

   return isBST(left) && isBST(right);
}

Then just descend from the root node checking if the subtree is BST, and take the largest one.

DevDevDev
This isn't correct. I'll try to give an example in the comments, hope formatting doesn't ruin the example.Well this is an edit of my initial comment cause the tree structure wasn't clear, so I'll describe it.The root value is 10, and it's left child is 5, and it's right child is 12. 12 is a leaf, and 5 has only a right child with value 30.According to your algorithm, that's a BST, when it's not.
Fede
Oh shit you are right! I corrected it.
DevDevDev
Hmmm I left up the solution, but now that I think about it, there is a loop invariant and I can find the min/max without have to actually look at values, because if the tree is not BST then eventually I will find the node that breaks it regardless. Still this works, but sucks lol
DevDevDev
A: 

To verify if a node is root of a BST , we have to recursively check each left and right children . If you start from root , you will have to recur all children before you can decide root of the binary tree is a BST or not . So it do not make any sense to call "isBST" for each node . Here's my approach

  1. Start from root
  2. Find max from left and right
  3. If not max on left and right return "NOT BST"
  4. if BST on left , check if it's bigger than the max so far . If yes store it and return "NOT BST"
  5. if BST on right, check if it's bigger than the max so far . If yes store it and return "NOT BST"
  6. If left and right are BST , there's a new BST with current root as ROOT and with number of nodes left + right + 1

Couple of challenges in making this work is storing max so far for which i used a ref variable MaxNumNodes. maxbst withh have the root of the biggest BST found when the function return .

public int MaxBST(Node root, int min, int max, ref Node maxbst, 
        ref int MaxNumNodes)
    {
        if (root == null) return 0;

        //Not a BST
        if (root.data < min || root.data > max) return -1;

        //Find Max BST on left
        int left = MaxBST(root.left, min, root.data, ref maxbst, 
                                    ref MaxNumNodes);
        //Find Max BST on right
        int right = MaxBST(root.right, root.data + 1, max, ref maxbst,
                                            ref MaxNumNodes);

        //Case1: -1 from both branches . No BST in both branches
        if (left == -1 && right == -1) return -1;

        //Case2:No BST in left branch , so choose right 
        //See if the BST on right is bigger than seen so far
        if (left == -1)
        {
            if (right> MaxNumNodes)
            {
                MaxNumNodes = right;
                maxbst = root.right;
            }
            return -1;
        }

        //Case3:No BST in right branch , so choose left 
        //See if the BST on left is bigger than seen so far
        if (right == -1)
        {
            if (left > MaxNumNodes)
            {
                MaxNumNodes = left;
                maxbst = root.left;
            }
            return -1;
        }

        //Case4:Both are BST , new max is left BST + right BST
        maxbst = root;
        return left + right + 1;

    }