ansaurus

Question

How to determine if binary tree is balanced?

Answer 1

+7 A:

This only determines if the top level of the tree is balanced. That is, you could have a tree with two long branches off the far left and far right, with nothing in the middle, and this would return true. You need to recursively check the root.left and root.right to see if they are internally balanced as well before returning true.

Jesse Rusak 2009-04-13 02:03:00

Answer 2

+1 A:

Well, you need a way to determine the heights of left and right, and if left and right are balanced.

And I'd just return height(node->left) == height(node->right);

As to writing a height function, read: http://stackoverflow.com/questions/717725/understanding-recursion/717839#717839

tpdi 2009-04-13 02:07:22

You want left and right heights to be within 1, not necessarily equal.

Alex B 2009-04-13 03:36:28

Answer 3

+1 A:

What kind of tree are you talking about? There are self-balancing trees out there. Check their algorithms where they determine if they need to reorder the tree in order to maintain balance.

lothar 2009-04-13 02:08:29

Answer 4

+2 A:

Balancing usually depends on the length of the longest path on each direction. The above algorithm is not going to do that for you.

What are you trying to implement? There are self-balancing trees around (AVL/Red-black). In fact, Java trees are balanced.

Uri 2009-04-13 02:10:47

Answer 5

+3 A:

If this is for your job, I suggest:

do not reinvent the wheel and
use/buy COTS instead of fiddling with bits.
Save your time/energy for solving business problems.

Steven A. Lowe 2009-04-13 02:14:55

Answer 6

A:

So I changed the code, but it doesnt seem to be working. ps. I do have a method to determine the height. it's just height();

Any ideas on how to fix it.

public boolean heightBalanced(BinaryTree<T> tree){
 if(tree==null){
  return true;  //tree is empty
 }
 else if(tree.left!=null && tree.right!=null){
  int lh = tree.height(tree.left);
  int rh = tree.height(tree.right);
  if(lh - rh > 1 || rh - lh > 1){
   return false;
  }
  tree.heightBalanced(tree.left);
  tree.heightBalanced(tree.right);
 }
 return true;
 }

2009-04-13 02:21:50

You're ignoring the return values from the 2 recursive calls to heightBalanced. Perhaps you want to return true only if they both return true.

Jesse Rusak 2009-04-13 02:23:53

Your method returns tree if tree.left != null and tree.right = 'a huge subtree'.Also calling height(..) and heightBalanced(...) on the same subtree is rather inefficient. I would look at returning enough information from heightBalanced to avoid computing the height of each subtree twice

Rob Walker 2009-04-13 03:12:18

Replace your else if block with an else block. Set lh and rh to 0 if the tree is null. Replace `tree.heightBalanced(tree.left)` with `tree.left == null || tree.heightBalanced(tree.left)`. Do the same for tree.right.

Brian 2010-02-02 14:35:59

You do have a method to determine height, but it (depending on how your tree works) might not be O(1). If it isn't, then using that method will make your algorithm asymptotically slower...though that might not matter for your purposes.

Brian 2010-02-02 17:45:48

Answer 7

+2 A:

What balanced means depends a bit on the structure at hand. For instance, A B-Tree cannot have nodes more than a certain depth from the root, or less for that matter, all data lives at a fixed depth from the root, but it can be out of balance if the distribution of leaves to leaves-but-one nodes is uneven. Skip-lists Have no notion at all of balance, relying instead on probability to achieve decent performance. Fibonacci trees purposefully fall out of balance, postponing the rebalance to achieve superior asymptotic performance in exchange for occasionally longer updates. AVL and Red-Black trees attach metadata to each node to attain a depth-balance invariant.

All of these structures and more are present in the standard libraries of most common programming systems (except python, RAGE!). Implementing one or two is good programming practice, but its probably not a good use of time to roll your own for production, unless your problem has some peculiar performance need not satisfied by any off-the-shelf collections.

TokenMacGuy 2009-04-13 02:32:08

Answer 8

A:

Wouldn't this work?

return ( ( Math.abs( size( root.left ) - size( root.right ) ) < 2 );

Any unbalanced tree would always fail this.

Sumit Kishore 2009-05-28 18:44:38

Many balanced trees will fail it, too.

Brian 2010-02-02 20:45:17

Answer 9

+16 A:

Stumbled across this old question while searching for something else. I notice that you never did get a complete answer.

The way to solve this problem is to start by writing a specification for the function you are trying to write.

Specification: A well-formed binary tree is said to be "height-balanced" if (1) it is empty, or (2) its left and right children are height-balanced and the height of the left tree is within 1 of the height of the right tree.

Now that you have the specification, the code is trivial to write. Just follow the specification:

IsHeightBalanced(tree)
    return (tree is empty) or 
           (IsHeightBalanced(tree.left) and
            IsHeightBalanced(tree.right) and
            abs(Height(tree.left) - Height(tree.right)) <= 1)

Translating that into the programming language of your choice should be trivial.

Bonus exercise: this naive code sketch traverses the tree far too many times when computing the heights. Can you make it more efficient?

Super bonus exercise: suppose the tree is massively unbalanced. Like, a million nodes deep on one side and three deep on the other. Is there a scenario in which this algorithm blows the stack? Can you fix the implementation so that it never blows the stack, even when given a massively unbalanced tree?

Eric Lippert 2010-02-01 16:33:33

+1 for only correct answer, I can't believe no one was able to answer this for 8 months...

BlueRaja - Danny Pflughoeft 2010-02-01 20:09:14

Answer to the "exercises" below…

Potatoswatter 2010-02-02 02:56:51

Answered Bonus exercise below.

Brian 2010-02-02 14:25:54

Answer 10

A:

Here is a version based on a generic depth-first traversal. Should be faster than the other correct answer and handle all the mentioned "challenges." Apologies for the style, I don't really know Java.

You can still make it much faster by returning early if max and min are both set and have a difference >1.

public boolean isBalanced( Node root ) {
    int curDepth = 0, maxLeaf = 0, minLeaf = INT_MAX;
    if ( root == null ) return true;
    while ( root != null ) {
        if ( root.left == null || root.right == null ) {
            maxLeaf = max( maxLeaf, curDepth );
            minLeaf = min( minLeaf, curDepth );
        }
        if ( root.left != null ) {
            curDepth += 1;
            root = root.left;
        } else {
            Node last = root;
            while ( root != null
             && ( root.right == null || root.right == last ) ) {
                curDepth -= 1;
                last = root;
                root = root.parent;
            }
            if ( root != null ) {
                curDepth += 1;
                root = root.right;
            }
        }
    }
    return ( maxLeaf - minLeaf <= 1 );
}

Potatoswatter 2010-02-01 20:47:02

A nice attempt but it clearly does not work. Let x be a null node. Let a non-null tree node be denoted as (LEFT VALUE RIGHT). Consider the tree (x A (x B x)). "root" points to nodes A, B, A, B, A, B ... forever. Care to try again? A hint: it's actually easier *without* parent pointers.

Eric Lippert 2010-02-02 03:18:48

@Eric: Oops, fixed (I think). Well, I'm trying to do this without O(depth) memory, and if the structure doesn't have parent pointers (it often does), you need to use a stack.

Potatoswatter 2010-02-02 04:50:20

So what you're telling me is you'd rather use O(n) permanent memory in parent pointers to avoid allocating O(d) temporary memory, where log n <= d <= n ? This seems like a false economy.

Eric Lippert 2010-02-02 06:20:00

Unfortunately, though you've fixed the problem with the traversal, there's a far bigger problem here. This does not test whether a tree is balanced, it tests whether a tree has all its leaves close to the same level. That's not the definition of "balanced" that I gave. Consider the tree ((((x D x) C x) B x) A x). Your code reports that this is "balanced" when it obviously is maximally unbalanced. Care to try again?

Eric Lippert 2010-02-02 06:37:09

@Eric reply 1: not a false economy if you already use the parent pointers for something else. reply 2: sure, why not. This is a bizarre way of debugging… I shouldn't be blindly writing traversals of anything at 4 am…

Potatoswatter 2010-02-02 10:48:57

Also, note that a height-balanced tree can have leaves whose depths differ by two. For example, ((((x H x) D x) B (x E x)) A (((x I x) F x) C ((x J x) G (x K (x L x))))) is height balanced, but the depth of leaf E is two less than the depth of L.

Eric Lippert 2010-02-02 17:41:46

@Eric: There are various definitions of balance. A red-black tree's leaves can vary in depth by a factor of two. You can adjust the `return` expression accordingly.

Potatoswatter 2010-02-02 20:13:55

Indeed, there are many definitions of balance. However, it's still not the case that you can work out whether a given tree is balanced according to my definition solely by locating the deepest and shallowest leaves. I can construct you height-balanced trees where the deepest and shallowest leaves are arbitrarily far apart, given enough nodes. Now, if you disagree with my claim that your definition of height-balanced is not the same as mine then I encourage you to attempt a formal proof that they are equivalent.

Eric Lippert 2010-02-02 21:13:36

@Eric: Absolutely, they are not equivalent. Not all trees with max_leaf_depth <= 2 * min_leaf_depth are valid red-black trees! However, that's beyond the scope of the OP. I used the most stringent definition of balance, which happens to be universal. As for practicality, I confess I don't know the application of validating a tree's balance without simultaneously balancing it.

Potatoswatter 2010-02-02 21:29:13

You can fix your implementation by tracking the depth of not the lowest and highest *leaves* but the lowest and highest *nodes that have any null child*. You'll still be testing for a different definition of "balanced" than the one I proposed; as you note, yours is a considerably more restricting definition. It is an interesting challenge to try and implement a tester for my more relaxed definition without consuming unbounded call stack space.

Eric Lippert 2010-02-02 23:04:29

Also, the reason to validate a tree's balance without balancing it is to enable you to write a test suite that verifies that your tree balancer is correct!

Eric Lippert 2010-02-02 23:05:30

@Eric: That is the change I applied last night. As for an AVL tester with O(1) memory (given parent links), just use this answer as a template for an O(1)-memory depth-first traversal, and perform a nested depth-first traversal at each node to find the heights of its subtrees. That will use O(1) memory (but O(N) parent links) and O(N^2) time.

Potatoswatter 2010-02-02 23:44:02

@Eric: interestingly, the parent links may actually optimize something in that algorithm: the checking algorithm may be multithreaded and share a common set of parent links, potentially using *less* memory than a multithreaded version where each thread has a stack.

Potatoswatter 2010-02-02 23:48:59

Ah, so you did. Clearly I did not read the change carefully enough!

Eric Lippert 2010-02-03 00:31:27

@Eric: no prob ;v)

Potatoswatter 2010-02-03 00:35:48

Answer 11

+1 A:

Bonus exercise response. The simple solution. Obviously in a real implementation one might wrap this or something to avoid requiring the user to include height in their response.

IsHeightBalanced(tree, out height)
    if (tree is empty)
        height = 0
        return true
    balance = IsHeightBalanced(tree.left, heightleft) and IsHeightBalanced(tree.right, heightright)
    height = max(heightleft, heightright)+1
    return balance and abs(heightleft - heightright) <= 1

Brian 2010-02-02 14:25:40

Nicely done! -------

Eric Lippert 2010-02-02 14:50:21

Answer 12

A:

Here's what i have tried for Eric's bonus exercise. I try to unwind of my recursive loops and return as soon as I find a subtree to be not balanced.

int heightBalanced(node *root){
    int i = 1;
    heightBalancedRecursive(root, &i);
    return i; 
} 

int heightBalancedRecursive(node *root, int *i){

    int lb = 0, rb = 0;

    if(!root || ! *i)  // if node is null or a subtree is not height balanced
           return 0;  

    lb = heightBalancedRecursive(root -> left,i);

    if (!*i)         // subtree is not balanced. Skip traversing the tree anymore
        return 0;

    rb = heightBalancedRecursive(root -> right,i)

    if (abs(lb - rb) > 1)  // not balanced. Make i zero.
        *i = 0;

    return ( lb > rb ? lb +1 : rb + 1); // return the current height of the subtree
}

Sudharsanan 2010-04-07 21:10:43

Answer 13

+2 A:

Balance is a truly subtle property; you think you know what it is, but it's so easy to get wrong. In particular, even Eric Lippert's (good) answer is off. That's because the notion of height is not enough. You need to have the concept of minimum and maximum heights of a tree (where the minimum height is the least number of steps from the root to a leaf, and the maximum is... well, you get the picture). Given that, we can define balance to be:

A tree where the maximum height of any branch is no more than one more than the minimum height of any branch.

(This actually implies that the branches are themselves balanced; you can pick the same branch for both maximum and minimum.)

All you need to do to verify this property is a simple tree traversal keeping track of the current depth. The first time you backtrack, that gives you a baseline depth. Each time after that when you backtrack, you compare the new depth against the baseline

if it's equal to the baseline, then you just continue
if it is more than one different, the tree isn't balanced
if it is one off, then you now know the range for balance, and all subsequent depths (when you're about to backtrack) must be either the first or the second value.

In code:

class Tree {
    Tree left, right;
    static interface Observer {
        public void before();
        public void after();
        public boolean end();
    }
    static boolean traverse(Tree t, Observer o) {
        if (t == null) {
            return o.end();
        } else {
            o.before();
            try {
                if (traverse(left, o))
                    return traverse(right, o);
                return false;
            } finally {
                o.after();
            }
        }
    }
    boolean balanced() {
        final Integer[] heights = new Integer[2];
        return traverse(this, new Observer() {
            int h;
            public void before() { h++; }
            public void after() { h--; }
            public boolean end() {
                if (heights[0] == null) {
                    heights[0] = h;
                } else if (Math.abs(heights[0] - h) > 1) {
                    return false;
                } else if (heights[0] != h) {
                    if (heights[1] == null) {
                        heights[1] = h;
                    } else if (heights[1] != h) {
                        return false;
                    }
                }
                return true;
            }
        });
    }
}

I suppose you could do this without using the Observer pattern, but I find it easier to reason this way.

[EDIT]: Why you can't just take the height of each side. Consider this tree:

        /\
       /  \
      /    \
     /      \_____
    /\      /     \_
   /  \    /      / \
  /\   C  /\     /   \
 /  \    /  \   /\   /\
A    B  D    E F  G H  J

OK, a bit messy, but each side of the root is balanced: C is depth 2, A, B, D, E are depth 3, and F, G, H, J are depth 4. The height of the left branch is 2 (remember the height decreases as you traverse the branch), the height of the right branch is 3. Yet the overall tree is not balanced as there is a difference in height of 2 between C and F. You need a minimax specification (though the actual algorithm can be less complex as there should be only two permitted heights).

Donal Fellows 2010-04-07 21:53:12

Ah, good point. You could have a tree which is h(LL)=4, h(LR)=3, h(RL)=3, h(RR)=2. Thus, h(L)=4 and h(R)=3, so it would appear balanced to the earlier algorithm, but with a max/min depth of 4/2, this is not balanced. This would probably make more sense with a picture.

Tim 2010-04-07 22:04:32

That's what I've just added (with the world's nastiest ASCII graphic tree).

Donal Fellows 2010-04-07 22:10:03

Answer 14

A:

Here's a non recursive solution. Doesn't calculate any height.

int heightBalanced(node* root){

  int flag = 0 ; // tells how many children a node does NOT have. valid values - 0, 1, or 2. if value = 2, then leaf node
  int parentFlag = 0 ; // tells how many children the parent of the current node DOESNOT have. valid values 0 or 1

  node *n;
  Stack s;  // can have this allocated in heap also
  s.push(root);

  do{        // do a pre order traversal using stack 

       n = s.pop();
       if (n -> right) s.push( n -> right);
       else flag++;

       if( n -> left) s.push( n -> left);
       else flag ++;

       if(parentFlag && flag != 2) // parent node has only one child and that child is not leaf. height not balanced.
         return 0;
       parentFlag = (flag == 1 ? 1 : 0) ; // parent flag is set only if flag is 1
       flag = 0 ;

   } while (!s.empty);

   return 1;

}

Sudharsanan 2010-04-07 21:55:48

ok..I just realized that this solution will not work if all nodes have 2 children but the leaves are at different levels! ignore it..

Sudharsanan 2010-04-07 22:21:08

ansaurus

tags:

views:

answers:

How to determine if binary tree is balanced?

related questions