views:

5324

answers:

7

Ok, this is another one in the theory realm for the CS guys around.

In the 90's I did fairly well in implementing BST's. The only thing I could bever get my head around was the intricacy of the algorithm to balance a Binary Tree (AVL).

Can you guys help me on this?

+9  A: 

I don't think it's good to post complete codes for node balancing algorithms here since they get quite big. However, the Wikipedia article on red-black trees contains a complete C implementation of the algorithm and the article on AVL trees has several links to high-quality implementations as well.

These two implementations are enough for most general-purpose scenarios.

Konrad Rudolph
Don't forget splay trees.
Nick Johnson
I've unaccepted your answer because it does not explain, only links. My intent was to get a descriptive answer that would help. Thanks anyway!
Gustavo Carreno
Your question asked code, not explanations. Anyway, I didn't intend my answer to be accepted anyway (because it's not really an answer) … Your new question is much better!
Konrad Rudolph
+5  A: 

A scapegoat tree possibly has the simplest balance-determination algorithm to understand. If any insertion causes the new node to be too deep, it finds a node around which to rebalance, by looking at weight balance rather than height balance. The rule for whether to rebalance on delete is also simple. It doesn't store any arcane information in the nodes. It's trickier to prove that it's correct, but you don't need that to understand the algorithm...

However, unlike an AVL it isn't height-balanced at all times. Like red-black its unbalance is bounded, but unlike red-black it's tunable with a parameter, so for most practical purposes it looks as balanced as you need it to be. I suspect that if you tune it too tightly, though, it ends up as bad or worse than AVL for worst-case insertions.

Response to question edit

I'll provide my personal path to understanding AVL trees.

First you have to understand what a tree rotation is, so ignore everything else you've ever heard the AVL algorithms and understand that. Get straight in your head which is a right rotation and which is a left rotation, and what each does to the tree, or else the descriptions of the precise methods will just confuse you.

Next, understand that the trick for balancing AVL trees is that each node records in it the difference between the height of its left and right subtrees. The definition of 'height balanced' is that this is between -1 and 1 inclusive for every node in the tree.

Next, understand that if you have added or removed a node, you may have unbalanced the tree. But you can only have changed the balance of nodes which are ancestors of the node you added or removed. So, what you're going to do is work your way back up the tree, using rotations to balance any unbalanced nodes you find, and updating their balance score, until the tree is balanced again.

The final part of understanding it is to look up in a decent reference the specific rotations used to rebalance each node you find: this is the "technique" of it as opposed to the high concept. You only have to remember the details while modifying AVL tree code or maybe during data structures exams. It's years since I last had AVL tree code so much as open in the debugger - implementations tend to get to a point where they work and then stay working. So I really do not remember. You can sort of work it out on a table using a few poker chips, but it's hard to be sure you've really got all the cases (there aren't many). Best just to look it up.

Then there's the business of translating it all into code.

I don't think that looking at code listings helps very much with any stage except the last, so ignore them. Even in the best case, where the code is clearly written, it will look like a textbook description of the process, but without the diagrams. In a more typical case it's a mess of C struct manipulation. So just stick to the books.

Steve Jessop
I'm accepting your answer because it was what I wanted from the question: A nice description of what should be done.
Gustavo Carreno
Glad you like it - Konrad's links to Wikipedia are also useful, since they provide the details I've left out.
Steve Jessop
+1  A: 

I agree, a complete program would not be appropriate.

While classical AVL and RB tree use a deterministic approach, I would suggest to have a look at "Randomized binary search trees" that are less costly to keep balanced and guarantee a good balancing regardless the statistical distribution of the keys.

Remo.D
A: 

The AVL tree is inefficient because you have to do potentially many rotations per insertion/deletion.

The Red-Black tree is probably a better alternative because insertions/deletions are much more efficient. This structure guarantees the longest path to a leaf is no more than twice the shortest path. So while less balanced than an AVL tree, the worst unbalanced cases are avoided.

If your tree will be read many times, and won't be altered after it's created, it may be worth the extra overhead for a fully-balanced AVL tree. Also the Red-Black tree requires one extra bit of storage for each node, which give the node's "color".

Personally speaking, I have never found an actual *explanation* of RB trees - merely listing of the rules. No one seems to understand the *why* of RB. OTOH, AVL was for me intuitive and straightforward to comprehend; and you should only write code you comprehend.
Blank Xavier
The "why": RB trees are a mapping of 2-3-4 trees onto binary trees, where red edges connect the portions of a split-up 2-3-4 tree node and black edges correspond to the edges in the original 2-3-4 tree.
Jeffrey Hantin
One thing, about the "potentially many rotations per insertion/deletion". AVL insertions only need one single rotation, or double rotation, to restore the balance. But yes, Delete might have up to O(log N) rotations.
otherchirps
+2  A: 

I've been doing some work with AVL trees lately.

The best help I was able to find on how to balance them was through searching google.

I just coded around this pseudo code (if you understand how to do rotations it is pretty easy to implement).

IF tree is right heavy
{
  IF tree's right subtree is left heavy
  {
     Perform Double Left rotation 
  }
  ELSE
  {
     Perform Single Left rotation
  }
}
ELSE IF tree is left heavy
{
  IF tree's left subtree is right heavy
  {
     Perform Double Right rotation
  }
  ELSE
  {
     Perform Single Right rotation
  }
}
mc6688
This part of AVL is fairly straightforward - what gets a bit trickier is updating the balance factors after the rotations.
Blank Xavier
A: 

For balancing an AVL Tree I just wrote a bunch of functions which I thought of sharing with everyone here. I guess this implementation is flawless. Queries/questions/criticism are of course welcome:

http://uploading.com/files/5883f1c7/AVL_Balance.cpp/

Being a novice to Stackoverflow, I tried posting my code here but was stuck with bad formatting issues. So, uploaded the file on the above link.

Cheers.

Rocky
+1  A: 

Good link for AVL tree rotations: http://www.scribd.com/doc/8203669/AVLTree-rotation-Tutorial

Scooby