views:

187

answers:

1

I am writing a sliding window compression algorithm (LZ77) that searches for phrases in a "moving" dictionary.

So far I have written a BST where each node is stored in an array and it's index in the array is also the value of the starting position in the window itself.

I am now looking at transforming the BST to an AVL tree. I am a little confused at the sample implementations I have seen. Some only appear to store the balance factors whereas others store the height of each tree.

Are there any performance advantage/disadvantages of storing the height and/or balance factor for each node? Apologies if this is a very simple question, but I'm still not visualizing how I want to restructure my BST to implement height balancing.

Thanks.

+4  A: 

The balance is simply the difference between the two heights, so there shouldn't be any significant performance differences, except when the integer subtraction is implemented very slowly. ;) Storing the heights may require more memory if you simply store them as ints, without compressing both into a single int, which you could safely do as the balance guarantees a practical limit to the maximum height. But premature optimization, you know... With the heights you have more information available that you would need to calculate with an additional subtree traversal when you only have the balance available, but that depends on your requirements.

I recommend a deep study of Ben Pfaff's excellent introduction to BSTs: http://www.stanford.edu/~blp/avl/libavl.html/

Secure
Thanks for the link. I thought if storing the heights rather than balance factor, you must ascend up the tree doing corrections up until the root. I guess the same applies to balance factors though.
nn