I have a library of linked list/binary tree methods for use when standard containers don't fit - e.g. when there are different types of nodes, or when I need to convert from binary tree to list and back again. It includes red-black tree handling.

One of the methods converts a doubly linked list to a perfectly balanced simple binary tree in O(n) time (given that the number of items is known in advance). The algorithm is known as "folding" - it's the second half of a binary tree rebalancing algorithm that was once published in Dr. Dobb's, IIRC. The steps are basically...

  • Given the size of the tree, decide on the sizes of the left and right subtrees

  • Recurse for the left subtree

  • Pop a node from the list to use as the root

  • Recurse for the right subtree

  • Link the subtrees to the root
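A minimal sketch of those steps, for illustration only. It is not the library's code: the `Node` class, the `fold` name, the even left/right split, and the use of a Python iterator in place of the doubly linked list are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Node:
    # Hypothetical node type; the real library presumably reuses the list nodes.
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def fold(items: Iterator[int], size: int) -> Optional[Node]:
    """Build a balanced tree from the next `size` items of an in-order sequence."""
    if size == 0:
        return None
    # Decide the subtree sizes (assumed split: as even as possible).
    left_size = (size - 1) // 2
    right_size = size - 1 - left_size
    left = fold(items, left_size)    # recurse for the left subtree
    root = Node(next(items))         # pop the next item to use as the root
    right = fold(items, right_size)  # recurse for the right subtree
    root.left, root.right = left, right  # link the subtrees to the root
    return root


def height(node: Optional[Node]) -> int:
    """Number of levels in the tree (0 for an empty tree)."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def inorder(node: Optional[Node]) -> List[int]:
    """Keys in left-root-right order; should match the original list order."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)
```

Each item is consumed exactly once and each call does constant work besides recursing, which is where the O(n) bound comes from. For example, `fold(iter(range(10)), 10)` yields a tree whose in-order traversal is `0..9`.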

I also have a similar method that creates a red-black tree. The principle is the same, but the recursion keeps track of node height - height zero nodes are created red, all others are black. The starting height calculation is based on the highest set bit in the tree size, and is fiddled so that a perfectly balanced (2^n)-1 sized tree has only black nodes (the recursion only goes down to height one).

The point here is that I only have red nodes at the leaf level, and a maximum of precisely half the nodes are red.
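A rough sketch of how the red-black variant might look, under stated assumptions: the names `RBNode`, `fold_rb`, and `build_rb` are made up, the split is the same even split as in plain folding, and the starting height is taken as the highest set bit of `size + 1`, which is one way to get the "(2^n)-1 gives all black" behaviour described above. The actual library may compute it differently.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional

RED, BLACK = "red", "black"


@dataclass
class RBNode:
    # Hypothetical node type for illustration.
    key: int
    color: str
    left: Optional["RBNode"] = None
    right: Optional["RBNode"] = None


def fold_rb(items: Iterator[int], size: int, height: int) -> Optional[RBNode]:
    """Fold `size` items into a subtree whose root sits at the given height."""
    if size == 0:
        return None
    left_size = (size - 1) // 2
    left = fold_rb(items, left_size, height - 1)
    # Height-zero nodes are created red, all others black.
    root = RBNode(next(items), RED if height == 0 else BLACK)
    right = fold_rb(items, size - 1 - left_size, height - 1)
    root.left, root.right = left, right
    return root


def build_rb(keys: Iterable[int], size: int) -> Optional[RBNode]:
    # Assumed starting height: index of the highest set bit of size + 1.
    # For size == 2**k - 1 the recursion stops at height one, so no node is red.
    start_height = (size + 1).bit_length() - 1
    return fold_rb(iter(keys), size, start_height)


def black_height(node: Optional[RBNode]) -> int:
    """Return the black height, raising ValueError if a colour rule is broken."""
    if node is None:
        return 0
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                raise ValueError("red node has a red child")
    lh, rh = black_height(node.left), black_height(node.right)
    if lh != rh:
        raise ValueError("unequal black heights")
    return lh + (1 if node.color == BLACK else 0)


def count_red(node: Optional[RBNode]) -> int:
    if node is None:
        return 0
    return (node.color == RED) + count_red(node.left) + count_red(node.right)
```

With this split, a 7-item tree comes out all black, while a 6-item tree gets 3 red leaves, which is the "precisely half" worst case.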

The thing is, while this is a simple way to generate a valid red-black tree, it isn't the only option. Avoiding having all leaves red in a perfectly balanced tree was an arbitrary choice. I could have alternating layers of red and black nodes. Or I could reduce the number of red nodes dramatically in some cases by spotting subtrees that are perfectly balanced and (if it needs red nodes) making the subtree root red instead of all its leaves.
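To make the alternatives concrete, here is a small sketch that colours a perfectly balanced 7-node tree layer by layer and checks each colouring. The tuple representation and the `perfect`, `black_height`, and `count_red` helpers are assumptions for illustration, not part of any library.

```python
def perfect(colors_by_level):
    """Perfect tree as nested (color, left, right) tuples, one colour per level."""
    if not colors_by_level:
        return None
    sub = perfect(colors_by_level[1:])
    return (colors_by_level[0], sub, sub)  # sharing is safe: tuples are immutable


def black_height(tree):
    """Black height if the red and black-height rules hold, else None."""
    if tree is None:
        return 0
    color, left, right = tree
    if color == "R":
        for child in (left, right):
            if child is not None and child[0] == "R":
                return None  # red node with a red child
    lh, rh = black_height(left), black_height(right)
    if lh is None or lh != rh:
        return None
    return lh + (1 if color == "B" else 0)


def count_red(tree):
    if tree is None:
        return 0
    return (tree[0] == "R") + count_red(tree[1]) + count_red(tree[2])


all_black = perfect("BBB")    # no red nodes, black height 3
red_leaves = perfect("BBR")   # 4 red leaves, black height 2
alternating = perfect("BRB")  # 2 red nodes in the middle layer, black height 2
```

The last two have the same black height, but `alternating` uses 2 red nodes where `red_leaves` uses 4. That is the "make the subtree root red instead of all its leaves" saving on a small scale.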

The question is - is there any practical reason to choose one valid red-black tree form over another?

This is pure curiosity - I know I don't have any practical reason - but does anyone know of a specialist application where this choice is significant?

+1  A: 

The short answer is: it depends.

Basically, any valid tree will suffice. However, in terms of amortized analysis, you may well want to choose the valid tree that gives you the best behavior in the long run.

For example, if you always choose a valid tree that is prone to lots of rebalancing operations, you will get bad amortized performance. An obvious example is a fully black tree, which is perfectly valid, yet performs badly when modified.

It depends, because this will usually be application-specific.

Yuval A
Why is a fully black tree bad? I assumed it'd be good for inserting more items - you just get new red leaf nodes with no rebalancing needed. A delete is worse - I can see that having red nodes at leaf level is handy as a supply of replacements for the deleted nodes - but in my experience, random deletions are relatively unusual. Mostly I delete everything. Next most often is selective deletion in a traversal - I ignore the red-black delete algorithm and deconstruct-select-and-rebuild in O(n) time. My red-black delete is a bit neglected, really.
Steve314
@Steve - that is exactly why I said it's application specific :)
Yuval A