views:

179

answers:

4

The OCaml standard library has a wonderful Set implementation that uses a very efficient divide-and-conquer algorithm to compute the union of two sets. I believe it takes whole subtrees (not just single elements) from one set and inserts them into the other set, rebalancing when necessary.

I'm wondering if this requires the height information that is kept in the AVL tree that OCaml uses or if this is also possible with red-black trees. For example, is it possible to concatenate a pair of red-black trees more efficiently than simply iterating over the second tree appending its elements to the end of the first tree?

A: 

You can copy the elements in linear time using a R-B tree. I guess in theory you could get sublinear complexity by transplanting an entire subtree at once -- a R-B tree stores relatively subtree height information that's kind of similar to an AVL tree, so you should be able to find the deepest part of the new subtree by traversing only the deeper of the sub-trees, in ~log(N) time. That would tell you how much you need to adjust your existing (target) tree so you can splice that entire sub-tree on and have the result balanced.

After that, you'd often have to re-balance though, especially if the target tree was smaller than the source. For an extreme case, consider a target with a single node and a source that's (say) 10 levels deep. You can't do anything to a single node so that you can splice something 10 levels deep onto it and have it immediately come out balanced. You'd probably want to avoid this (at least in general) by using the smaller tree as the source, and the larger as the target. You might still need a post-splice re-balance, but it'd be a lot less common this way.

Jerry Coffin
A: 

You may win something when you'll combine tree with low overlap, but in general you'll have to reorgainze nodes. With balancing you have worse situation as there probably rules for rotating after touching only one node.

ony
+3  A: 

Since you seem be talking about Concatenate + adding onto the end, it seems like you have the following problem:

Given two red-black trees T1 and T2, such that keys of T1 <= keys of T2,
find union of the two.

This is called the join operation on the trees and in this case, it is possible to do the join of the trees in O(log n) time, check out: http://www.cs.tau.ac.il/~wein/publications/pdfs/rb_tree.pdf

Also check out: http://net.pku.edu.cn/~course/cs101/resource/Intro2Algorithm/book6/chap14.htm, Problem 14.2.

Moron
Looks like they only did it in O(log n) by augmenting the tree with height information in each node, at which point it is no longer an ordinary red-black tree.
Jon Harrop
@Jon. Even with that information, we can consider it a red-black tree. It does not change the fact that inserts/deletes etc are still O(logn) and the node colours follow the rb tree invariants. In any case, I don't see how else :-)
Moron
@Moron: The only advantage of a red-black tree is that the auxiliary information (red or black) is only 1-bit per branch. By adding height, you've lost that advantage and might as well just use a height-balanced tree instead.
Jon Harrop
Moron
How do you do it in O(log^2 n)?
Jon Harrop
@Jon: I didn't look into the details of the algorithm, but whenever you need the height, just recalculate it in O(logn) time. At worst, that would make the algorithm O(log^2(n)).
Moron
+9  A: 
jbapple
Upvote has fixed your karma problem. Any chance you could come back and edit? This seems like a potentially good answer that could be significantly improved with clickable links.
spong
Wonderful, thank you!
Jon Harrop
@spong: yes, I'll do that right now. Thanks for reminding me.
jbapple
+1. for a well researched answer!
Moron