views:

1065

answers:

6

How do I find the distance between two nodes in a binary tree? Equivalently, what algorithms are there for finding the most recent common ancestor (lowest common ancestor) of two nodes?

+3  A: 

Finding the common ancestor is almost certainly the easier task. This is a pretty simple one: start from the root of the tree, and descend the tree until you reach a node where you would have to descend to different children to get to the two nodes in question. That node is the common parent (assuming the tree contains both nodes, of course).

Jerry Coffin
For the distance between the two nodes, do a find for each node from the common parent while counting the edges traversed. The distance is the sum of the two edge counts.
Pascal Cuoq
If you have an "up" link at each node *and* a way of comparing two nodes to see if one is to the left of the other (e.g. it's a binary search tree), then you can find the common ancestor without starting at the top. But it's kind of nontrivial, and it would only be better when the two nodes are expected to be close together.
Jason Orendorff
It's not that simple. If you want to go fast, you should do the computation bottom-up (like the received answer does), not top-down as you propose, because if going top-down how can you find/guess the path that leads to the two nodes ? You'll have to do some expensive search.
@fred-hh: first of all, the accepted answer also proposes a top-down solution. Second, if you only want the common ancestor, what it advises is a considerably MORE expensive algorithm -- what I propose only descends the tree to the point that you reach the common ancestor, whereas the accepted answer requires descending the tree all the way to both nodes in question. Third, unless there's more to the tree than stated in the question, a bottom-up solution simply **is not possible**. A normal binary tree doesn't have any pointers from leaf nodes to root nodes, only from root to leaves.
Jerry Coffin
@Jerry Coffin: What fred-hh is saying is that it is not given that the tree is a binary search tree. Only a binary tree. The data is not necessarily ordered at all.
Jason Orendorff
@Jason Orendorff:Ah, I hadn't thought of that. Yes, given a tree with the nodes in random order, the search can become unreasonable. At the same time, this seems to make little real sense if the elements are in a random order. For that matter, storing items in a tree at all makes little sense unless you're maintaining an order.
Jerry Coffin
Lots of trees aren't ordered—parse trees, DOM documents, filesystems, Lisp data structures built out of cons cells—but yeah, those are almost never binary trees. (Finding the common ancestor of two DOM nodes or two files is a reasonable thing to want to do, but those aren't binary trees...)
Jason Orendorff
A: 

Make two sets consisting of the ancestors of each: while the union of the sets is empty, add the next ancestor of each node to the appropriate list. Once there is a common node, that's the common ancestor.

Eli
+3  A: 
  1. calculate the list of ancestors for each node
  2. find the common prefix
  3. the last element from the common prefix is the lowest common ancestor
  4. remove the common prefix from both ancestor lists
  5. the distance is the sum of lengths of the remaining lists +1
Javier
IMO, this is not a particularly good answer to the question as it was asked. This requires descending the tree all the way to both nodes in question, even though the common ancestor can be determined by descending only to that common ancestor. If the common ancestor is a sufficient answer, descending all the way to both nodes to find it is pointlessly inefficient.
Jerry Coffin
@Jerry Coffin: totally agree, it's _far_ from being an optimal answer. it's just a very simple one, and for most non-huge trees it's quick enough (linear on tree depth). Another plus of this simplistic answer is that you only need high-level access to the tree libraries, you don't have to be able to descend the tree yourself.
Javier
+1  A: 

This article (from TopCoder) has pretty detailed discussion on various methods to solve the LCA problem (as well as the related Range Minimum Query problem). The O(sqrt(height)) solution is quite fast and is the easiest to code.

MAK
+3  A: 

As everybody here seems to know, if you keep a note of the distance each node is from the root, then once you have found the lowest common ancestor of the two nodes you can work out the distance they are from each other in constant time.

If you do one time work only linear in the size of the tree it turns out that you can then find the lowest common ancestor of any two nodes in constant time (no matter how deep the tree is). See http://en.wikipedia.org/wiki/Lowest_common_ancestor

The Baruch Schieber and Uzi Vishkin algorithm for lowest common ancestor is entirely practical to use and to program.

mcdowella
Carl