views:

1587

answers:

2

Does anyone know how to figure out search time for a binary search tree(i.e. worst-case, best-case, and average-case)?

+6  A: 

Worst case is O(n), which is for the degenerate binary tree (a linked list).

In this case, you have to search, on average, half the list before finding your desired element.

Best case is O(log2 n) for a perfectly balanced tree, since you cut the search space in half for every tree level.

Average case is somewhere in between those two and depends entirely on the data :-)

Since you rarely get to control the sequence in which data is inserted into a tree, balanced trees are usually preferable since, while they add a small amount of time to each insertion or deletion, they greatly speed up searching. Their worst case is so much better than unbalanced trees.

                 8
         _______/ \_______
        /                 \
       4                  12
    __/ \__             __/ \__
   /       \           /       \
  2         6        10        14
 / \       / \       / \       / \
1   3     5   7     9  11    13  15

In this perfectly balanced tree, you can see that you get 2n-1 nodes for every n levels. That means for 15 nodes, you never have to search more than four nodes to find it (e.g., to find 13, you search 8, 12, 14 and 13). That's where the log2n figure comes from.

A degenerate unbalanced tree, as already stated, is a linked list. If your data arrived in sequence and you inserted it into an unbalanced binary tree, you'd get:

1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 9 -+
                                           |
+------------------------------------------+
|
+-> 10 -> 11 -> 12 -> 13 -> 14 -> 15

To find 13 in that case, you'd need to search 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 and 13, hence O(n).

paxdiablo
but what if its completely random numbers and no where near balanced, is there some kind of method to figure out the cases?
No, because it depends both on your tree and what you're searching for.
paxdiablo
Algorithms exist to balance trees on insert/delete. Thus, random data is no problem for the right implementation.
Daniel Spiewak
That would have been a *balanced* or *self-balancing* binary tree. In my view, a binary tree is the simplistic one.
paxdiablo
I agree, but I think the information is worth presenting since the intuitive analysis can often be wrong if balanced trees are not well understood.
Daniel Spiewak
+6  A: 

Might want to tag this one as "homework". Here's a good starting point: http://en.wikipedia.org/wiki/Binary_search_tree

In general, a balanced binary search tree has a worst-case lookup of O(log n), best case of O(1) (when the desired value is the root) and an average case of O(log n) (the leaves contain exponentially more values than their parents).

The worst case is the most interesting and is easily seen by recognizing that the first level of a binary tree has 1 node, the second has 2, the third has 4 and so on. Thus, the number of nodes in a binary tree of depth n is precisely 2^n - 1. The mathematical inverse of the exponential function is the logarithm, thus: O(log n).

An unbalanced tree can be as bad as a linked list and may have a shape like the following:

  1
 / \
    2
   / \
      3
     / \
        4
       / \

In this situation, the worst-case access time is O(n).

Daniel Spiewak
An unbalanced tree is any tree that's not perfectly balanced (i.e., depth of one subtree is different to another subtree). What you're referring to (the linked list) is a degenerate tree.
paxdiablo
Changing my answer to be slightly more accurate. In general, if you do not specify "balanced", then the worst case is O(n), regardless of whether it is *actually* degenerate.
Daniel Spiewak