views:

86

answers:

1

Hi everyone. If I am given a certain set of numbers (which I store in a balanced binary search tree for easiness), then I want to answer a query that requires me to inform what is the ith smallest number between [A,B], what would be a fast algorithm to perform that task?

Technically I could traverse the tree from the root searching for A (or a number immediately greater than that if A not present), than backtrack searching for B (or a number smaller than B), and while doing that I could keep a counter for i, to determine when I would be at the ith number. But that does not seem optimal to me.

Can I perform this operation on O(log n), height of the tree that I'm using to store the universal set of numbers?

Thank you

+5  A: 

You certainly can do that, if you can keep some information in nodes. To make traversal O(logn), we need for each node to know, how many nodes its right subtree has. (You can maintain this information and still have adding to/removing from tree in O(log(n)) time)

def search(currentNode, k)
    if k == 0 then
        return currentNode
    else if currentNode.rightBranchSize >= k then
        // we remove 1 from k because currentNode isn't considered anymore
        return search(currentNode.right, k - 1)
    else
        // we decrease k because currentNode and its whole right subtree aren't considered anymore
        return search(currentNode.parent, k - currentNode.rightBranchSize - 1)
    end
end

The code should be quite self-explanatory.
We start from currentNode being the first node with number >= A and look for k-th element (k is 0-based).

  1. As schnaader in his comment, it would be easier to just traverse tree if k is small.
  2. Most wide-spread libraries (like STL) won't allow you traverse tree in such way (they don't provide any kind of Node struct with pointers to left and right subtrees). So, although algorithm is simple, implementing it might be difficult.
  3. It takes only little modification to consider B from your question.
Nikita Rybak
Hi Nikkita, thanks for the reply. I actually had the algorithm you described in mind. But my thought was that, if I first run a search to find A (and determine my currentNode, using your nomenclature), and I then backtrack my tree, checking the number of elements on the subtrees while searching for the ith number in the interval between A and B(or to determine it does not exist), wouldn't I have to perform a few more operations, and by doing so, my complexity would be greater than O(log n), since that itself was the time it took for me to search A? Thanks a lot ;)
AlexTex
@AlexTex Finding first number >= A takes exactly O(logn) time in binary tree. The algorithm I described also takes O(logn) time. And O(logn + logn) == O(logn), so whole algorithm takes O(logn) time. (assuming you have subtree sizes calculated already)
Nikita Rybak
@Nikkita Rybak Good point ;)
AlexTex