tags:

views:

394

answers:

4

Given two sorted lists, each containing n real numbers, is there a O(log n) time algorithm to compute the element of rank i (where i coresponds to index in increasing order) in the union of the two lists, assuming the elements of the two lists are distinct?

EDIT: @BEN: This i s what I have been doing , but I am still not getting it.

I have an examples ;

List A : 1, 3, 5, 7 List B : 2, 4, 6, 8

Find rank(i) = 4.

First Step : i/2 = 2; List A now contains is A: 1, 3 List B now contains is B: 2, 4

         compare A[i] to B[i] i.e 

                 A[i] is less;

                 So the lists now become :

                   A: 3 
                   B: 2,4

Second Step: i/2 = 1

         List A now contains A:3
         List B now contains B:2 

         NoW I HAVE LOST THE VALUE 4 which is actually the result ...

I know I am missing some thing , but even after close to a day of thinking I cant just figure this one out...

A: 

When merging two lists, you're going to have to touch every element in both lists. If you don't touch every element, some elements will be left behind. Thus your theoretical lower bound is O(n). So you can't do it that way.

You don't have to sort, since you have two lists that are already sorted, and you can maintain that ordering as part of the merge.

jeffamaphone
You don't have to complete the merge, so it's `O(i)` not `O(n)`. But you can avoid the merge completely.
Ben Voigt
This is completely true.
jeffamaphone
A: 

edit: oops, I misread the question. I thought given value, you want to find rank, not the other way around. If you want to find rank given value, then this is how to do it in O(log N):

Yes, you can do this in O(log N), if the list allows O(1) random access (i.e. it's an array and not a linked list).

  • Binary search on L1
  • Binary search on L2
  • Sum the indices

You'd have to work out the math, +1, -1, what to do if element isn't found, etc, but that's the idea.

polygenelubricants
I gathered from the question that he was trying to find the value at index i, so he was given an index first, not a value to search for
Phil
You have it backward (the problem is to find the value given the index, not find the index given the value), but use of binary search is correct.
Ben Voigt
@Ben, @Phil: you're right, I misread. I'm still keeping up my answer for learning purposes.
polygenelubricants
+5  A: 

Yes:

You know the element lies within either index [0,i] of the first list or [0,i] of the second list. Take element i/2 from each list and compare. Proceed by bisection.

I'm not including any code because this problem sounds a lot like homework.

EDIT: Bisection is the method behind binary search. It works like this:

Assume i = 10; (zero-based indexing, we're looking for the 11th element overall).

On the first step, you know the answer is either in list1(0...10) or list2(0...10). Take a = list1(5) and b = list2(5).

If a > b, then there are 5 elements in list1 which come before a, and at least 6 elements in list2 which come before a. So a is an upper bound on the result. Likewise there are 5 elements in list2 which come before b and less than 6 elements in list1 which come before b. So b is an lower bound on the result. Now we know that the result is either in list1(0..5) or list2(5..10). If a < b, then the result is either in list1(5..10) or list2(0..5). And if a == b we have our answer (but the problem said the elements were distinct, therefore a != b).

We just repeat this process, cutting the size of the search space in half at each step. Bisection refers to the fact that we choose the middle element (bisector) out of the range we know includes the result.

So the only difference between this and binary search is that in binary search we compare to a value we're looking for, but here we compare to a value from the other list.

NOTE: this is actually O(log i) which is better (at least no worse than) than O(log n). Furthermore, for small i (perhaps i < 100), it would actually be fewer operations to merge the first i elements (linear search instead of bisection) because that is so much simpler. When you add in cache behavior and data locality, the linear search may well be faster for i up to several thousand.

Also, if i > n then rely on the fact that the result has to be toward the end of either list, your initial candidate range in each list is from ((i-n)..n)

Ben Voigt
+1! And of course the list has to allow `O(1)` random access, but yes, binary search is the key.
polygenelubricants
Hey , Could you guide me as to how to proceed to do bisection. I am new to it and most of the materials I read on the net relates to finding the root of complex numbers. Not sure how to proceed towards using bisection or how it exactly works.
Eternal Learner
reason for downvote?
Ben Voigt
+1  A: 

Here is how you do it.

Let the first list be ListX and the second list be ListY. We need to find the right combination of ListX[x] and ListY[y] where x + y = i. Since x, y, i are natural numbers we can immediately constrain our problem domain to x*y. And by using the equations max(x) = len(ListX) and max(y) = len(ListY) we now have a subset of x*y elements in the form [x, y] that we need to search.

What you will do is order those elements like so [i - max(y), max(y)], [i - max(y) + 1, max(y) - 1], ... , [max(x), i - max(x)]. You will then bisect this list by choosing the middle [x, y] combination. Since the lists are ordered and distinct you can test ListX[x] < ListY[y]. If true then we bisect the upper half our [x, y] combinations or if false then we bisect the lower half. You will keep bisecting until find the right combination.

There are a lot of details I left, but that is the general gist of it. It is indeed O(log(n))!

Edit: As Ben pointed out this actually O(log(i)). If we let n = len(ListX) + len(ListY) then we know that i <= n.

Brian Gideon