tags:

views:

587

answers:

3

Here are two integers set, say A and B and we can get another set C, in which every element is sum of element a in A and element b in B.

For example, A = {1,2}, B = {3,4} and we get C = {4, 5, 6} where 4=1+3, 5=1+4=2+3, 6=2+4

Now I want to find out which number is the kth largest one in set C, for example 5 is 2nd largest one in above example.

Is there a efficient solution?

I know that pairwise sums sorting is an open problem and has a n^2 lower time bound. But since only kth largest number is needed, maybe we can learn from the O(n) algorithm of finding median number in an unsorted array.

Thanks.

A: 

Well, O(n) would be a lower bound (probably not tight though), otherwise you could run the O(n) algorithm n times to get a sorted list in O(n^2).

Can you assume the two sets are sorted (you present them in sorted order above)? If so, you could possibly get something with an average case that's decently better by doing an "early out", starting at the last pair of elements, etc. Just a hunch though.

Chris Simmons
+2  A: 

If k is very close to 1 or N, any algorithm that generates the sorted sums lazily could simply be run until the kth or N-kth item pops out.

In particular, I'm thinking of best-first search of the following space: (a,b) means the ath item from A, the first list, added to the bth from B, the second.

Keep in a best=lowest priority queue pairs (a,b) with cost(a,b) = A[a]+B[b].

Start with just (1,1) in the priority queue, which is the minimum.

Repeat until k items popped: 
 pop the top (a,b)
 if a<|A|, push (a+1,b)
 if a=1 and b<|B|, push (a,b+1)

This gives you a saw-tooth comb connectivity and saves you from having to mark each (a,b) visited in an array. Note that cost(a+1,b)>=cost(a,b) and cost(a,b+1)>=cost(a,b) because A and B are sorted.

Here's a picture of a comb to show the successor generation rule above (you start in the upper left corner; a is the horizontal direction):

|-------
|-------
|-------

It's just best-first exploration of (up to) all |A|*|B| tuples and their sums.

Note that the most possible items pushed before popping k is 2*k, because each item has either 1 or 2 successors. Here's a possible queue state, where items pushed into the queue are marked *:

|--*----
|-*-----
*-------

Everything above and to the left of the * frontier has already been popped.

For the N-k<k case, do the same thing but with reversed priority queue order and exploration order (or, just negate and reverse the values, get the (N-k)th least, then negate and return the answer).

See also: sorted list of pairwise sums on SO, or the Open problems project.

wrang-wrang
Yes, this is exactly a nice solution. In such a case, if we have sorted A and B, we can get kth largest with O(min(klgk, klgn)), right?
ibread
+2  A: 

Sort arrays A & B : O(mlogm + nlogn) Apply a modified form of algorithm for merging 2 sorted arrays : O(m+n) i.e. at each point, u sum the the two elements. When u have got (m+n-k+1)th element in C, stop merging. That element is essentially kth largest. E.g. {1,2} & {3,4} : Sorted C: {1+3,(1+4)|(2+3),2+4}

avd