Finding the dominating pairs in a set of coordinates in O(nlgn)

views:

120

answers:

Finding the dominating pairs in a set of coordinates in O(nlgn)

a coordinate a(xa,ya) dominates b(xb,yb) if ( xa>=xb and ya>=yb) how can I find all pairs in a set of coordinates in nlgn using divide and conquer?

edit:the number of pairs instead.

Do a quick sort where your comparison is first to sort by X then Y ( so you'd get something like 5,3 5,2 4,7 4,2 etc. Quicksort is nlogn

Then just iterate from the highest point down doing your compare. That would be at most O(n). You end up with O(n) + O(nlogn) => O(nlogn)

Quicksort uses divide and conquer - it divides on the pivot.

EDIT:

Another thing I considered. You can walk the entire set and put all the points that are dominated in the X coordinate by your point in a set. Then, walk that smaller subset and filter out the ones that are also dominated by your Y. This is just two walks, for O(n) performance.

Anon 2010-06-27 02:29:11

That will find one pair in `O(n lg n)`, but he wants to find all pairs.

danben 2010-06-27 02:33:24

but to find all pairs this algorithm would take O(N^2)

ryanxu 2010-06-27 02:37:28

O(n logn) is quicksort average. Worst case is O(n^2), So I believe you can't sort here

belisarius 2010-06-27 03:07:50

You could instead use a different sorting algorithm like merge sort which is O(n log n) in both average and worst cases. (But quicksort is usually faster and you almost never end up with the worst case scenario.)

Ben Alpert 2010-06-27 03:11:31

You should a STABLE QuickSort.

Fabio F. 2010-06-27 09:12:25

@Fabio: QuickSort is not a stable algorithm in efficient implementations. If you need stability, you should usually use Merge Sort. There's no need for a stable sort here in any case.

Billy ONeal 2010-06-27 23:02:28

+1 A:

Roughly speaking, any given vector (xa,ya) will dominate or be dominated by about half of the other vectors (ya,yb), because among the four cases for {xa <=> ya, xb <=>yb}, two are cases of dominance.

So we expect the solution to your problem to comprise about n*(n/2) pairs of vectors. The algorithm can't be cheaper than its solution, so n*ln(n) is not going to work.

Josephine 2010-06-27 02:57:13

This is what I found,but I dont quite understand it.http://www.sable.mcgill.ca/~nngman/comp507/ especially "If we choose to sort all the points in their y-coordinates, then binary search can be used to identify all points in A that are dominated in just O(log N) time for each point in B. "

ryanxu 2010-06-27 03:03:22

@ryanxu: The page you linked to is about finding the _number_ of points dominated by each point and able to solve range queries efficiently. What you said in the question is quite different.

Moron 2010-06-27 03:17:34

If I want to find the number of pairs, how would i do it in nlgn

ryanxu 2010-06-27 03:25:30

@ryanxu: Use the algorithm in the page you linked, and add up the values for each point.

Moron 2010-06-27 03:41:30

@Moron, is there a pseudo code for it?

ryanxu 2010-06-27 03:57:27

@Moron: I am wondering how they did this step"If we choose to sort all the points in their y-coordinates, then binary search can be used to identify all points in A that are dominated in just O(log N) time for each point in B. " I am not sure how they do this. To sort all the y-coordinates will take nlgn time each time, for the a side

ryanxu 2010-06-27 04:58:32

@ryanxu: You only need to sort once.

Moron 2010-06-27 08:16:31

@Moron, the size of the A is always changing right. So dont you have to sort it for every single split?

ryanxu 2010-06-27 18:25:46

@ryanxu. No, you can sort all of the points by y in O(nlogn) time before even splitting. While splitting you can use another array which contains the points of A in sorted by y order and use that for B.

Moron 2010-06-27 20:27:13

lets say you have the points such that

a1 > a2 > a3 ... > an-1 > an d (read > as dominates) n is proporional to the number of points (N) in all situations.

the number of pairs itself is greater than nlogn (such as (A[1],A[2..n]) (A2], A[3..n]) ..A[n=1], A[n]) I think that is n (n-1) / 2.

so N logN doesn't seem to be possible

neal aise 2010-06-27 03:12:31

Assuming that you only need to count the number of pairs, you can take the sweeping approach:

1) Sort the points according to their X values

2) Collect the points into a search tree.

Here we use a balanced tree based on the Y values of the points. The tree should have a counter per internal node, indicating the number of items in the subtree rooted by it. These counters can be maintained without any impact on the time complexity of the tree operations. The usage of counters allows querying the number of items lower than a given value V, in logarithmic time.

More details of (2): we scan the points obtained in step (1) from left to right. For each point P traversed, we add P to the tree, and then compute the number of items with Y < P.Y. This number is added to the global count which is returned at the end.

Step (1) runs in N*Log(N) time, and step (2) performs N iterations of two Log(N) operations, therefore it has the same complexity.

Eyal Schneider 2010-06-27 08:33:45

ansaurus

tags:

views:

answers:

Finding the dominating pairs in a set of coordinates in O(nlgn)

related questions