tags:

views:

85

answers:

1

Assume that we have a sequence S with n elements <x1,x2,...,xn>. A pair of elements xi,xj are considered neighbors if |xi-xj|<d, with d a given distance between two neighbors. So how can find out the element that has most neighbors in the sequence?

(A simple way is sorting the sequence and then calculating number of each element but it's time complexity is quite large):

O(nlogn)

May you please help me find a better way to reduce time complexity?

+2  A: 

O(N log N) isn't "quite large"; log is a very slow growing function!

This can not be done in time less than O(N) since you have to read the whole input. You can do it in O(N) if you know the range of the numbers, in which case you can use any of O(N) sorting methods (e.g. counting sort/bucket sort, radix sort).


Here's a O(dN)-time, O(N + k)-space solution based on counting sort, implemented in Java.

static int mostNeighbors(int k, int d, int... nums) {
    boolean[] appears = new boolean[k];
    int[] neighborCount = new int[k];
    int most = 0;
    for (int num : nums) {
        appears[num] = true;
        for (int i = num - d + 1; i < num + d; i++) {
            if (i < 0 || i >= k) continue;
            if (++neighborCount[i] > neighborCount[most] && appears[i]) {
                most = i;
            }
        }
    }
    return most;
}

// ...
System.out.println(mostNeighbors(10, 2, 0,0,0,2,2,2)); // prints 0
System.out.println(mostNeighbors(10, 2, 0,0,0,2,2,2,1)); // prints 1
System.out.println(mostNeighbors(10, 2, 1,2,3,4,5,3)); // prints 2

Basically for each num, it increments neighborCount[num-d+1..num+d-1] that are in range of 0..k-1. Not all of the numbers in this range actually appears in the array, so appears is used to keep track of that. most is used to keep track of the best found so far.

polygenelubricants
As I want to find out the most dense area in a range of value on a large dataset, so if it can be reduced more?
Bao An