How can I write an algorithm to check if the sum of any two numbers in an array/list matches a given number
with a complexity of nlogn
?
views:
354answers:
6I'm sure there's a better way, but here's an idea:
- Sort array
- For every element e in the array, binary search for the complement (sum - e)
Both these operations are O(n log n)
.
Use a hash table. Insert every number into your hash table, along with its index. Then, let S
be your desired sum. For every number array[i]
in your initial array, see if S - array[i]
exists in your hash table with an index different than i
.
Average case is O(n)
, worst case is O(n^2)
, so use the binary search solution if you're afraid of the worst case.
Let us say that we want to find two numbers in the array A that when added together equal N.
- Sort the array.
- Find the largest number in the array that is less than N/2. Set the index of that number as lower.
- Initialize upper to be lower + 1.
- Set sum = A[lower] + A[upper].
- If sum == N, done.
- If sum < N, increment upper.
- If sum > N, decrement lower.
- If either lower or upper is outside the array, done without any matches.
- Go back to 4.
The sort can be done in O(n log n). The search is done in linear time.
def sum_in(numbers, sum_):
"""whether any two numbers from `numbers` form `sum_`."""
a = set(numbers) # O(n)
return any((sum_ - n) in a for n in a) # O(n)
Example:
>>> sum_in([200, -10, -100], 100)
True
Here's a try in C. This isn't marked homework.
// Assumes a sorted integer array with no duplicates
void printMatching(int array[], int size, int sum)
{
int i = 0, k = size - 1;
int curSum;
while(i < k)
{
curSum = array[i] + array[k];
if(curSum == sum)
{
printf("Found match at indices %d, %d\n", i, k);
i++;k--;
}
else if(curSum < sum)
{
i++;
}
else
{
k--;
}
}
}
Here is some test output using int a[] = { 3, 5, 6, 7, 8, 9, 13, 15, 17 };
Searching for 12..
Found match at indices 0, 5
Found match at indices 1, 3
Searching for 22...
Found match at indices 1, 8
Found match at indices 3, 7
Found match at indices 5, 6
Searching for 4..
Searching for 50..
The search is linear, so O(n). The sort that takes place behind the scenes is going to be O(n*logn) if you use one of the good sorts.
Because of the math behind Big-O, the smaller term in additive terms will effectively drop out of your calculation, and you end up with O(n logn).
This can be done in O(n)
using a hash table. Initialize the table with all numbers in the array, with number as the key, and frequency as the value. Walk through each number in the array, and see if (sum - number)
exists in the table. If it does, you have a match. After you've iterated through all numbers in the array, you should have a list of all pairs that sum up to the desired number.
array = initial array
table = hash(array)
S = sum
for each n in array
if table[S-n] exists
print "found numbers" n, S-n
The case where n and table[S-n] refer to the same number twice can be dealt with an extra check, but the complexity remains O(n)
.