views:

354

answers:

6

How can I write an algorithm to check if the sum of any two numbers in an array/list matches a given number with a complexity of nlogn?

+8  A: 

I'm sure there's a better way, but here's an idea:

  1. Sort array
  2. For every element e in the array, binary search for the complement (sum - e)

Both these operations are O(n log n).

Andreas Brinck
Also, after sorting, discard all the numbers which are bigger than sum. I assume that you have all positive integers.
Jack
yeah that's what i though and then i got all busy trying to compute the complexity it never occured to me that both the operations are o(nlogn)
Bunny Rabbit
You don't have to search for every elements actually. Since you've sorted the array, stop searching at `sum/2` (doesn't impact the big O, but still :p)
Matthieu M.
+4  A: 

Use a hash table. Insert every number into your hash table, along with its index. Then, let S be your desired sum. For every number array[i] in your initial array, see if S - array[i] exists in your hash table with an index different than i.

Average case is O(n), worst case is O(n^2), so use the binary search solution if you're afraid of the worst case.

IVlad
I don't get the worst case. Using a hash table should get you `O(1)` on insertion and look-up (with some care) or do you account for potential hash conflicts ? Anyway this seems even better than using binary search.
Matthieu M.
Yes, there could be hash conflicts. For example, consider the array `{1, 1, 1, ..., 1, 1, 1, ...}` that contains `n` ones. We want to make the sum 20. You would add all the ones in the same position in the hash table. Then, for each 1 you would check if 19 is in the array. Let's assume that the value 19 maps to the same location in your hash table that the value 1 does. This means that for each of the `n` ones you will do `n` checks in your hash table, giving you quadratic time complexity. Similar examples can be found for the case where there is a solution (use 3 values - one dominant)
IVlad
you could map the frequency as the value for each associated number. so if you have 8 1s, the hash will map `1 => 8`. When a sum of 2 is desired, for example, you put an additional check - if the current number equals the target number `(S - array[i])`, then it's frequency must be at least 2.
Anurag
+2  A: 

Let us say that we want to find two numbers in the array A that when added together equal N.

  1. Sort the array.
  2. Find the largest number in the array that is less than N/2. Set the index of that number as lower.
  3. Initialize upper to be lower + 1.
  4. Set sum = A[lower] + A[upper].
  5. If sum == N, done.
  6. If sum < N, increment upper.
  7. If sum > N, decrement lower.
  8. If either lower or upper is outside the array, done without any matches.
  9. Go back to 4.

The sort can be done in O(n log n). The search is done in linear time.

Matthew T. Staebler
+1  A: 
def sum_in(numbers, sum_):
    """whether any two numbers from `numbers` form `sum_`."""
    a = set(numbers) # O(n)
    return any((sum_ - n) in a for n in a) # O(n)

Example:

>>> sum_in([200, -10, -100], 100)
True
J.F. Sebastian
Note: this solution allows repetition i.e., it doesn't distinguish `[1]` and `[1, 1]` cases.
J.F. Sebastian
+1  A: 

Here's a try in C. This isn't marked homework.

// Assumes a sorted integer array with no duplicates
void printMatching(int array[], int size, int sum)
{
   int i = 0, k = size - 1;
   int curSum;
   while(i < k)
   {
      curSum = array[i] + array[k];
      if(curSum == sum)
      {
         printf("Found match at indices %d, %d\n", i, k);
         i++;k--;
      }
      else if(curSum < sum)
      {
         i++;
      }
      else
      {
         k--;
      }
   }
}

Here is some test output using int a[] = { 3, 5, 6, 7, 8, 9, 13, 15, 17 };

Searching for 12..
Found match at indices 0, 5
Found match at indices 1, 3
Searching for 22...
Found match at indices 1, 8
Found match at indices 3, 7
Found match at indices 5, 6
Searching for 4..
Searching for 50..

The search is linear, so O(n). The sort that takes place behind the scenes is going to be O(n*logn) if you use one of the good sorts.

Because of the math behind Big-O, the smaller term in additive terms will effectively drop out of your calculation, and you end up with O(n logn).

Phil
A: 

This can be done in O(n) using a hash table. Initialize the table with all numbers in the array, with number as the key, and frequency as the value. Walk through each number in the array, and see if (sum - number) exists in the table. If it does, you have a match. After you've iterated through all numbers in the array, you should have a list of all pairs that sum up to the desired number.

array = initial array
table = hash(array)
S = sum

for each n in array
    if table[S-n] exists
        print "found numbers" n, S-n

The case where n and table[S-n] refer to the same number twice can be dealt with an extra check, but the complexity remains O(n).

Anurag