Hi,

Design an algorithm to find all pairs of integers within an array that sum to a specified value.

This was not an assignment question. I tried this problem using a hash table to store sums of array elements, but it is not an efficient solution, so I want to know what other Stack Overflow readers think about it.

Thanks.

+2  A: 

How about sorting the array, then marching in from both ends?

Beta
That's exactly where I was heading. It keeps the number of operations to a minimum.
Bomlin
The sorting of the array itself can further be optimized by keeping track of the smallest and largest numbers during the initial pass, and computing a maximum (and possibly a minimum) threshold that can be used to eliminate numbers that are too big (and possibly too small) during the next iteration.
mjv
You can discard numbers that are too big only if you assume that the integers are non-negative. And how can you discard a number for being too small?
Beta
I agree with @Beta. It gets too complicated if you have negative numbers, which were not mentioned in the question. That's why I resorted to a full sort and binary search.
Ayman
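A minimal sketch of the sort-and-march-inward idea (Python used purely for illustration; the function and variable names are my own):

```python
def pairs_with_sum(values, target):
    """Sort, then walk inward from both ends looking for pairs."""
    nums = sorted(values)
    lo, hi = 0, len(nums) - 1
    pairs = []
    while lo < hi:
        s = nums[lo] + nums[hi]
        if s == target:
            pairs.append((nums[lo], nums[hi]))
            lo += 1
            hi -= 1
        elif s < target:
            lo += 1   # sum too small: advance the low end
        else:
            hi -= 1   # sum too large: retreat the high end
    return pairs
```

After sorting, each iteration discards one element, so the march itself is O(n); the sort dominates at O(n log n). Negative numbers are handled naturally, since sorting places them first.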
+2  A: 

Assume required sum = R

  1. sort the array
  2. for each number in the array A(n), do a binary search to find the number A(x) such that A(n) + A(x) = R
Ayman
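A sketch of the two steps above, assuming Python and its standard `bisect` module (the restriction of each search to indices after the current element, so each pair is reported once, is my own addition):

```python
import bisect

def pairs_by_binary_search(values, r):
    """For each element a, binary-search the sorted array for r - a."""
    nums = sorted(values)            # step 1: O(n log n)
    pairs = []
    for i, a in enumerate(nums):     # step 2: n binary searches, O(n log n)
        # only search to the right of i, so each pair is found once
        j = bisect.bisect_left(nums, r - a, i + 1)
        if j < len(nums) and nums[j] == r - a:
            pairs.append((a, r - a))
    return pairs
```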
No need to search - you can do this much more efficiently than that.
Nick Johnson
Efficiently in what sense? The sort is O(n log n), the cost of n binary searches is only O(n log n). Optimizing the searches is therefore pretty pointless. Do you mean you have an answer better than O(n log n) overall?
chrispy
Isn't binary search O(log n)?
Ayman
If the array is already sorted, your algorithm is O(n log n), but O(n) is possible.
leiz
Indeed, the sort is O(n log n), but the search can be O(n) by moving in from each end of the array. The overall complexity is still O(n log n), but that doesn't mean that the latter solution isn't strictly more efficient.
Nick Johnson
A: 

I don't see why the hash table approach is inefficient, at least in algorithm-analysis terms; admittedly, in memory-locality terms it can be quite bad. Anyway, scan the array twice...

First scan - put all the array elements in the hash table - amortized O(n) total.

Second scan - check for (sum - current) in the hash table - O(n) total.

This beats the O(n log n) sort-and-search methods, at least in theory.

Then, note that you can combine the two scans into one. You can spot a pair as soon as you encounter the second of that pair during the first scan. In pseudocode...

for i in array.range
  diff = sum - array [i]
  if hashset.includes (diff)
    output diff, array [i]

  hashset.insert (array [i])

If you need positions of the items, use a hashmap and store item positions in it. If you need to cope with duplicates, you might need to store counts in a hashmap. For positions and duplicates, you might need a hashmap of start pointers for linked lists of positions.
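A runnable version of the single-pass idea, assuming Python and a plain set; checking before inserting avoids falsely pairing a single element with itself when sum is twice that element:

```python
def pairs_single_pass(values, target):
    """Report each pair once, as soon as its second member is seen."""
    seen = set()
    pairs = []
    for v in values:
        diff = target - v
        if diff in seen:          # the partner was seen earlier
            pairs.append((diff, v))
        seen.add(v)               # insert only after the check
    return pairs
```

Swapping the set for a dict mapping value to position (or to a list of positions) covers the positions and duplicates cases described above.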

This makes assumptions about the hash table implementation, but fairly safe ones given the usual implementations in most current languages and libraries.

BTW - combining the scans shouldn't be seen as an optimisation. The iteration overhead should be insignificant. Memory locality issues could make a single pass slightly more efficient for very large arrays, but the real memory locality issues will be in the hashtable lookups anyway.

IMO the only real reason to combine the scans is because you only want each pair reported once - handling that in a two-scan approach would be a bit more hassle.

Steve314
A: 

If you don't mind spending O(M) space, where M is the sum you are seeking, you can do this in O(N + M) time. On a single pass over the N elements, set sums[v] = 1 for each value v <= M; then, on a single pass over 0..M/2, check (sums[i] && sums[M-i]).

Jeff Williams
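A sketch of this presence-array idea, assuming non-negative integers (and using counts rather than flags so the i == M-i case needs two copies of the same value; names are my own):

```python
def has_pair_counting(values, m):
    """O(N + M) time, O(M) space: mark which values <= m occur,
    then scan i from 0 to m//2 looking for both i and m - i."""
    counts = [0] * (m + 1)
    for v in values:
        if 0 <= v <= m:           # values outside [0, m] can't be in a pair
            counts[v] += 1
    for i in range(m // 2 + 1):
        j = m - i
        if counts[i] and counts[j]:
            # when i == j, a pair needs two copies of that value
            if i != j or counts[i] >= 2:
                return True
    return False
```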