Not a homework question, but a possible interview question...

  1. Given an array of integers, write an algorithm that will check if the sum of any two is zero.
  2. What is the Big O of this solution?

Looking for non-brute-force methods.

+9  A: 

Use a lookup table: scan through the array, inserting each value into the table as you go. If you encounter a value whose negative is already in the table (which you can easily look up), the two sum to zero. The lookup table can be a hashtable to conserve memory.

This solution should be O(N).

Pseudo code:

var table = new HashSet<int>();
var array = // your int array
foreach (int n in array)
{
    if (!table.Contains(n))
        table.Add(n);
    if (table.Contains(-n))
        // You found it.
}
driis
That's only O(n) if the hash table lookup is O(1) - that's unlikely to be the case for arbitrary integers.
paxdiablo
Also, this will fail if the negative precedes the positive.
walkytalky
@walkytalky, good point - I edited the example so it no longer fails.
driis
Hashtable lookup _is_ O(1).
Nick Johnson
@Nick: That really depends on the details, but for "small" integers it looks very similar to that.
Donal Fellows
This will report a single zero in the array as a pair, as the second "Contains" query will find the entry for the zero added in the preceding line. Edge case, I know.
Neil Moss
Hashtable lookup should be somewhat close to O(1) for evenly distributed integers - I do know that is not always the case. If there is some kind of known constraint on the size of the integers, you can use an index-based lookup table to avoid the hashtable.
driis
Hash table lookup is _not_ O(1) unless you can guarantee one entry per bucket with a perfect hash function. Worst case is O(n). You only have to think about it: if you have 100 buckets and 100 items, the best you can do is one per bucket. With 200 items, the best is two per bucket. With one million items, it's 10,000 per bucket, and all those figures require a very good hashing function. The problem of hash collisions can be mitigated by having each bucket hold a binary tree or even another hash, but that in no way makes the worst case O(1).
paxdiablo
@paxdiablo Any reasonable implementation of a hashtable expands the table as the number of items increases. The bounds on the density of different hashtable variants at which the expected lookup time is O(1) are well known and understood. What you say goes only for pathologically badly built hashtables.
Nick Johnson
Assuming you have enough memory (which may not be a good assumption), you can use a bit map, which will give you O(1) lookup (there is an example in Programming Pearls, for sorting a fixed range of numbers).
hackworks
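
Pulling the comment thread together: here is a minimal C# sketch of the hash-set approach (the method name HasZeroSumPair is purely illustrative). Checking for the negation before inserting the current value sidesteps the single-zero edge case noted above:

using System.Collections.Generic;

static bool HasZeroSumPair(int[] array)
{
    var seen = new HashSet<int>();
    foreach (int n in array)
    {
        // Check before inserting, so a lone 0 cannot pair with itself.
        // (int.MinValue is ignored here; its negation overflows.)
        if (seen.Contains(-n))
            return true;
        seen.Add(n);
    }
    return false;
}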
A: 

The sum of two integers can only be zero if one is the negative of the other, like 7 and -7, or 2 and -2.

Frank
-1: First of all, so what - how does this help? Second, `0 + 0 = 0`.
IVlad
Firstly, it helps in finding an approach to solve the issue (but I admit it is not a complete solution). And is 0 not the negative of 0?
Frank
True, `-0 = 0`. You should still explain your idea better in my opinion.
IVlad
A: 

Maybe stick each number in a hash table, and if you see a negative one check for a collision? O(n). Are you sure the question isn't to find if ANY sum of elements in the array is equal to 0?

Meekohi
+8  A: 

The hashtable solution others have mentioned is usually O(n), but it can also degenerate to O(n^2) in theory.

Here's a Theta(n log n) solution that never degenerates:

sort the array (optimal quicksort, heap sort, merge sort are all Theta(n log n))
for i = 1 to array.len - 1:
    binary search for -array[i] in positions i+1 .. array.len

If your binary search ever returns true, then you can stop the algorithm and you have a solution.
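
As a concrete illustration, here is a minimal C# sketch of this approach (the method name is illustrative). It relies on the built-in Array.Sort, which uses an introspective sort in modern .NET so the worst case stays O(n log n), plus Array.BinarySearch:

using System;

static bool HasZeroSumPair(int[] a)
{
    Array.Sort(a);                                   // Theta(n log n)
    for (int i = 0; i < a.Length - 1; i++)
    {
        // Binary search for -a[i] in the tail a[i+1 .. n-1].
        if (Array.BinarySearch(a, i + 1, a.Length - i - 1, -a[i]) >= 0)
            return true;
    }
    return false;
}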

IVlad
Can't quicksort degenerate to O(n^2) in the worst case though?
Rup
@Rup: it depends on the pivot selection algorithm. You can have a quicksort with `O(n log n)` worst-case performance given theoretically better pivot selection (I believe this is what IVlad meant by "optimal quicksort"). I believe it's rarely implemented, since the "dumb" algorithm is faster and good enough in practice.
polygenelubricants
Try using mergesort instead; that's guaranteed O(n log n).
Donal Fellows
+3  A: 

An O(n log n) solution (i.e., dominated by the sort) would be to sort all the data values, then run one pointer from lowest to highest at the same time as a pointer from highest to lowest:

def findmatch(array n):
    lo = first_index_of(n)
    hi = last_index_of(n)
    while true:
        if lo >= hi:                   # Catch where pointers have met.
            return false
        if n[lo] = -n[hi]:             # Catch the match.
            return true
        if sign(n[lo]) = sign(n[hi]):  # Catch where pointers are now same sign.
            return false
        if -n[lo] > n[hi]:             # Move relevant pointer.
            lo = lo + 1
        else:
            hi = hi - 1

An O(n) time complexity solution is to maintain an array of all values met:

def findmatch(array n):
    maxval = maximum_value_in(n)         # This is O(n).
    array b = new array(0..maxval)       # This is O(1).
    zero_all(b)                          # This is O(n).
    for i in index(n):                   # This is O(n).
        if n[i] = 0:
            if b[0] = 1:
                return true
            b[0] = 1
            nextfor

        if n[i] < 0:
            if -n[i] <= maxval:
                if b[-n[i]] = 1:
                    return true;
                b[-n[i]] = -1
            nextfor

        if b[n[i]] = -1:
            return true;
        b[n[i]] = 1

This works by simply maintaining a sign for a given magnitude, every possible magnitude between 0 and the maximum value.

So, if at any point we find -12, we set b[12] to -1. Then later, if we find 12, we know we have a pair. Same for finding the positive first except we set the sign to 1. If we find two -12's in a row, that still sets b[12] to -1, waiting for a 12 to offset it.

The only special cases in this code are:

  • 0 is treated specially since we need to detect it despite its somewhat strange properties in this algorithm (I treat it specially so as to not complicate the positive and negative cases).
  • low negative values whose magnitude is higher than the highest positive value can be safely ignored since no match is possible.

As with most tricky "minimise-time-complexity" algorithms, this one trades space for time: it may have a much higher space complexity (such as when the array's only element happens to be positive two billion).

In that case, you would probably revert to the sorting O(n log n) solution but, if you know the limits up front (say if you're restricting the integers to the range [-100,100]), this can be a powerful optimisation.


In retrospect, perhaps a cleaner-looking solution may have been:

def findmatch(array num):
    # Array empty means no match possible.
    if num.size = 0:
        return false

    # Find biggest value; no match possible if all values are negative.
    max_positive = num[0]
    for i = 1 to num.size - 1:
        if num[i] > max_positive:
            max_positive = num[i]
    if max_positive < 0:
        return false

    # Create and init array of positives.
    array found = new array[max_positive+1]
    for i = 1 to found.size - 1:
        found[i] = false
    zero_found = false

    # Check every value.
    for i = 0 to num.size - 1:
        # More than one zero means match is found.
        if num[i] = 0:
            if zero_found:
                return true
            zero_found = true

        # Otherwise store fact that you found positive.
        if num[i] > 0:
            found[num[i]] = true

    # Check every value again.
    for i = 0 to num.size - 1:
        # If negative and within positive range and positive was found, it's a match.
        if num[i] < 0 and -num[i] <= max_positive:
            if found[-num[i]]:
                return true

    # No matches found, return false.
    return false

This makes one full pass and a partial pass (a full one when there is no match), whereas the original made only the partial pass, but I think it's easier to read, and it needs only one bit per number (positive found or not found) rather than two (nothing, positive or negative found). In any case, it's still very much O(n) time complexity.
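
For anyone who prefers real code to pseudo-code, here is a minimal C# translation of the cleaner version above (the method name FindMatch is illustrative, and it assumes the maximum positive value fits comfortably in memory, per the trade-off discussed earlier):

using System.Linq;

static bool FindMatch(int[] num)
{
    if (num.Length == 0)
        return false;

    // Find the biggest value; no match is possible if all values are negative.
    int maxPositive = num.Max();
    if (maxPositive < 0)
        return false;

    // found[v] records that the positive value v occurred somewhere.
    var found = new bool[maxPositive + 1];
    bool zeroFound = false;

    foreach (int v in num)
    {
        if (v == 0)
        {
            if (zeroFound)
                return true;            // two zeros sum to zero
            zeroFound = true;
        }
        else if (v > 0)
        {
            found[v] = true;
        }
    }

    // Any negative whose magnitude was seen as a positive is a match.
    // (v >= -maxPositive avoids negating int.MinValue.)
    foreach (int v in num)
        if (v < 0 && v >= -maxPositive && found[-v])
            return true;

    return false;
}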

paxdiablo
I think the second condition is out of place. In case of two adjacent zeroes representing the solution, you would return `false`. It should go after the third condition.
IVlad
Good catch! Changed. Zero caused me all sorts of troubles with the O(n) solution as well :-)
paxdiablo
Isn't this essentially the same as the hash solution, though, just picking the size of your hash table sufficiently large?
Rup
Sort of, but it gives you a perfect hash function which generalised hashing algorithms may not.
paxdiablo
A: 

Given a sorted array you can find number pairs (-n and +n) by using two pointers:

  • the first pointer moves forward (over the negative numbers),
  • the second pointer moves backwards (over the positive numbers),
  • depending on the values the pointers point at you move one of the pointers (the one where the absolute value is larger)
  • you stop as soon as the pointers meet or one has passed 0
  • equal magnitudes (one negative, one positive, or both zero) are a match.

Now, this is O(n), but sorting (if necessary) is O(n*log(n)).

EDIT: example code (C#)

// sorted array
var numbers = new[]
{
    -5, -3, -1, 0, 0, 0, 1, 2, 4, 5, 7, 10 , 12
};

var npointer = 0;   // pointer to negative numbers
var ppointer = numbers.Length - 1;  // pointer to positive numbers

while( npointer < ppointer )
{
    var nnumber = numbers[npointer];
    var pnumber = numbers[ppointer];

    // each pointer scans only its number range (neg or pos)
    if( nnumber > 0 || pnumber < 0 )
    {
        break;
    }

    // Do we have a match?
    if( nnumber + pnumber == 0 )
    {
        Debug.WriteLine( nnumber + " + " + pnumber );
    }

    // Adjust one pointer
    if( -nnumber > pnumber )
    {
        npointer++;
    }
    else
    {
        ppointer--;
    }
}

Interesting: we have 0, 0, 0 in the array. The algorithm will output two zero pairs, but in fact there are three distinct pairs ... we need a more precise specification of what exactly should be output.

tanascius
+1  A: 

I think IVlad's answer is probably what you're after, but here's a slightly more off the wall approach.

If the integers are likely to be small and memory is not a constraint, then you can use a BitArray collection. This is a .NET class in System.Collections, though Microsoft's C++ has a bitset equivalent.

The BitArray class allocates a lump of memory, and fills it with zeroes. You can then 'get' and 'set' bits at a designated index, so you could call myBitArray.Set(18, true), which would set the bit at index 18 in the memory block (which then reads something like 00000000, 00000000, 00100000). The operation to set a bit is an O(1) operation.

So, assuming a 32-bit integer scope and 1 GB of spare memory, you could take the following approach:

BitArray myPositives = new BitArray(int.MaxValue);
BitArray myNegatives = new BitArray(int.MaxValue);
bool pairIsFound = false;
foreach (int testValue in arrayOfIntegers)
{
    if (testValue < 0)
    {
        // -ve number - have we seen the +ve yet?
        if (myPositives.Get(-testValue))
        {
            pairIsFound = true;
            break;
        }
        // Not seen the +ve, so log that we've seen the -ve.
        myNegatives.Set(-testValue, true);
    }
    else
    {
        // +ve number (inc. zero). Have we seen the -ve yet?
        if (myNegatives.Get(testValue))
        {
            pairIsFound = true;
            break;
        }
        // Not seen the -ve, so log that we've seen the +ve.
        myPositives.Set(testValue, true);
        if (testValue == 0)
        {
            myNegatives.Set(0, true);
        }
    }
}

// query setting of pairIsFound to see if a pair totals to zero.

Now I'm no statistician, but I think this is an O(n) algorithm. There is no sorting required, and the longest duration scenario is when no pairs exist and the whole integer array is iterated through.

Well - it's different, but I think it's the fastest solution posted so far.

Comments?

Neil Moss
It's the fastest in theory, but in practice who's to say you'll always be able to fit your numbers into 1 GB or even have 1 GB available? If you know the numbers are 32 bit ints and you can spare 1 GB, then this is indeed the fastest solution.
IVlad
Regarding integer range and memory - agreed, it's a design-time question. I'd hope raising it would be well received as a response in an interview.
Neil Moss
A: 

Here's a nice mathematical way to do it: precompute the prime numbers, i.e. construct an array prime[0 .. max(array)] so that prime[i] stands for the i-th prime.

counter = 1
zeros = 0   # count zeros separately so a lone 0 is not matched with itself
for i in inputarray:
    if (i > 0):
        counter = counter * prime[i]
    if (i == 0):
        zeros = zeros + 1
if (zeros >= 2):
    return "found"
for i in inputarray:
    if (i < 0):
        if (counter % prime[-i] == 0):
            return "found"
return "not found"

However, the problem when it comes to implementation is that storing and multiplying prime numbers is, in the traditional model of computation, just O(1), but if the array (i.e. n) is large enough, this model is inappropriate.

However, it is a theoretic algorithm that does the job.
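
For the curious, here is a minimal C# sketch of the idea that avoids overflow by using System.Numerics.BigInteger (the product still grows enormous, so this remains strictly a curiosity; all names are illustrative, and the prime generator is deliberately naive):

using System.Collections.Generic;
using System.Linq;
using System.Numerics;

static class PrimeProductSketch
{
    // Returns [0, 2, 3, 5, ...] so that primes[v] is the v-th prime.
    static int[] PrimesUpTo(int count)
    {
        var primes = new List<int> { 0 };              // dummy at index 0
        for (int c = 2; primes.Count <= count; c++)
            if (primes.Skip(1).All(p => c % p != 0))   // naive trial division
                primes.Add(c);
        return primes.ToArray();
    }

    static bool HasZeroSumPair(int[] input)
    {
        if (input.Count(v => v == 0) >= 2)
            return true;                               // two zeros pair up

        int max = input.Max();
        if (max <= 0)
            return false;                              // no positive to pair with

        int[] prime = PrimesUpTo(max);

        // Encode each positive value v as a factor prime[v] of one big product.
        BigInteger product = BigInteger.One;
        foreach (int v in input)
            if (v > 0)
                product *= prime[v];

        // A negative -v pairs with some +v iff prime[v] divides the product.
        // (v >= -max avoids negating int.MinValue.)
        foreach (int v in input)
            if (v < 0 && v >= -max && product % prime[-v] == 0)
                return true;

        return false;
    }
}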

phimuemue
n must be the maximum absolute element value in the input array. And it is better to use a bitmap instead of an array of prime numbers.
Sergius
All the solutions posted so far are *mathematical*. Personally I don't know why you would do this one - you'd end up overflowing even 64 bits easily. The array doesn't have to be large; it's enough for the numbers in it to be large, and not even very large: four numbers > 1000 can already cause this method to overflow a 32-bit int.
IVlad
A: 

Here's a slight variation on IVlad's solution which I think is conceptually simpler, and also O(n log n) but with fewer comparisons. The general idea is to start at both ends of the sorted array and march the indices towards each other. At each step, move only the index whose array value is further from 0 -- in only Theta(n) comparisons, you'll know the answer.

sort the array (n log n)

loop, starting with i=0, j=n-1
  if a[i] == -a[j], then stop:
    if a[i] != 0 or i != j, report success, else failure
  if i >= j, then stop: report failure
  if abs(a[i]) > abs(a[j]) then i++ else j--

(Yeah, probably a bunch of corner cases in here I didn't think about. You can thank that pint of homebrew for that.)

e.g.,

[ -4, -3, -1, 0, 1, 2 ]     notes:
  ^i                ^j      a[i]!=-a[j], i<j, abs(a[i])>abs(a[j])
      ^i            ^j      a[i]!=-a[j], i<j, abs(a[i])>abs(a[j])
          ^i        ^j      a[i]!=-a[j], i<j, abs(a[i])<abs(a[j])
          ^i     ^j         a[i]==-a[j] -> done
Ken