ansaurus

Question

Given a vector of at most 10,000 distinct natural numbers, find four numbers (a, b, c, d) such that a + b + c = d.

Answer 1

+7 A:

Solution in O(n² log n):

Compute sets of all possible sums and differences:

{a_i+a_j: 1 <= i,j <= n}

{a_i-a_j: 1 <= i,j <= n}

(store them in a balanced binary search tree) and check if they have a common element. If yes, there are i,j,k,l such that a_i + a_j = a_k - a_l, that is a_i+a_j+a_l=a_k.

Solution in O(a_n log a_n), where a_n is the largest number in the vector:

Compute the polynomial

(x^(a_1) + x^(a_2) + ... + x^(a_n))^3

You can do it in O(a_n log a_n) using the Fast Fourier Transform (first compute the square, then the third power). Observe that after the multiplication, every term x^b in the result was formed as a product x^(a_i) * x^(a_j) * x^(a_k) = x^(a_i + a_j + a_k) for some i, j, k. Check whether the resulting polynomial contains a power x^(a_l).

Unfortunately this allows some of i, j, k to coincide, so a number may be used twice or three times. Subtracting 3(x^(2a_1) + ... + x^(2a_n)) * (x^(a_1) + ... + x^(a_n)) - 2(x^(3a_1) + ... + x^(3a_n)) removes those terms x^(a_i + a_j + a_k) in which the indices are not all different.
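As an illustration of the correction term, here is a minimal sketch, not the answer's own code. It assumes positive, distinct integers, and the function names are hypothetical. To stay self-contained it uses a naive quadratic polynomial product; an FFT-based product would have to replace `poly_mul` to reach the stated O(a_n log a_n) bound.

```python
def poly_mul(p, q):
    """Naive product of two coefficient lists (a stand-in for an FFT product)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        if pi:
            for j, qj in enumerate(q):
                out[i + j] += pi * qj
    return out


def power_sum_poly(nums, k, size):
    """Coefficient list of x^(k*a_1) + ... + x^(k*a_n), padded to `size` entries."""
    coeffs = [0] * size
    for a in nums:
        coeffs[k * a] += 1
    return coeffs


def distinct_triple_counts(nums):
    """Entry e counts ordered triples of distinct indices with a_i + a_j + a_k == e."""
    top = max(nums)
    p1 = power_sum_poly(nums, 1, top + 1)
    p2 = power_sum_poly(nums, 2, 2 * top + 1)
    p3 = power_sum_poly(nums, 3, 3 * top + 1)
    cube = poly_mul(poly_mul(p1, p1), p1)  # all index triples, repeats included
    mixed = poly_mul(p2, p1)               # triples in which two chosen indices coincide
    # P^3 - 3 * P2 * P + 2 * P3 keeps only triples with three different indices.
    return [c - 3 * m + 2 * t for c, m, t in zip(cube, mixed, p3)]


def has_triple_sum(nums):
    """True if some d in nums is the sum of three other distinct elements."""
    counts = distinct_triple_counts(nums)
    return any(counts[d] > 0 for d in nums)
```

For [1, 2, 3, 6] the corrected coefficient of x^6 is 6, one for each ordering of (1, 2, 3), while the uncorrected cube would also count 2 + 2 + 2.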

sdcvvc 2010-06-04 11:37:45

I was about to post your BST solution, but with hashtable instead. If `a < b < c < d` is a restriction, then you have to do something extra, because right now you're allowing some numbers to appear twice.

polygenelubricants 2010-06-04 11:44:26

You can tag the elements in the two trees with indices of elements: {(a_i+a_j, i, j): 1 <= i,j <= n} and {(a_i-a_j, i, j): 1 <= i,j <= n}; when joining the lists, check if the tags are all different.
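A rough sketch of the tagging idea, under some simplifications: it assumes positive, distinct integers, keeps only the tagged sums in a sorted list, and looks each difference up with binary search instead of joining two trees. The function name is hypothetical.

```python
from bisect import bisect_left


def find_quadruple_sorted(nums):
    """Return (a, b, c, d) with a + b + c == d and four different indices, or None."""
    n = len(nums)
    # Tagged sums (a_i + a_j, i, j) with i < j, sorted by value.
    sums = sorted((nums[i] + nums[j], i, j)
                  for i in range(n) for j in range(i + 1, n))
    for k in range(n):
        for l in range(n):
            if k == l:
                continue
            target = nums[k] - nums[l]
            # (target,) sorts before every (target, i, j), so this is the first match.
            pos = bisect_left(sums, (target,))
            while pos < len(sums) and sums[pos][0] == target:
                _, i, j = sums[pos]
                if len({i, j, k, l}) == 4:  # the tags must all be different
                    return nums[i], nums[j], nums[l], nums[k]
                pos += 1
    return None
```

Because the numbers are distinct, two different pairs with the same sum share no element, so at most one candidate pair can clash with l and the inner scan stops after at most two steps. The total cost stays at O(n² log n).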

sdcvvc 2010-06-04 11:51:17

If you use a hashtable instead of the tree, you can get rid of the O(log n) factor by using the sum/difference as the key. After computing and inserting all (a+b) sums, you just have to check, for each difference, whether there is an element (d-c) in the (a+b) table. That gives a total runtime of O(n^2).

MicSim 2010-06-04 12:04:43

That assumes you can create a bit array with 2*a_n elements initialized to zero in constant time. If not, this solution is O(a_n + n^2).

sdcvvc 2010-06-04 12:10:51

If you use tuples such as (a+b,a,b) for your sums and (b-a,a,b) for your differences, where a < b, in the hashtable as the keys, then you have enforced the `a < b < c < d` restriction.

Justin Peel 2010-06-04 13:49:25

This is most likely a 3SUM-Hard problem. In other words, don't look for better than O(N^2) solutions :-)

Moron 2010-06-04 16:38:48

@MicSim: your algorithm in Python: http://stackoverflow.com/questions/2973418/given-a-vector-of-maximum-10-000-natural-and-distinct-numbers-find-4-numbersa/2979763#2979763

J.F. Sebastian 2010-07-02 19:44:36

Answer 2

+4 A:

There is an algorithm by Schroeppel and Shamir that solves this problem in time O(N^2) and with memory O(N), where N is the number of inputs. It is basically what sdcvvc proposes, but instead of storing the sets {a_i + a_j} as a whole, one repeatedly computes only the sums in appropriate intervals. This saves memory but does not increase the time complexity.

Richard Schroeppel, Adi Shamir: "A T=O(2^(n/2)), S=O(2^(n/4)) Algorithm for Certain NP-Complete Problems". SIAM J. Comput. 10(3): 456-464 (1981)
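The following is a heap-based sketch in the spirit of that idea, not the paper's exact procedure. It assumes positive, distinct integers, and all names are hypothetical. Sums and differences are produced in ascending order on demand, so only O(N) values are held at any time; the heaps add a log factor, so this sketch runs in O(N² log N) rather than the bound quoted above.

```python
import heapq


def ascending_sums(a):
    """Yield (a[i] + a[j], i, j) with i < j in nondecreasing order; `a` must be sorted."""
    n = len(a)
    heap = [(a[i] + a[i + 1], i, i + 1) for i in range(n - 1)]
    heapq.heapify(heap)
    while heap:
        s, i, j = heapq.heappop(heap)
        yield s, i, j
        if j + 1 < n:
            heapq.heappush(heap, (a[i] + a[j + 1], i, j + 1))


def ascending_diffs(a):
    """Yield (a[k] - a[l], k, l) with l < k in nondecreasing order; `a` must be sorted."""
    heap = [(a[k] - a[k - 1], k, k - 1) for k in range(1, len(a))]
    heapq.heapify(heap)
    while heap:
        d, k, l = heapq.heappop(heap)
        yield d, k, l
        if l > 0:
            heapq.heappush(heap, (a[k] - a[l - 1], k, l - 1))


def find_quadruple_low_memory(nums):
    """Return (a, b, c, d) with a + b + c == d, or None, using O(N) extra memory."""
    a = sorted(nums)
    sums, diffs = ascending_sums(a), ascending_diffs(a)
    s, d = next(sums, None), next(diffs, None)
    while s is not None and d is not None:
        if s[0] < d[0]:
            s = next(sums, None)
        elif s[0] > d[0]:
            d = next(diffs, None)
        else:
            value = s[0]
            # Pairs with equal sums are disjoint, so two of them are enough
            # to avoid a clash with any single index l.
            pairs = []
            while s is not None and s[0] == value:
                if len(pairs) < 2:
                    pairs.append((s[1], s[2]))
                s = next(sums, None)
            while d is not None and d[0] == value:
                _, k, l = d
                for i, j in pairs:
                    if len({i, j, k, l}) == 4:
                        return a[i], a[j], a[l], a[k]
                d = next(diffs, None)
    return None
```

Each heap holds at most one entry per element of the input, which is where the O(N) memory bound comes from.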

abc 2010-06-04 12:36:05

related "A 2010 Algorithm for the Knapsack Problem" http://rjlipton.wordpress.com/2010/02/05/a-2010-algorithm-for-the-knapsack-problem/

J.F. Sebastian 2010-06-04 17:56:29

Answer 3

+1 A:

Here's @MicSim's comment to @sdcvvc's answer implemented in Python:

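from itertools import combinations  # or use the fallback defined below

def abcd(nums):
    # Map each sum a+b to at most two pairs producing it. Pairs of distinct
    # numbers with the same sum share no element, so at most one contains c.
    sums = {}
    for a, b in combinations(nums, 2):
        pairs = sums.setdefault(a + b, [])
        if len(pairs) < 2:
            pairs.append((a, b))

    for c, d in combinations(sorted(nums), 2):  # c < d
        for a, b in sums.get(d - c, ()):
            if a == c or b == c:
                continue  # all a, b, c, d must be different
            a, b, c = sorted((a, b, c))
            assert a + b + c == d and a < b < c < d
            return a, b, c, d
    return None  # no such quadruple exists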

Where combinations() could be itertools.combinations() or

def combinations(arr, r):
    assert r == 2  # this fallback only generates unordered pairs
    for i, v in enumerate(arr):
        for j in range(i + 1, len(arr)):  # xrange in Python 2
            yield v, arr[j]

It is O(N²) in time and space.

Example:

>>> abcd(range(1, 10000))
(1, 2, 3, 6)

J.F. Sebastian 2010-06-05 09:04:16

Prove it :) :) :)

Hamish Grubijan 2010-06-06 03:58:55

@Hamish Grubijan: `combinations()` produces `N*(N-1)/2` pairs therefore `sums` takes O(N**2) memory and O(N**2) time to create it, and there are O(N**2) (c,d) pairs to process. `(d-c) in sums` is assumed to be O(1) therefore the whole (c,d)-loop is O(N**2) in time.

J.F. Sebastian 2010-06-09 10:48:49
