views:

614

answers:

6

There are two integer sequences A[] and B[] of length N,both unsorted.

Requirement: through the swapping of elements between A[] and B[]( can randomly exchange, not with same index), make the difference between {the sum of all elements in A[]} and {the sum of all elements in B[]} to be minimum.

PS: actually,it is an interview question I encountered.

Many thanks

+1  A: 

Try being greedy about it. Given such limited information, I'm not sure what else one could put out there.

JB King
+5  A: 

This is going to be NP-hard! I believe you can do a reduction from Subset Sum to this.

As per BlueRaja/polygene's comments, I will try to provide a full reduction from Subset Sum.

Here is a reduction:

Subset Sum problem: Given integers x1, x2, ..., xn, is there some non-empty subset which sums to zero?

Our problem: Given two integer arrays of size k, find the minimum possible difference of the sum of the two arrays, assuming we can shuffle around the integers in the arrays, treating both arrays as one array.

Say we had a polynomial time algo for our problem.

Say now you are given integers T = {x1,x2, ...,xn} (multiset)

Let Si = x1 + x2 + ...+ xn + xi.

Let Ti = {x1, x2, ..., xi-1, xi+1, ..., xn } ( = T - xi)

Define

Ai = Array formed using Ti

Bi = [Si, 0, ..., 0] (i.e one element is Si and rest are zeroes).

Let mi = the min difference found by our problem for arrays Ai and Bi

(we run our problem n times).

Claim: Some non-empty subset of T sums to zero if and only if, there is some i, for which mi = 0.

Proof: (wlog) say x1 + x2 + .. + xk = 0

Then

A = [xk+1, ..., xn, 0, ...0]

B = [x2, x3, ..., xk, S1, 0, ..0]

gives the minimum difference m1 to be |x2 + .. + xk + (x1 + ... + xn) + x1 - (xk+1 + .. + xn)| = |2(x1+ x2 + .. xk)| = 0.

Similarly the if part can be proved.

In fact, this actually also follows (more easily) from Partition too: just create new array with all zeroes.

Hoepfully I haven't made any mistakes.

Moron
This is equivent to having the contents of A and B in a single collection, then performing bin packing, which is np-hard.
fatcat1111
@fatcat1111: You need to go the other way for proving NP-Hardness. Take a hard problem and reduce that to the current problem. In this case, Subset sum is easily reduced to this problem.
Moron
@Moron: "easily reduced" how? Show us.
polygenelubricants
@polygenelubricants: I have edited the answer to point to wiki page which has this.
Moron
This is not the partition problem - the sizes of the "halves" are fixed in this case.
BlueRaja - Danny Pflughoeft
@BlueRaja, @polygenelubricants: I have edited the post to show the reduction.
Moron
+3  A: 

Take any instance of the NP-complete partition problem:

Partition a multiset A of positive integers into two multisets B and C with the same sum

like {a1,a2,...,an}. Add n zeroes {0,0,0...,0,a1,...,an} and ask if the set can be partitioned into two multisets A and B with the same sum and same number of elements. I claim these two conditions are equivalent:

  • If A and B are a solution to the problem, then you can strike out the zeroes and get a solution of partiton problem.
  • If there is a solution to the partition problem, for example ai1 + ai2 + ... aik = aj1 + ... +ajl where {ai1, ai2, aik, aj1, ..., ajl} = {a1, ... , an} then obviously k+l = n. Add l zeroes to the left side and k zeroes to the right side and you'll get 0 + ... + 0 + ai1 + ai2 + ... aik = 0 + ... + 0 + aj1 + ... +ajl, whichi is a solution of your problem.

So, this is a reduction (so the problem is NP-hard) and the problem is NP, so it is NP-complete.

sdcvvc
Yes, it is the partition problem with 2 partitions that must have the same number of elements in each partition. Thus, it is NP-Complete.
Justin Peel
@Justin: This is not the partition problem, for the reasons you just mentioned. @sdcvvc: You gave a reduction from this to partition problem (which proves nothing), not the other way around. For instance, say we have `{1, 2, 3, 4, 5, 15}`. A solution to this problem will give us `{1, 2, 15}, {3, 4, 5}`. Is there any way to use that information to determine if there are two subsets with the same sum?
BlueRaja - Danny Pflughoeft
@BlueRaja, ok, I think I must be misunderstanding what the problem being asked is then. I'll look over it again.
Justin Peel
BlueRaja, thanks, I changed the reduction.
sdcvvc
+1 you convinced me that this is NP-complete. This answer deserves a ✔.
BlueRaja - Danny Pflughoeft
Similar to my response :)
Eyal Schneider
A: 

I'm not sure that this will ensure the minimum possible distance, but the first thing that comes to mi mind is something like this:

int diff=0;
    for (int i = 0; i<len; i++){
        int x = a[i] - b[i];
        if (abs(diff - x) > abs(diff + x)){
             swap(a,b,i);
             diff-=x;
        }else{
             diff+=x;
        }

    }

assuming that you have a swap function which takes the two arrays and exchanges the items at position i :)

computing and adding the difference between the two values at position i you get the incremental difference between the sums of the elements of the two arrays.
at each step you check if it's better to add (a[i]-b[i]) or (b[i]-a[i]). if the b[i]-a[i] it's the case, you swap the elements at position i in the arrays.

Maybe this will not be the best way, but it should be a start :)

garph0
+1  A: 

The problem is NP-Complete.

We can reduce the partition problem to the decision version of this problem, i.e. given two arrays of ints of the same size, determine whether items can be swapped so that the sums are equal.

The input to the partition problem: a set S of integers, of size N

In order to transform this input into an input to our problem, we define A to be an array of all items in S, and B an array of the same size, with B[i]=0 for all i. This transformation is linear in the input size.

It is clear that our algorithm applied on A and B returns true if and only if there is a partition of S into 2 subsets such that the sums are equal.

Eyal Schneider
+2  A: 

"sequences A[] and B[] of length N" -> does this mean both A and B are each of length N?

(For the purpose of clarity I am using 1-based arrays below).

If so, how about this:

  1. Assume A[1..N] and B[1..N]
  2. Concatenate A and B into a new array C of length 2N: C[1..N] <- A[1..N]; C[N+1 .. 2N] <- B[1..N]
  3. Sort C in ascending order.
  4. Take the first pair of numbers from C; send the first element (C[1]) to A[1] and second element (C[2]) to B[1]
  5. Take the second pair of numbers from C; this time send the second element (C[4]) to A[2] and the first element (C[3]) to B[2] (the order of elements in the pair sent to A and B is the opposite of 3)
  6. ... repeat 3 and 4 until C is exhausted

The observation here is that, in a sorted array, an adjacent pair of numbers will have the smallest difference (compared to a pair of numbers from non-adjacent positions). Step 3 ensures that A[1] and B[1] consists of a pair of numbers with the least possible difference. Step 4 ensures that (a) A[2] and B[2] consist of a pair of numbers with the least possible difference (from the available numbers) and also (b) that the difference is opposite in sign from step 3. By continuing like this, we are ensuring that A[i] and B[i] contain numbers with the least possible difference. Also, by flipping the order in which we send elements to A and B, we are ensuring that the difference changes sign for each successive i.

Parijat