views:

343

answers:

6

My goal here is to find all possible combinations that sums to a given total. For example, if the array is 2 59 3 43 5 9 8 62 10 4 and if the total is 12, then possible combinations are

2 10
3 9
8 4
5 3 4

Here is the first set of code, that I've written. Wondering the best improvements that can be done on this.

   int find_numbers_matching_sum(int *number_coll, int total)
{

    int *search_till = lower_bound(number_coll,number_coll+TOTAL_SIZE, total);
    int location = search_till - number_coll;
    if (*search_till > total && location > 0 )
    {
        --location;
    }

    while ( location >= 0 )
    {
        find_totals(number_coll,total,location);
        --location;
    }
    return 1;
}

int find_totals(int *number_coll, int total, int end_location)
{
    int left_ones = total - number_coll[end_location];
    int curloc = end_location;
    int *search_till = 0;
    int location ;
    int all_numbers[10];
    int i = 0;

    all_numbers[i] = number_coll[end_location];
    while ( left_ones && curloc >= 0 )
    {
        search_till = lower_bound(number_coll,number_coll+end_location, left_ones);
        location = search_till - number_coll;
        if (*search_till > left_ones && location > 0 )
        {
            --location;
        }
        else if ( left_ones < *search_till )
        {
            break;
        }
        curloc=location;
        left_ones = left_ones - number_coll[curloc];
        all_numbers[++i] = number_coll[curloc];
        end_location = curloc - 1;
    }

    if ( !left_ones )
    {
        while ( i>=0)
        {
            cout << all_numbers[i--] << ' ';
        }
    }
    cout << endl;
    return 1;


}
+3  A: 

This is a variation of the NP-complete Knapsack Problem, the subset sum problem. The fullblown Knapsack Problem can be reduced linearly to your problem by succesively lowering N. You will not find an exact algorithm for your problem that runs faster than exponentially in N if P != NP holds.

Polynomial time approximations are known however.

Peter G.
+5  A: 

The problem You describe is also known as the Subset Sum Problem which is NP-complete. The best You can achieve is an exponential time algorithm that tries all possible subsets of Your array/set.

Dave
Actually, as long as the numbers are all positive, as they are in nsivakr's example, the problem is in P (I think). With positive numbers, you know when you're past the target.
Pontus Gagge
@Pontus I have to disagree. The prototypical Knapsack Problem is formulated with nonnegative numbers.
Peter G.
@Pontus Even if the number are positive the problem is NP-Complete. The problem asks for exact sum, if I understood you well.
kunigami
@Peter: Really? The KP requires you to simultaneously maximize a separate value function within a size bound: that's a degree of freedom we don't have here. The positive subset sum problem has two parameters: n and c, where n is the number of items and c the desired sum. I'm getting at least one algorithm polynomically bounded by n and c (paywalled, though): http://bit.ly/90FTVU.
Pontus Gagge
...but I'm also getting other claims of NP (http://bit.ly/ckCGWq). Now I'm thoroughly confused! Can you express the decidability problem in terms of the positive subset sum problem?
Pontus Gagge
@Pontus The algorithm you linked has complexity dependent of c, which makes it a pseudo-polynomial algorithm (I suggest one in my answer), but those are not in P.
kunigami
OK, too hasty of me. I had the subset sum hardness filed under backtracking. Darned. The exact formulation is as always critical. My bad.
Pontus Gagge
+4  A: 

This is Subset-sum problem (NP-Complete), and until P ?= NP, there is only exponential solution.

From xkcd

Slava
@Neil: Come on, we have enough xkcd in joke threads already and this just steals screen estate that might make other answers visible without scrolling.
Georg Fritzsche
+3  A: 

If the values are not big, say your sum is bounded by M, you can use dynamic programming. Suppose there are N items.

Imagine you have a matrix DP[M][N]. The cell DP[m][n] means: how many combinations of the first n elements sum exactly m?

Analyzing each item, you may or may not include it in some combination. Then you get the recurrence (taking care of the out-of-bound values)

DP[m][n] = DP[m][n-1] + DP[m - v[n]][n - 1]

the first term of rhs means you are considering all sums that do not use the nth item and the second term all sums that do use the nth item. You start with the basis DP[0][0] = 1, since the empty set is a valid combination which sums 0. The desired value is in DP[M][N].

This is pseudo-polynomial though, O(MN).

kunigami
+1 the OP's actual problem might fulfill the additional constraints
Dave
This won't really help if he's trying to list the possibilities. This will just count them. Also, you can do this with a single vector, you don't need a matrix.
IVlad
@|Vlad: You're right. So there's no hope since the output may have exponential size. I know it can be done with a vector, but it seemed easier to explain the algorithm with matrix.
kunigami
+2  A: 

This is related to Partition in Number Theory, and it can be solved using dynamic programming.

Let n be the total. Let parts be the list of elements. We assume they are positive integers.

if parts == []
then f(n,parts) = [] 
else let parts = x::queue and f(n,parts) = union(L1, L2)

where:

L1 = f(n, queue)

if n-x>0
then let result = f(n-x, queue) and L2 = concatenation([x], result)
else if n-x==0, L2 = [x]
else L2 = []

This is a typical homework.

wok
+1  A: 
#include <iostream>
#include <vector>

using namespace std;

struct State {
    int v;
    const State *rest;
    void dump() const {
        if(rest) {
            cout << ' ' << v;
            rest->dump();
        } else {
            cout << endl;
        }
    }
    State() : v(0), rest(0) {}
    State(int _v, const State &_rest) : v(_v), rest(&_rest) {}
};

void ss(int *ip, int *end, int target, const State &state) {
    if(target < 0) return; // assuming we don't allow any negatives
    if(ip==end && target==0) {
        state.dump();
        return;
    }
    if(ip==end)
        return;
    { // without the first one
        ss(ip+1, end, target, state);
    }
    { // with the first one
        int first = *ip;
        ss(ip+1, end, target-first, State(first, state));
    }
}

int main() {
    int a[] = { 2,59,3,43,5,9,8,62,10,4 };
    int * start = &a[0];
    int * end = start + sizeof(a) / sizeof(a[0]);
    ss(start, end, 12, State());
}
Aaron McDaid
Love your solution. Thanks.
@nsivakr, Maybe there are further optimizations if you knew something about the bit representation of the numbers. For example, if the target is an odd number, and there is only one odd number in the set, then you know that must be included.The search could then prune where it's easy to prove a solution is impossible.If you expect there will be many duplicates in the set, then there will be much redundancy. The solution would be to modify "ss(ip+1, end, target, state);" such that it skips over not just the first element, but all the elements identical to the first element.
Aaron McDaid