ansaurus

Question

Algorithm to select a set of numbers to reach a minimum total

Answer 1

+1 A:

what do you mean by "best"? Least number of n[i]? smallest standard deviation in the n[i]? Impossible to say unless you define "best"?

Aidan 2010-10-18 07:35:27

example: 1, 17, 5, 11, 6, 16, 4 and M = 14the answer should be 11, 4 to arrive at 15

Aves Liaw 2010-10-18 07:40:06

and what is M? And what is the "best" way to chose the values from this list? Is it the least number, i.e. the smallest subset that adds up to M?

Aidan 2010-10-18 07:42:14

Yes, its the set of numbers where the sum needs to reach or go beyond MIf two sets of numbers achieve the same result, then choosing the set with the least quantity of numbers

Aves Liaw 2010-10-18 07:45:32

this is like a 0-1 knapsack problem except that if you have two solutions with the same number of elements, the one that adds up to M is better than the one that adds up to more than M, so inyour example {11,4} is better than {11,5}. I think your best bet is sorting the elements into a tree, where you can go down the branches to get all the candidtates with the minimum number of elements that add up to M, and then choose the best candidate. Alternatively, you can investigate the knapsack problem : http://en.wikipedia.org/wiki/Knapsack_problem

Aidan 2010-10-18 07:53:16

The problem with using the tree is which number to put at the top of the tree... any suggestions? Or do I have to run through all permutations of the tree with each value on top?

Aves Liaw 2010-10-18 08:00:04

Answer 2

+1 A:

I think that a greedy algorithm approach will work. How many numbers do you expect to have in the set? If it is reasonably low you could try backtrack, but I would not recommend it for large sets.

Chris 2010-10-18 08:23:43

Thanks. I will have a look. Currently, I'm looking at a set with maximum of 136 values (with possibility of growth in the future)

Aves Liaw 2010-10-18 08:36:16

You should go for backtrack, but trying to improve the execution time using some tricks like first sorting the set and then getting in backtrack. Let me know if you need help

Chris 2010-10-18 08:51:14

Currently, backtrack looks like its the best way to go... I need to study it a bit more to see if it'll work. Thanks

Aves Liaw 2010-10-18 09:44:48

It will work just fine, giving you the right results but there is a problem with backtrack: is slooow.

Chris 2010-10-18 09:57:58

Answer 3

A:

I think this is NP-complete (it's not possible to find a fast way of doing it in general). The way you've phrased the question makes me think of solving successive Subset sum problems with a target integer that steadily increases from M.

In your example, there is no known method of determining whether M can be reached. Only a brute-force solution would tell you that it's not possible, and only then could you check M+1.

However, you may like to look at the DP solution, as this will at least give you an idea of how to solve the problem (albeit slowly). There is also an approximate solution which will be correct if your numbers are small. Use this. Finally, it's worth pointing out that the size of the problem is measured in the number of bits, not the size (or sum) of the set.

As mentioned above, this is related to the Knapsack problem.

Dijkstra 2010-10-18 08:41:25

Answer 4

+1 A:

This kind of problems is called binary linear programming (a special case of integer linear programming). It is famously NP-hard – i.e. not efficient to solve in general.

However, there exist good solvers, both commercial and free use, e.g. the Open Source solver lpsolve which you can call from your program.

/EDIT: Old answer was bunk. I confused the factors.

Konrad Rudolph 2010-10-18 08:45:06

I don't think this is a linear programming problem, as you can have 0 or 1 of each number, with a linear programming problem you can have fractional multiples

Nick Fortescue 2010-10-18 08:53:12

@Nick: damn, you’re right. It’s a BLP.

Konrad Rudolph 2010-10-18 08:55:10

Answer 5

A:

A Greedy solution would work A pseudo:

algo(input_set , target_sum, output_set )
   if input_set is NULL
       error "target_sum can never be reached with all input_set"
       return
   the_max = maximum(input_set)
   remove the_max from input_set
   if the_max > target_sum
       algo(input_set, target_sum, ouptput_set )
       return
   add the_max to output_set
   algo(input_set, target_sum - the_max, output_set )

call with output_set = NULL. sort intput_set for efficiency.

Vardhan Varma 2010-10-18 09:05:14

Not the solution for op's problem.

Grozz 2010-10-18 09:10:56

I think this solution only works if I want the values to sum up to exactly target_sum. However, for my case, the total is allowed beyond target_sum.

Aves Liaw 2010-10-18 09:18:24

Answer 6

A:

This problem is almost exactly the subset sum problem, which is related to the Knapsack problem but different, and is NP complete. However, there is a linear solution if all the numbers are smaller than some constant, and a Polynomial time approximation. See the wikipedia page referenced above.

Nick Fortescue 2010-10-18 09:11:59

Answer 7

A:

pseudocode:

list<set> results;
int m;
list<int> a;

// ...    

a.sort();

for each i in [0..a.size]
    f(i, empty_set);    

// ...

void f(int ind, set current_set)
{
    current_set.add(a[ind]);

    if (current_set.sum > m)
    {
        results.add(current_set);
    }
    else
    {
        for (int i=ind + 1; i<a.size; ++i)
        {
            f(i, current_set);  // pass set by value

            // if previous call reached solution, no need to continue
            if (a[ind] + current_set.sum) > m
                break;
        }
    }
}

// choose whatever "best" result you need from results, the one
// with the lowest sum or lowest number of elements

Grozz 2010-10-18 09:18:24

This could work unfortunately does not return the optimal results. For eg M = 21, numbers are 22, 15, 5, 3, 2 ,2, 1. If I'm not wrong, the code would return 2 results [22] and [15, 5, 3, 2, 2]. But the best answer would be [15,5,3,2,1] which would not be considered? Correct me if I'm wrong here.

Aves Liaw 2010-10-18 09:42:23

added a minor fix, but for another problem. that solution should work, it kinda goes over all possible solutions cutting out wrong branches asap. so for your inputs it will return [1, 2, 2, 3, 5, 15] [1, 2, 3, 5, 15] [1, 3, 5, 15] [1, 5, 15] [2, 2, 3, 5, 15] etc. Choose the set with the lowest sum or lowest number of elements or whatever your "best" criteria is.

Grozz 2010-10-18 10:11:12

Managed to translate this to code. Worked well enough. Thanks

Aves Liaw 2010-10-21 02:11:55

Answer 8

+2 A:

This looks like a classic Dynamic Programming problem (also indicated by other answers mentioning its similarity to 0-1 Knapsack and Subset Sum problems). The whole thing boils down to to one simple choice: for each element in the list, do we use it in our sum or not. We can write up a simple recursive function to compute the answer:

f(index,target_sum)=
     0     if target_sum<=0 (i.e. we don't need to add anymore)
     infinity   if target_sum>0 and index is past the length of n (i.e. we have run out of numbers to add)
     min( f(index+1,target_sum), f(index+1,target_sum-n[index])+n[index] )    otherwise (i.e. we explore two choices -  1. take the current number 2. skip over the current number and take their minimum)

Since this function has overlapping subproblems (it explores the same sub-problems over and over again) , it is a good idea to memoize the function with a cache to hold values that were already computed before.

Here's the code in Python:

#! /usr/bin/env python

INF=10**9 # a large enough number of your choice

def min_sum(numbers,index,M, cache):
    if M<=0: # we have reached or gone past our target, no need to add any more
        return 0
    elif len(numbers)==index: # we have run out of numbers, solution not possible
        return INF
    elif (index,M) in cache: # have been here before, just return the value we found earlier
        return cache[(index,M)]
    else:
        answer=min(
            min_sum(numbers,index+1,M,cache), # skip over this value
            min_sum(numbers,index+1,M-numbers[index],cache)+numbers[index] # use this value
        )
        cache[(index,M)]=answer # store the answer so we can reuse it if needed 
        return answer 

if __name__=='__main__':
    data=[10,6,3,100]
    M=11

    print min_sum(data,0,M,{})

This solution only returns the minimum sum, not the actual elements used to make it. You can easily extend the idea to add that to your solution.

MAK 2010-10-18 10:29:57

ansaurus

tags:

views:

answers:

Algorithm to select a set of numbers to reach a minimum total

related questions