I'm in search of an algorithm, which can handle the problem described below. I already have written an algorithm (which is too specialised to post, I think), optimised as much as I could think of, but on larger sets of numbers it still is too slow (as the costs rise exponentially). The solution should take no longer than 5s on a decent computer.
You're given a set of numbers, e.g.:
M = { 1, 1, 1, 2, 2, 2, 5, 5, 5, 10, 10, 10, 10, 20, 50, 50, 50, ..., 10000, 10000, 20000, 20000 }
The do not have to have a special structure (although they have here).
You're given a set of "target points", also numbers, e.g.:
P = { 670, 2010, 5600, 10510, 15000}
The goal is, to take the least quantity of numbers out of M, where, when you add them in the decided order, you get intermediate results as close as possible to all of the points in P. You can only use each number in M once.
In our example, a possible solution would be (although I don't know if it's the best):
Y = ( 500, 100, 50; 1000, 200, 200; 2000, 1000, 500; 5000; 2000, 2000)
As you can see, the two criteria least and close for some sort of tradeoff. That's why my current algorithm uses scoring to find the "best" solution.
Here's how it currently works:
- sort M, sort P, ascending
- remove numbers too small to relevantly change score or numbers which are simply too large
- recursively:
- take the next point in P as the current "target", plus minus e.g. 10%
- add next number out of M, remove it out if M
- when near the target point, goto 4. If at the end point, calculate score of current distribution and possibly remember it
- else goto 5
- when coming back from trying number, take next higher number
It never tries two same numbers and only tries the ascending order, e.g.:
- 100, 100, 100, 50, 50, 20, 10
- 100, 100, 100, 50, 50, 20, 20
- 100, 100, 100, 50, 50, 50, 10
- 100, 100, 100, 50, 50, 50, 20
- 100, 100, 100, 50, 50, 50, 50
- 100, 100, 100, 100
- 100, 100, 100, 100, 10
- 100, 100, 100, 100, 20
- ...
With about 5 of each number, and removing many of the smaller numbers, the algorithm is really fast and finds a good solution. But as I add more numbers or especially include smaller numbers, the runtime rises from 100ms to infinity.
Can you give me a hint, how to deal with this problem? Are there any similar algorithms in literature which can handle the problem or a part of it?