ansaurus

Question

How can I improve this algorithm for solving a modified Postage Stamp problem?

Answer 1

+1 A:

Thanks for a very interesting question! I spent a few minutes understanding the question.

I did not see a notational formulation of the problem, so, let me give a shot at coming up with a notation here.

Firstly, an observation (as you made correctly as well), one of the regions must be 1, otherwise 1 would not be attainable.

Secondly, since we are trying to MAXIMIZE the unattainable score, no point in repeating region values.

So, that gives a degenerate (but not optimal) solution: 1, 2, 3, ... R

The objective function value of this degenerate solution is: D*R+1

For example if you have D = 4 darts, and you color your dartboard with scores 1, 2. ..40, then every score from values 1 to 160 is attainable. 161 is not attainable.

Clearly this solution is not optimal, and optimal solution will involve possibly some powers of 2 and definitely some thought.

Now for the notation:

Let f(X,D,{Y_1..Y_R}) =

1 if a score of X is attainable using D darts on a dartboard with ranges Y_1...Y_R
0 if unattainable

As an example discussed earlier. f(160,4,{1..40}) = 1 and f(161,4,{1..40}) = 0

The objective function value of puzzle can then be denoted as:

g(D, (Y_1..Y_R}) = Min X | f(X,D,{Y_1..Y_R}) = 0

By observation 1 earlier, we can assume that Y_1 = 1.

Also, a recursive formulation of function f can be as follows.

f(X,D,{1..Y_R} = 1 if:

X=0, or
f(X-Y_j,D-1,{1..Y_R}) = 1 for some j

[Will continue to work on this and post more as I develop it.]

Amrinder Arora 2010-09-27 15:55:55

thanks @Amrinder! I think this is moving towards a dynamic programming solution, which is awesome, as I an cut down on so much of calculation.

Lazer 2010-09-27 19:50:38

Answer 2

+2 A:

EDIT: Here is the code I used for this problem. Sorry for the mess.

I actually participated in this contest, as you notice I placed 100th overall with 87.00 points collected. I actually used your method because I realized generating a "reasonable" solution for every problem was the first hurdle to overcome. At the time I ran it, the points generated were enough to put me up around 94 points but as the top earners improved their scores, that number fell quickly to around 75.

The first and easiest thing you can do is to realize that your program runs in an instant, and if it doesn't you should spend time optimizing the crap out of that first. Once it runs in fast enough time, you can generate every possible set for D = 3, R <= 5 in no time at all. You can then use the sets at R = 5 as seeds for your greedy algorithm. Now instead of one greedy solution, you have a few thousand, and you just need to keep track of the highest value generated at each level R and the values which create it. That is easy enough to do, and this will get your score up above 80.

I spent almost a month optimizing the function which can test any random input set so that I was able to pump up my seed generator to R = 10 and run it in about 8 hours on a single core. This guaranteed having the best possible solution for 'D = 3', 'R <= 10' and much better answers when R > 10 than I had with the original greedy result. This got my score pretty close to where it ended after running R up as high as possible for each D while still being able to run the program in a single day. I was able to reach R = 9 for D = 4, R = 8 for D = 5, and R = 8 for D = 6.

After these were run I calculated that I wouldn't be able to increase R by 1 for any D over the numbers just listed and have it complete its execution in my lifetime. Obviously it was time to look for a new technique. I figured I was doing a great job on the front end by test every possible starting segment, so why not shift some of that to the back end by testing deeper result sets for each of my starting segments. Instead of grabbing the best next result on the same level, I performed a 2 level look ahead and take the best next value from two levels deep. For instance, you always start with a 1, then you enumerate all values for R = 2 (2, 3, 4) and then for each of those, evaluate all possible values for R = 3. So 2 (3, 4, 5, 6, 7), 3 (4, 5, 6, 7, 8), 4 (5, 6, 7). Take the highest of all those evaluations, and then choose the value at R = 2 which contains the highest value for R = 3. This required a lot more processing time and required me to lower the max R I could use to seed it, but it produced better results for the deeper searches.

At this point I gave up on greedy approaches and used this competition to expand my AI knowledge. I tried using various monte carlo techniques as well as a basic genetic algorithm. I learned a lot about monte carlo, and in looking up some of the top performers in this competition, found publications on optimizations for monte carlo selection criteria which was beyond my understanding. I wish I could help you out further, but I feel confident in arguing that breaking 90 points with a greedy approach is very unlikely in my lifetime. I would be interested to see how much better the answers would get if it were multithreaded and a bunch of power was thrown at it.

I don't have any of the work I did for this problem with me, but I remember showing that look ahead and greater enumeration of starting seeds were the only two possible improvements to the greedy algorithm for this problem. I'll for that stuff tonight and post the thought process here if I can find the notes.

NickLarsen 2010-09-27 17:24:36

thanks @NickLarsen! Doing a brute force up to a certain level, and then going on from there is a optimization that I should have used. I went with the same technique for each level. Also, I never tried any AI. How much improvements did you achieve with them? Also, please do post if you find those notes.

Lazer 2010-09-27 19:45:54

Hey @NickLarsen, any luck finding those notes?

Lazer 2010-10-04 19:09:16

I could not find the notes, they probably got thrown out when we moved. I was able to find the code I wrote for this problem though, if you'd like, I'd be more than happy to upload it somewhere for you.

NickLarsen 2010-10-05 13:02:11

I set an alarm to post the code tonight. Should be up around 7PM EST.

NickLarsen 2010-10-08 12:41:12

Code link added at top of post.

NickLarsen 2010-10-11 12:30:20

Answer 3

A:

Once you brute force a few, your might see some patterns to inform heuristic searches. For instance, a lot of the top solutions have a pattern like this one for D=3, R=40: a run of small increases, a run of larger increases, then a 2x jump followed by a short run of small increased.

At least, it tells you not to go with the subset idea, where the 3x30 values are a subset of 3x40, for instance.

alt text

xan 2010-10-04 19:06:48

Answer 4

+1 A:

The maximum of smallest unachievable element is good to look for only in the last iteration. Better primary subgoal is the number of achievable elements with given number of darts. One interesting subgoal might be

Nae * (Rt - Rc) / Rt + Msue, where

Nae - number of achievable elements (with a given number of darts)
Rt - total available regions
Rc - currently used regions (current level of recursion)
Msue - maximum of smallest unachievable element

Why is Nae more valuable than Msue in early iterations? The more achievable elements we have early on, all subsequent elements would me more employable, and produce more combinations, and reach even more achievable elements. With the explosion of Nae there comes high probability that Msue will rise to a good level as well. However, in late iterations Msue becomes more important and new elements are more used to "plug the holes", with the hope that the last hole to plug will be farthest away possible.

Physical analogy of this rationale would be olympic long jumping, where the athlete at the start of the run first accumulates momentum, but as he approaches the foul line he synchronizes his steps so that the actual jump becomes most effective. The athlete does not synchronize his steps from the start of the run, because momentum is more important in that stage.

Dialecticus 2010-10-07 23:11:40

I like the idea, but can you show why that works? The reasoning doesn't just jump out at me.

NickLarsen 2010-10-08 12:40:34

Added some explanation. To truly show the effect I would need to write some code, which OP already has. If he tries this approach he might see the effects and tell us about it :-)

Dialecticus 2010-10-08 14:05:22

ansaurus

tags:

views:

answers:

How can I improve this algorithm for solving a modified Postage Stamp problem?

related questions