I have a function,

P(x0, x1, ..., xn)

that takes 100 integers as input and returns an integer. P is slow to evaluate (a single call can take anywhere from 30 seconds to a couple of minutes).

I need to know which input values will maximize the value returned by P.

What techniques can I use to accomplish this? I know people generally use genetic algorithms for this, but I'm afraid they would take ages: even with a small population and few generations (say, population = 50, generations = 50), P is so slow that the whole run would take more than 40 hours.

Is there any cheaper method of doing it? Maybe an iterative process? The result doesn't need to be truly optimal, but I don't have any idea how P behaves (I've tried linear / quadratic / exponential fits, but none of them seem to yield good values; I know P can return values at least 5-10 times better than what I'm getting).

It should also be something that's easy to implement (i.e., I have to implement it myself).

Thanks

edit: P is a stochastic process.

A: 

Neural networks :D Or a Taylor series?

Hassan Syed
Wouldn't it need insane amounts of data to actually converge?
devoured elysium
I do know Taylor series. But how would they help me here?
devoured elysium
Depends entirely on the problem. If you don't mind training it iteratively, then you can simply start with an over-fitted network and refine it. You could also write the network or series by hand if you understand the problem well enough.
Hassan Syed
He doesn't know P() analytically. How would he use Taylor series? This doesn't make sense to me.
Paul
He does know P, just not in a practical analytic form, and he has access to the function. He wants to make the problem tractable.
Hassan Syed
+2  A: 

Maybe a significant portion of your algorithm is parallelizable? If so, have you considered parallelizing your code?
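
For instance, if the candidate points are independent (as they are within one generation of a GA), the calls to P can simply run on several threads at once. Here's a minimal sketch using std::async; the signature of P is an assumption, and a real version would cap the number of concurrent evaluations at the number of cores:

#include <array>
#include <future>
#include <vector>

const int n = 100;
double P(const std::array<int, n>& x);   // the slow, user-supplied function

// Evaluate P on several independent candidate vectors in parallel.
std::vector<double> evaluate_all(const std::vector<std::array<int, n>>& candidates) {
  std::vector<std::future<double>> jobs;
  for (const auto& c : candidates)
    jobs.push_back(std::async(std::launch::async, P, std::cref(c)));
  std::vector<double> results;
  for (auto& job : jobs)
    results.push_back(job.get());        // blocks until that evaluation finishes
  return results;
}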

jkndrkn
I wouldn't want to go that way. I've never parallelized anything, and I don't have that much time to learn.
devoured elysium
You say you haven't much time to spend learning, yet you're talking about optimization techniques. If you have a bunch of processors available, you could be brute-forcing this function with all of them after maybe an hour's worth of study and development. This is almost exactly like a common MPI example: computing pi by throwing darts.
Novelocrat
+2  A: 

Look at the various stochastic optimization techniques listed here. I recommend simulated annealing.

Andreas Brinck
+2  A: 

There are plenty of well-known global optimization algorithms (simulated annealing, stochastic tunneling, etc.) that CAN find the global maximum, but none are guaranteed to find it within a reasonable amount of time without making assumptions about the shape of the function.

You're not going to find a fast/easy way to optimize a 100-dimensional, non-trivial function. You'll need a lot of processing power and time. Assuming you don't want to write optimization code yourself (based on your question), you'll also need some good math software (e.g. Mathematica).

Donnie DeBoer
A: 

Hi

Hmmm, not sure that I agree at all that GAs are 'generally' used. And since you don't describe your problem in any detail, it's difficult to provide specific recommendations. Your problem might be amenable to, for example, integer programming (a variant of linear programming for integer-valued problems). Or you might get a good enough approximation by using linear programming and rounding the outputs to their nearest integers.

Can you approximate P by any 'easy' functions, where 'easiest' would be polynomials and 'easy' would be continuous and differentiable? In other words, could you analytically solve an approximation to P?

In any case, the more you know, and care to tell us, about P as a function of the inputs, the better our help can be. For some reason you know that P can return better values than you are currently getting -- what is that reason?

Finally, you might make some progress by Googling for software for operations research, optimisation, linear programming, that sort of keyword.

Regards

Mark

High Performance Mark
+2  A: 

Another not fully serious answer, but food for thought:

This problem looks to be so big that by rights you should need something like a SETI@Home effort to solve it. Thousands of computers make reasonably light work of this kind of thing. But I'm not sure how you'd reach thousands of computer users to obtain the use of their computers.

Actually, I do. Please bear with me for a moment in disregarding the legality of it all.

There are botnets run by some folks hiding behind the former Iron Curtain. I recently saw an offer to rent a botnet for $70 for 24 hours. Just think, thousands of 0wned PCs ready to do your bidding! Instead of having them DDOS Internet sites, you could have them churning on your problem. :)

Two final bits of advice on this, though:

  • Don't pay them with your own credit card :)
  • Don't take legal advice from strangers on SO :)

Good luck!

Carl Smotricz
+4  A: 

Try simulated annealing, which is closely related to Markov Chain Monte Carlo (MCMC). The variant you probably want is Metropolis-Hastings. When you get the hang of it, it's quite nice. There may be ways to optimize it further because your inputs and result are all integers. It is compute-intensive and may require some tuning, but it is quite robust, and I'm not sure other methods could do any better.

Here's some brain-dead code to do it:

#include <array>
#include <random>

const int n = 100;                 // length of the vector to optimize
std::array<int, n> a{};            // the vector to optimize

double P(const std::array<int, n>& a);  // Get the probability of vector a.
                                        // This is the (slow) function to optimize.

std::mt19937 rng(12345);
std::uniform_real_distribution<double> unif(0.0, 1.0);

// get an amount by which to change one element. This choice has to be symmetric;
// it is called the Proposal Distribution.
int proposal_step() {
  static const int steps[] = {-2, -1, 1, 2};
  std::uniform_int_distribution<int> pick(0, 3);
  return steps[pick(rng)];
}

void sample(int large_number) {
  // for a large number of a samples
  for (int i = 0; i < large_number; i++) {
    // get P(a)
    double p = P(a);
    // for each element of vector a
    for (int j = 0; j < n; j++) {
      // make the change to a[j], and get p1, the new value of p
      int step = proposal_step();
      a[j] += step;
      double p1 = P(a);
      bool keep_the_step = true;
      // if p1 is better than p, keep the step;
      // if p1 is worse than p, then keep the step p1/p of the time
      if (p1 < p) {
        keep_the_step = (unif(rng) < p1 / p);
      }
      if (keep_the_step) p = p1;
      else a[j] -= step;
    }
    // now a is a sample, and p is its value: record a and p here
  }
  // what you have now is a large random sampling of vectors from distribution P;
  // now you can choose the best one, the average, the variance,
  // any statistic you like
}

Ways to tune it are to widen or narrow the proposal distribution so it takes larger or smaller steps, or to have it take larger steps initially and smaller steps later. What you're looking for is a fraction of accepted steps that is neither too high nor too low. You probably also want a "burn-in" phase of an initial 1k or so samples that you throw away while the chain hunts for the area of the mode.

And by all means, profile P. It needs to be as fast as possible. Here's my favorite way to do that.

Mike Dunlavey
A: 

If you have access to MATLAB, you can parallelize your code pretty quickly and easily. It can even make simple linear for loops parallel with its parfor loop.

Midhat
A: 

If a Microsoft solution is an option, check out Solver Foundation. I heard about it on Scott Hanselman's podcast (#191).

Austin Salonen
+1  A: 

As a first-line algorithm for this type of problem, I would recommend Simulated Annealing. SA is a great first choice because you can clearly control your starting point and runtime.

If you know something about the structure of your 100-dimensional space, with SA you can choose a good starting point, and that can have a big impact on the quality of your result. With SA you can also control the 'cooling rate', which affects both runtime and the quality of your results - naturally in opposite directions. I typically run with a relatively fast cooling rate first to look for good starting vectors, and then slow the cooling rate in subsequent runs to improve the results. It's a kind of meta-SA technique that can be automated.

In the past, I've used SA successfully to maximize a very high-dimensional function used in modeling neutron-proton interactions.

Also, I would look at dimensionally reducing P() if possible. For your particular problem, are all 100 variables really needed? If you can fix half of them, you'll speed up any optimizer and end up with better results.

(And SA is easy to implement.)
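
For what it's worth, a minimal annealer with an explicit cooling-rate knob might look like the sketch below; the neighbour move, starting temperature, and cooling factor are illustrative assumptions, and P is the slow, user-supplied objective:

#include <array>
#include <cmath>
#include <random>

const int n = 100;
double P(const std::array<int, n>& x);       // the slow, user-supplied objective

std::array<int, n> anneal(std::array<int, n> x,
                          double temp = 10.0,     // starting temperature (illustrative)
                          double cooling = 0.99,  // closer to 1 = slower cooling
                          int iterations = 1000) {
  std::mt19937 rng(42);
  std::uniform_int_distribution<int> coord(0, n - 1);
  std::uniform_int_distribution<int> magnitude(1, 2);
  std::bernoulli_distribution negative(0.5);
  std::uniform_real_distribution<double> unif(0.0, 1.0);

  double value = P(x);
  for (int it = 0; it < iterations; it++) {
    // propose a small change to one randomly chosen coordinate
    std::array<int, n> candidate = x;
    int step = negative(rng) ? -magnitude(rng) : magnitude(rng);
    candidate[coord(rng)] += step;
    double cand_value = P(candidate);
    // always accept improvements; accept worse moves with a probability
    // that shrinks as the temperature drops
    if (cand_value > value || unif(rng) < std::exp((cand_value - value) / temp)) {
      x = candidate;
      value = cand_value;
    }
    temp *= cooling;                         // cool down
  }
  return x;
}

Running it once with a small 'cooling' value (fast cooling) to find promising starting vectors, and again from those vectors with 'cooling' closer to 1, matches the meta-SA idea described above.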

Paul
+1  A: 

Assumptions:

First, the variables must be integers.
Second, the objective function P() is non-linear.

Observation:

In general, non-linear integer programming is very difficult to solve. In practice, as recommended above, relaxing the integer restriction and rounding the resulting solution may help.

There are general unconstrained optimization techniques available. One approach that comes from experimental design is called 'response surface methodology'. It is very helpful when the cost of an experiment is significant. The approach is to run a set of experiments, starting from one point and deviating each of your inputs by a set increment. You then estimate the gradient with respect to each input, take a step in that direction for each, and repeat. Fletcher's Practical Methods of Optimization and Box, Hunter & Hunter's Statistics for Experimenters are places to look.
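
As a rough illustration of that deviate-and-step idea (not the full response-surface machinery), one pass of a finite-difference hill climb might look like this sketch; the increment, the step rule, and the P declaration are assumptions:

#include <array>

const int n = 100;                         // number of inputs
double P(const std::array<int, n>& x);     // the slow, user-supplied objective

// One pass of "deviate each input by a set increment, estimate the slope,
// then step uphill in each coordinate". delta = 1 is an illustrative choice.
std::array<int, n> finite_difference_step(std::array<int, n> x, int delta = 1) {
  const double base = P(x);
  std::array<double, n> slope{};
  for (int i = 0; i < n; i++) {
    std::array<int, n> probe = x;
    probe[i] += delta;                      // deviate input i
    slope[i] = (P(probe) - base) / delta;   // finite-difference slope estimate
  }
  for (int i = 0; i < n; i++) {
    if (slope[i] > 0) x[i] += delta;        // step in whichever direction
    else if (slope[i] < 0) x[i] -= delta;   // raises P
  }
  return x;  // repeat from the new point until P stops improving
}

Note that each pass costs n + 1 evaluations of P, which is why the experimental-design literature cited above puts so much emphasis on keeping the number of runs down.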

Grembo