I have a function,

P(x0, x1, ..., xn)

that takes 100 integers as input and returns an integer. P is slow to evaluate (a single call can take anywhere from 30 seconds to a couple of minutes).

I need to know which input values will maximize the value returned by P.

What techniques can I use to accomplish this? I know people generally use genetic algorithms for this, but I'm afraid they would take ages: even with a small population and few generations (say, population = 50, generations = 50), P is so slow that the whole run would take more than 40 hours.

Is there any cheaper method of doing it? Maybe an iterative process? The result doesn't need to be truly optimal, but I don't have any idea how P behaves (I've tried linear / quadratic / exponential fits, but none of them seem to yield good values; I know P can return values at least 5-10 times better than what I'm getting).

It should also be something that's easy to implement (i.e., I have to implement it myself).

Thanks

edit: P is a stochastic process.

A: 

Neural networks :D Or a Taylor series?

Hassan Syed
Wouldn't it need insane amounts of data to actually converge?
devoured elysium
I do know Taylor series. But how would they help me here?
devoured elysium
Depends entirely on the problem. If you don't mind training it iteratively, then you can simply start with an over-fitted network and refine it. You could also write the network or series by hand if you understand the problem well enough.
Hassan Syed
He doesn't know P() analytically. How would he use Taylor series? This doesn't make sense to me.
Paul
He does know P, just not in a practical analytic form, and he has access to the function. He wants to make the problem tractable.
Hassan Syed
+2  A: 

Maybe a significant portion of your algorithm is parallelizable? If so, have you considered parallelizing your code?
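
For instance, if the candidate points are independent (as they are within one generation of a GA), the calls to P can simply run on several threads at once. Here's a minimal sketch using std::async; the signature of P is an assumption, and a real version would cap the number of concurrent evaluations at the number of cores:

#include <array>
#include <future>
#include <vector>

const int n = 100;
double P(const std::array<int, n>& x);   // the slow, user-supplied function

// Evaluate P on several independent candidate vectors in parallel.
std::vector<double> evaluate_all(const std::vector<std::array<int, n>>& candidates) {
  std::vector<std::future<double>> jobs;
  for (const auto& c : candidates)
    jobs.push_back(std::async(std::launch::async, P, std::cref(c)));
  std::vector<double> results;
  for (auto& job : jobs)
    results.push_back(job.get());        // blocks until that evaluation finishes
  return results;
}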

jkndrkn
I wouldn't want to go that way. I've never parallelized anything, and I don't have that much time to learn.
devoured elysium
You say you haven't much time to spend learning, yet you're talking about optimization techniques. If you have a bunch of processors available, you could be brute-forcing this function with all of them after maybe an hour's worth of study and development. This is almost exactly like a common MPI example: computing pi by throwing darts.
Novelocrat
+2  A: 

Look at the various stochastic optimization techniques listed here. I recommend simulated annealing.

Andreas Brinck
+2  A: 

There are plenty of well-known global optimization algorithms (simulated annealing, stochastic tunneling, etc.) that CAN find the global maximum, but none are guaranteed to find it within a reasonable amount of time without making assumptions about the shape of the function.

You're not going to find a fast/easy way to optimize a 100-dimensional, non-trivial function. You'll need a lot of processing power and time. Assuming you don't want to write optimization code yourself (based on your question), you'll also need some good math software (e.g. Mathematica).

Donnie DeBoer
A: 

Hi

Hmmm, not sure that I agree at all that GAs are 'generally' used. And since you don't describe your problem in any detail, it's difficult to provide specific recommendations. Your problem might be amenable to, for example, integer programming (a variant of linear programming for integer-valued problems). Or you might get a good enough approximation by using linear programming and rounding the outputs to their nearest integers.

Can you approximate P by any 'easy' functions, where 'easiest' would be polynomials and 'easy' would be continuous and differentiable? In other words, could you analytically solve an approximation to P?

In any case, the more you know, and care to tell us, about P as a function of the inputs, the better our help can be. For some reason you know that P can return better values than you are currently getting -- what is that reason?

Finally, you might make some progress by Googling for software for operations research, optimisation, linear programming, that sort of keyword.

Regards

Mark

High Performance Mark
+2  A: 

Another not fully serious answer, but food for thought:

This problem looks to be so big that by rights you should need something like a SETI@Home effort to solve it. Thousands of computers make reasonably light work of this kind of thing. But I'm not sure how you'd reach thousands of computer users to obtain the use of their computers.

Actually, I do. Please bear with me for a moment in disregarding the legality of it all.

There are botnets run by some folks hiding behind the former Iron Curtain. I recently saw an offer to rent a botnet for $70 for 24 hours. Just think, thousands of 0wned PCs ready to do your bidding! Instead of having them DDOS Internet sites, you could have them churning on your problem. :)

Two final bits of advice on this, though:

  • Don't pay them with your own credit card :)
  • Don't take legal advice from strangers on SO :)

Good luck!

Carl Smotricz
+4  A: 

Try simulated annealing, which is closely related to Markov Chain Monte Carlo (MCMC). The variant you probably want is Metropolis-Hastings. When you get the hang of it, it's quite nice. There may be ways to optimize it further because your inputs and result are all integers. It is compute-intensive and may require some tuning, but it is quite robust, and I'm not sure other methods could do any better.

Here's some brain-dead code to do it:

#include <array>
#include <random>

const int n = 100;                 // length of the vector to optimize
std::array<int, n> a{};            // the vector to optimize

double P(const std::array<int, n>& a);  // Get the probability of vector a.
                                        // This is the (slow) function to optimize.

std::mt19937 rng(12345);
std::uniform_real_distribution<double> unif(0.0, 1.0);

// get an amount by which to change one element. This choice has to be symmetric;
// it is called the Proposal Distribution.
int proposal_step() {
  static const int steps[] = {-2, -1, 1, 2};
  std::uniform_int_distribution<int> pick(0, 3);
  return steps[pick(rng)];
}

void sample(int large_number) {
  // for a large number of a samples
  for (int i = 0; i < large_number; i++) {
    // get P(a)
    double p = P(a);
    // for each element of vector a
    for (int j = 0; j < n; j++) {
      // make the change to a[j], and get p1, the new value of p
      int step = proposal_step();
      a[j] += step;
      double p1 = P(a);
      bool keep_the_step = true;
      // if p1 is better than p, keep the step;
      // if p1 is worse than p, then keep the step p1/p of the time
      if (p1 < p) {
        keep_the_step = (unif(rng) < p1 / p);
      }
      if (keep_the_step) p = p1;
      else a[j] -= step;
    }
    // now a is a sample, and p is its value: record a and p here
  }
  // what you have now is a large random sampling of vectors from distribution P;
  // now you can choose the best one, the average, the variance,
  // any statistic you like
}

Ways to tune it are to widen or narrow the proposal distribution so it takes larger or smaller steps, or to have it take larger steps initially and smaller steps later. What you're looking for is a fraction of accepted steps that is neither too high nor too low. You probably also want a "burn-in" phase of an initial 1k or so samples that you throw away while the chain hunts for the area of the mode.

And by all means, profile P. It needs to be as fast as possible. Here's my favorite way to do that.

Mike Dunlavey
A: 

If you have access to MATLAB, you can parallelize your code pretty quickly and easily. It can even make simple linear for loops parallel with its parfor loop.

Midhat
A: 

If a Microsoft solution is an option, check out Solver Foundation. I heard about it on Scott Hanselman's podcast (#191).

Austin Salonen
+1  A: 

As a first-line algorithm for this type of problem, I would recommend Simulated Annealing. SA is a great first choice because you can clearly control your starting point and runtime.

If you know something about the structure of your 100-dimensional space, with SA you can choose a good starting point, and that can have a big impact on the quality of your result. With SA you can also control the 'cooling rate', which affects both runtime and the quality of your results - naturally in opposite directions. I typically run with a relatively fast cooling rate first to look for good starting vectors, and then slow the cooling rate in subsequent runs to improve the results. It's a kind of meta-SA technique that can be automated.

In the past, I've used SA successfully to maximize a very high-dimensional function used in modeling neutron-proton interactions.

Also, I would look at dimensionally reducing P() if possible. For your particular problem, are all 100 variables really needed? If you can fix half of them, you'll speed up any optimizer and end up with better results.

(And SA is easy to implement.)
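
For what it's worth, a minimal annealer with an explicit cooling-rate knob might look like the sketch below; the neighbour move, starting temperature, and cooling factor are illustrative assumptions, and P is the slow, user-supplied objective:

#include <array>
#include <cmath>
#include <random>

const int n = 100;
double P(const std::array<int, n>& x);       // the slow, user-supplied objective

std::array<int, n> anneal(std::array<int, n> x,
                          double temp = 10.0,     // starting temperature (illustrative)
                          double cooling = 0.99,  // closer to 1 = slower cooling
                          int iterations = 1000) {
  std::mt19937 rng(42);
  std::uniform_int_distribution<int> coord(0, n - 1);
  std::uniform_int_distribution<int> magnitude(1, 2);
  std::bernoulli_distribution negative(0.5);
  std::uniform_real_distribution<double> unif(0.0, 1.0);

  double value = P(x);
  for (int it = 0; it < iterations; it++) {
    // propose a small change to one randomly chosen coordinate
    std::array<int, n> candidate = x;
    int step = negative(rng) ? -magnitude(rng) : magnitude(rng);
    candidate[coord(rng)] += step;
    double cand_value = P(candidate);
    // always accept improvements; accept worse moves with a probability
    // that shrinks as the temperature drops
    if (cand_value > value || unif(rng) < std::exp((cand_value - value) / temp)) {
      x = candidate;
      value = cand_value;
    }
    temp *= cooling;                         // cool down
  }
  return x;
}

Running it once with a small 'cooling' value (fast cooling) to find promising starting vectors, and again from those vectors with 'cooling' closer to 1, matches the meta-SA idea described above.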

Paul
+1  A: 

Assumptions:

First, the variables must be integers.
Second, the objective function P() is non-linear.

Observation:

In general, non-linear integer programming is very difficult to solve. In practice, as recommended above, relaxing the integer restriction and rounding the resulting solution may help.

There are general unconstrained optimization techniques available. One approach that comes from experimental design is called 'response surface methodology'. It is very helpful when the cost of an experiment is significant. The approach is to run a set of experiments, starting from one point and deviating each of your inputs by a set increment. You then estimate the gradient with respect to each input, take a step in that direction for each, and repeat. Fletcher's Practical Methods of Optimization and Box, Hunter & Hunter's Statistics for Experimenters are places to look.
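
As a rough illustration of that deviate-and-step idea (not the full response-surface machinery), one pass of a finite-difference hill climb might look like this sketch; the increment, the step rule, and the P declaration are assumptions:

#include <array>

const int n = 100;                         // number of inputs
double P(const std::array<int, n>& x);     // the slow, user-supplied objective

// One pass of "deviate each input by a set increment, estimate the slope,
// then step uphill in each coordinate". delta = 1 is an illustrative choice.
std::array<int, n> finite_difference_step(std::array<int, n> x, int delta = 1) {
  const double base = P(x);
  std::array<double, n> slope{};
  for (int i = 0; i < n; i++) {
    std::array<int, n> probe = x;
    probe[i] += delta;                      // deviate input i
    slope[i] = (P(probe) - base) / delta;   // finite-difference slope estimate
  }
  for (int i = 0; i < n; i++) {
    if (slope[i] > 0) x[i] += delta;        // step in whichever direction
    else if (slope[i] < 0) x[i] -= delta;   // raises P
  }
  return x;  // repeat from the new point until P stops improving
}

Note that each pass costs n + 1 evaluations of P, which is why the experimental-design literature cited above puts so much emphasis on keeping the number of runs down.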

Grembo