views:

403

answers:

8

Hi

I was wondering if anyone had any suggestions for minimizing a function, f(x,y), where x and y are integers. I have researched lots of minimization and optimization techniques, like BFGS and others out of GSL, and things out of Numerical Recipes. So far, I have tried implementing a couple of different schemes. The first works by picking the direction of largest descent among f(x+1,y), f(x-1,y), f(x,y+1), f(x,y-1), and following that direction with line minimization. I have also tried using a downhill simplex (Nelder-Mead) method. Both methods get stuck far away from a minimum. They both appear to work on simpler functions, like finding the minimum of a paraboloid, but I think that both, and especially the former, are designed for functions where x and y are real-valued (doubles). One more problem is that I need to call f(x,y) as few times as possible: it talks to external hardware, and takes a couple of seconds for each call. Any ideas would be greatly appreciated.
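The neighbour-polling scheme described above can be sketched roughly as follows (Python used only for illustration; the paraboloid f is a cheap stand-in for the real hardware-backed function, and every name here is hypothetical):

```python
# Sketch of the "largest descent among the four integer neighbours,
# then integer line search" scheme. f is a toy stand-in with an
# integer minimum at (3, -2); the real f(x, y) talks to hardware.

def f(x, y):
    return (x - 3) ** 2 + (y + 2) ** 2

def steepest_neighbour_descent(f, x, y, max_iters=100):
    best = f(x, y)
    for _ in range(max_iters):
        # evaluate the four axis-aligned integer neighbours
        neighbours = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        vals = [f(nx, ny) for nx, ny in neighbours]
        lo = min(range(4), key=lambda i: vals[i])
        if vals[lo] >= best:
            return (x, y), best          # no neighbour improves: done
        dx, dy = neighbours[lo][0] - x, neighbours[lo][1] - y
        # integer "line search": keep stepping while f keeps decreasing
        while True:
            nx, ny = x + dx, y + dy
            v = f(nx, ny)
            if v >= best:
                break
            x, y, best = nx, ny, v
    return (x, y), best

print(steepest_neighbour_descent(f, 10, 10))  # → ((3, -2), 0)
```

On a convex toy like this it converges, but on a function with diagonal valleys the axis-only polling can stall, which matches the "gets stuck far from a minimum" behaviour reported.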

Here's an example of the error function; sorry I didn't post this before. This function takes a couple of seconds to evaluate. Also, the information we query from the device does not add to the error if it is below our desired value, only if it is above.

double Error(int x, int y)
{
  SetDeviceParams(x,y);
  double a = QueryParamA();
  double b = QueryParamB();
  double c = QueryParamC();
  double _fReturnable = 0;
  // each queried value only contributes error when it exceeds its target
  if(a>=A_desired)
  {
    _fReturnable+=(A_desired-a)*(A_desired-a);
  }
  if(b>=B_desired)
  {
    _fReturnable+=(B_desired-b)*(B_desired-b);
  }
  if(c>=C_desired)
  {
    _fReturnable+=(C_desired-c)*(C_desired-c);
  }
  return Math.Sqrt(_fReturnable);
}
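Since each call drives real hardware and takes seconds, whatever search is used, it may be worth caching results so the same integer point is never paid for twice. A minimal sketch, with a hypothetical slow_error standing in for the real Error(x,y):

```python
# Cache results of an expensive objective so a search that revisits an
# integer point never pays for a second "hardware" call. slow_error is
# a made-up stand-in; calls counts real evaluations.
from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)
def slow_error(x, y):
    global calls
    calls += 1                      # one real device round-trip
    return ((x - 5) ** 2 + (y - 7) ** 2) ** 0.5

# Three queries of the same point cost only one real call:
for _ in range(3):
    slow_error(5, 7)
print(calls)  # → 1
```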
+3  A: 

Have you looked at genetic algorithms? They are very, very good at finding minima and maxima, while avoiding getting trapped in local minima/maxima.
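For illustration, a minimal (and deliberately naive) GA over integer (x, y) in [0, 65535]² might look like the sketch below; the objective f is a toy stand-in, and the population size, mutation range, and generation count are made-up numbers a real run would tune against the evaluation budget:

```python
# Minimal GA sketch for integer (x, y) in [0, 65535]^2. Truncation
# selection keeps the best half each generation (so the best individual
# is never lost), crossover mixes coordinates, mutation nudges by a
# small random integer amount.
import random

def f(x, y):                         # toy stand-in for the error function
    return (x - 1234) ** 2 + (y - 4321) ** 2

def evolve(f, pop_size=20, generations=60, seed=0):
    rng = random.Random(seed)
    pop = [(rng.randrange(65536), rng.randrange(65536))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: f(*p))
        survivors = pop[:pop_size // 2]          # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            (x1, y1), (x2, y2) = rng.sample(survivors, 2)
            child = [rng.choice((x1, x2)), rng.choice((y1, y2))]  # crossover
            if rng.random() < 0.5:                                # mutation
                i = rng.randrange(2)
                child[i] = min(65535, max(0, child[i] + rng.randint(-200, 200)))
            children.append(tuple(child))
        pop = survivors + children
    return min(pop, key=lambda p: f(*p))
```

Note that with a two-second objective, even this small run is 20 × 60 evaluations, which is likely far over budget; GAs are sample-hungry.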

Daniel
Well, they get gradually better, one generation at a time!
Daniel Earwicker
They are non-linear though :)
Indeera
+2  A: 

If it's an arbitrary function, there's no neat way of doing this.

Suppose we have a function defined as:

f(x, y) = 0 for x==100, y==100
          100 otherwise

How could any algorithm realistically find (100, 100) as the minimum? The minimum could be at any possible combination of values.

Do you know anything about the function you're testing?

Jon Skeet
f(x,y) is a calibration error function. There are two parameters that I am interested in; both are affected by changes in x and y. The function is fairly continuous, but its derivative may not be, because as soon as either of the parameters is in spec, I just set its error term to 0.
Tim
@Jon Skeet: That is of course assuming that the function could be *anything*, which indeed forces you to try every combination of (x, y). If you know the function is pseudo-continuous, things are simplified hugely.
Noldorin
@Noldorin: That's why I asked what the OP knew about the function. At the time I posted, the function I gave would have satisfied everything we knew.
Jon Skeet
@Jon Skeet: Indeed, the function you posted was an example of a fairly arbitrary function. However, it's very likely that the function is *not* completely arbitrary (as we know it now can't be, being an error function), meaning a better approach is bound to be possible.
Noldorin
+2  A: 

How do you define f(x,y)? Minimisation is a hard problem, depending on the complexity of your function.

Genetic Algorithms could be a good candidate.

Resources:

Genetic Algorithms in Search, Optimization, and Machine Learning

Implementing a Genetic Algorithm in C#

Simple C# GA

Indeera
Do you have any suggestions on good resources to get started with genetic algorithms?
Tim
There are loads of books on this subject. What got me started was the Goldberg book. http://www.amazon.com/Genetic-Algorithms-Optimization-Machine-Learning/dp/0201157675
Indeera
+4  A: 
Brian Ramsay
Why are my links broken? It doesn't look like this on the preview.
Brian Ramsay
+1  A: 

What you are generally looking for is called an optimisation technique in mathematics. In general, they apply to real-valued functions, but many can be adapted for integral-valued functions.

In particular, I would recommend looking into non-linear programming and gradient descent. Both would seem quite suitable for your application.

If you could perhaps provide any more details, I might be able to suggest something a little more specific.
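As a sketch of what gradient descent might look like on an integer grid, with the multi-start idea from the comments below folded in (f here is a toy convex stand-in, and all names are hypothetical):

```python
# Gradient descent adapted to an integer lattice: estimate the gradient
# by central differences on neighbouring lattice points, step one unit
# against the sign of each component, and restart from several random
# points to reduce the risk of settling in a local minimum.
import random

def f(x, y):                          # toy convex stand-in, minimum (40, 9)
    return (x - 40) ** 2 + 2 * (y - 9) ** 2

def integer_gradient_descent(f, x, y, max_steps=200):
    best = f(x, y)
    for _ in range(max_steps):
        gx = f(x + 1, y) - f(x - 1, y)   # central difference in x
        gy = f(x, y + 1) - f(x, y - 1)   # central difference in y
        nx = x - (1 if gx > 0 else -1 if gx < 0 else 0)
        ny = y - (1 if gy > 0 else -1 if gy < 0 else 0)
        v = f(nx, ny)
        if v >= best:                    # no downhill lattice step left
            break
        x, y, best = nx, ny, v
    return (x, y), best

def multi_start(f, restarts=5, seed=1):
    rng = random.Random(seed)
    starts = [(rng.randrange(100), rng.randrange(100))
              for _ in range(restarts)]
    return min((integer_gradient_descent(f, *s) for s in starts),
               key=lambda r: r[1])

print(multi_start(f))  # converges to ((40, 9), 0) on this convex toy
```

Each iteration costs four objective calls for the finite differences, so on a two-second objective the step budget would need to be kept very small.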

Noldorin
As I said before, it's going to be used in a calibration program, so there will be lots of devices that have similar but not identical shapes for their error functions. I do not know exactly what the shape looks like, because there is a lot of data that I would need to collect. Both x and y are between 0 and 65535, and it takes a couple of seconds to collect one piece of data.
Tim
@Tim: Fair enough. Could you perhaps give us the form of this error function however? I'm not terribly optimistic about success here, if a request takes a matter of seconds!
Noldorin
This is basically what happens. I apologize if its a little ambiguous. double Error(x,y) { SetDeviceParams(x,y); double a = QueryParamA(); double b = QueryParamB(); double c = QueryParamC(); double _fReturnable = 0 if(a>=A_desired) { _fReturnable+=(A_desired-a)*(A_desired-a); } if(b>=B_desired) { _fReturnable+=(B_desired-b)*(B_desired-b); } if(c>=C_desired) { _fReturnable+=(C_desired-c)*(C_desired-c); } return Math.sqrt(_fReturnable) }
Tim
+1. To avoid being trapped in a local minimum, try running gradient descent a few times from random starting places.
Gabe Moothart
@Gabe: Yeah, exactly. Unless you have a very good idea of where the minimum is near, you need to run the algorithm from a range of starting points.
Noldorin
+1  A: 

Jon Skeet's answer is correct. You really do need information about f and its derivatives, even if f is everywhere continuous.

The easiest way to appreciate the difficulty of what you ask (minimization of f at integer values only) is to think about an f: R->R (a real-valued function of one real variable) that makes large excursions between individual integers. You can easily construct such a function so that there is NO correlation between the local minima on the real line and the minima at the integers, as well as no relationship to the first derivative.

For an arbitrary function I see no way except brute force.
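A concrete one-variable instance of this construction (toy numbers, purely illustrative): the function below is smooth, dips to about -100 between every pair of integers, yet looks like a plain parabola when sampled only at integer points.

```python
# Smooth f: R -> R with deep excursions of depth 100 between every pair
# of integers (the sin^2 term vanishes at integers), plus a mild bowl
# centred at x = 7. Integer samples see only the bowl.
import math

def f(x):
    return -100 * math.sin(math.pi * x) ** 2 + (x - 7) ** 2

integer_values = {k: f(k) for k in range(0, 15)}
print(min(integer_values, key=integer_values.get))  # → 7, where f(7) ≈ 0
print(round(f(6.5), 2))                             # → -99.75
```

Integer sampling picks x = 7 with value about 0, while the true minima sit near the half-integers at about -99.75, and nothing in the integer samples (or their differences) hints at the dips.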

Based on the testing that I have done, the gradient is small everywhere, so the function does not change much between integer values, but I can't predict in which direction it will change. Brute force won't work, because it takes so long to get a single piece of data.
Tim
so are you now saying that, besides the problem of looking at only integral values, you don't even know f, and you are only going to sample a subset of the {(x,y)} and try to draw conclusions from that limited subset?
@pgast Unfortunately, that's pretty much what I am saying. I have enough data that I have a good guess of a starting point, but that is about it. The good news is that I don't necessarily care about "the best" solution, as long as the solution I get meets spec.
Tim
If this is a life-dependent application then I would start looking for another job, as what you are being asked to do would be negligence of the first order. If not, then I would probably brute force it with an intelligent choice of subset. What is the limit, nsamples, on the number of samples you can take? I would try to overlay a grid on 64K x 64K such that the number of grid points is nsamples, sample at the grid points, and choose the minimum.
If you have real knowledge of the function, then the initial grid may have fewer points, say nsamples-m, and you can use the last m samplings to brute force around the minimum of the first (nsamples-m) samples. If I understand you, you are searching for an (x,y) parameter pair in order to set up a device so that the error is small.
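The budgeted two-phase grid idea could be sketched like this (the absolute-value objective and the budget numbers are made up purely for illustration):

```python
# Two-phase budgeted grid search over [0, 65535]^2: spend roughly
# nsamples - m calls on a coarse grid, then m more on a finer grid
# around the coarse winner. f is a cheap stand-in for the real,
# seconds-per-call objective.

def f(x, y):
    return abs(x - 20000) + abs(y - 50000)

def budgeted_grid_search(f, nsamples=100, m=20):
    # coarse phase: about sqrt(nsamples - m) points per axis
    side = max(2, int((nsamples - m) ** 0.5))
    step = 65535 // (side - 1)
    coarse = [(i * step, j * step)
              for i in range(side) for j in range(side)]
    bx, by = min(coarse, key=lambda p: f(*p))
    # refine phase: m extra samples clustered around the coarse winner
    fine_side = max(2, int(m ** 0.5))
    fine_step = max(1, step // (fine_side - 1))
    fine = [(min(65535, max(0, bx + (i - fine_side // 2) * fine_step)),
             min(65535, max(0, by + (j - fine_side // 2) * fine_step)))
            for i in range(fine_side) for j in range(fine_side)]
    return min(coarse + fine, key=lambda p: f(*p))
```

With nsamples = 100 the coarse grid is 8 x 8 with spacing of about 9362, so the refinement only narrows the answer to within a few thousand counts per axis; the budget dictates the achievable resolution.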
A: 

Sorry the formatting was so bad previously. Here's an example of the error function:

double Error(int x, int y)
{
  SetDeviceParams(x,y);
  double a = QueryParamA();
  double b = QueryParamB();
  double c = QueryParamC();
  double _fReturnable = 0;
  if(a>=A_desired)
  {
    _fReturnable+=(A_desired-a)*(A_desired-a);
  }
  if(b>=B_desired)
  {
    _fReturnable+=(B_desired-b)*(B_desired-b);
  }
  if(c>=C_desired)
  {
    _fReturnable+=(C_desired-c)*(C_desired-c);
  }
  return Math.Sqrt(_fReturnable);
}
Tim
Tim, edit your QUESTION to post this.
Daniel
Ah, well, I have done it for you. The formatting ended up different, as I foolishly copied from the displayed text instead of the editing text.
Daniel
Done. Sorry about that
Tim
+1  A: 

So let's look at your problem in math-speak. This is all assuming I understand your problem fully. Feel free to correct me if I am mistaken.

We want to minimize the following:

\sqrt(Pos(a - a_desired)^2 + Pos(b - b_desired)^2 + Pos(c - c_desired)^2)

or, in other notation, ||Pos(x - x_desired)||_2

where x = (a, b, c) and Pos(y) = max(y, 0) denotes the "positive part" (this accounts for your if statements). Finally, we wish to restrict ourselves to solutions where x is integer valued.

Unlike the above posters, I don't think genetic algorithms are what you want at all.
In fact, I think the solution is much easier (assuming I am understanding your problem).

1) Run any optimization routine on the function above. This will give you the solution x^* = (a^*, b^*, c^*). As this function is increasing with respect to the variables, the best integer solution you can hope for is (ceil(a^*), ceil(b^*), ceil(c^*)).

Now you say that your function is possibly hard to evaluate. There exist tools for this which are not based on heuristics; they go under the name Derivative-Free Optimization. People use these tools to optimize objectives based on simulations (I have even heard of a case where the objective function was based on crop growing yields!).

Each of these methods has different properties, but in general they attempt to minimize not only the objective, but also the number of objective function evaluations.
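One of the simplest derivative-free methods is compass (pattern) search: poll the four axis directions at the current step size, shrink the step when no poll improves, and stop once the step reaches the lattice spacing. A sketch, with a hypothetical linear "device" standing in for the real queries:

```python
# Compass (pattern) search on an integer grid. The objective mimics
# ||Pos(x - x_desired)||_2: a hypothetical device whose readings a, b
# depend linearly on (x, y), with error only for over-spec readings.
import math

A_DESIRED, B_DESIRED = 10.0, 20.0

def error(x, y):
    a, b = 0.01 * x, 0.01 * y          # made-up device response
    pa = max(a - A_DESIRED, 0.0)       # only over-spec values add error
    pb = max(b - B_DESIRED, 0.0)
    return math.sqrt(pa * pa + pb * pb)

def compass_search(f, x, y, step=1024):
    best = f(x, y)
    while step >= 1:
        improved = False
        for dx, dy in ((step, 0), (-step, 0), (0, step), (0, -step)):
            v = f(x + dx, y + dy)
            if v < best:
                x, y, best = x + dx, y + dy, v
                improved = True
                break                   # accept first improving poll
        if not improved:
            step //= 2                  # shrink the poll radius
    return (x, y), best

point, value = compass_search(error, 60000, 60000)
print(value)  # → 0.0 (an in-spec point is found)
```

Starting the step large and halving it keeps the number of objective calls roughly logarithmic in the 0..65535 range per axis, which matters when every call takes seconds.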

SplittingField
+1 for derivative-free optimization, but the math-y restatement could use an explicit mention of the fact that "x" is a function, and perhaps rename "x" so it doesn't collide with the variables from the posted source.
outis
x is not a function, just the collection of the variables a,b,c into a single vector.
SplittingField
Tim doesn't want to minimize `Err_A(x) = ||Pos(x - A)||₂`, but rather `Err_A(Dev(u))`, where `Dev(u):Z² -> R³`. If the soln. is in `x`, it doesn't need to be in integers. Furthermore, `ceil·x'` might not be a valid soln. in `x`, as there may not be a `u'` such that `ceil·x'=Dev(u')`. For the extensions of `Dev` to some `Dev'(u):R² -> R³` I can imagine (terraces, meshes), solns. in `u` would be at integer values. If an exotic `Dev'(u)` had a minimum at `u'∉Z²`, elements of `{ceil, floor}² · u'` may not be solutions.
outis
Interesting... I need to think about this some more in a bit, but clearly I am misunderstanding something in his function def. Still, I think there are other non-heuristic solutions which I know are used in industry for this type of problem... an example is MADS.
SplittingField
`Dev` represents the device. It's the first 4 lines of his `Error(x,y)` function.
outis