views:

61

answers:

4

I have a set of n real numbers. I also have a set of functions,

f_1, f_2, ..., f_m.

Each of these functions takes a list of numbers as its argument. I also have a set of m ranges,

[l_1, u_1], [l_2, u_2], ..., [l_m, u_m].

I want to repeatedly choose a subset {r_1, r_2, ..., r_k} of k elements such that

l_i <= f_i({r_1, r_2, ..., r_k}) <= u_i for 1 <= i <= m.

Note that the functions are smooth. Changing one element in {r_1, r_2, ..., r_k} will not change f_i({r_1, r_2, ..., r_k}) by much. average and variance are two f_i that are commonly used.

These are the m constraints that I need to satisfy.

Moreover I want to do this so that the set of subsets I choose is uniformly distributed over the set of all subsets of size k that satisfy these m constraints. Not only that, but I want to do this in an efficient manner. How quickly it runs will depend on the density of solutions within the space of all possible solutions (if this is 0.0, then the algorithm can run forever). (Assume that f_i (for any i) can be computed in a constant amount of time.)

Note that n is large enough that I cannot brute-force the problem. That is, I cannot just iterate through all k-element subsets and find which ones satisfy the m constraints.

Is there a way to do this?

What sorts of techniques are commonly used for a CSP like this? Can someone point me in the direction of good books or articles that talk about problems like this (not just CSPs in general, but CSPs involving continuous, as opposed to discrete values)?

A: 

Given the problem as you've described it, you can pick from each range r_i uniformly and throw away any m-dimensional point that fails to meet the criterion. It will be uniformly distributed because the original is uniformly distributed and the set of subsets is a binary mask over the original.

Without knowing more about the shape of f, you can't make any guarantees about whether time is polynomial or not (or even have any idea of how to hit a spot that meets the constraint). After all, if f_1 = (x^2 + y^2 - 1) and f_2 = (1 - x^2 - y^2) and the constraints are f_1 < 0 and f_2 < 0, you can't satisfy this at all (and without access to the analytic form of the functions, you could never know for sure).

Rex Kerr
Funny how three of us answere within a minute --- 17 minutes after the question was posted.
James Curran
+1  A: 

Assuming you're looking to write your own application and use existing libraries to do this, there are choices in many languages, like Python-constraint, or Cream or Choco for Java, or CSP for C++. The way you've described the problem it sound like you're looking for a general purpose CSP solver. Are there any properties of your functions that may help reduce the complexity, such as being monotonic?

John Ellinwood
The functions are smooth. Changing one element in {r_1, r_2, ..., r_k} will not change f_i({r_1, r_2, ..., r_k}) by much.
Paul Reiners
@Paul - Mathematical smoothness doesn't help much. Do you know something about the first partial derivatives, and bounds on the magnitude of higher derivatives? That could help.
Rex Kerr
average and variance are two of the f_i that are commonly used. I have no idea what the first partial derivatives and bounds on the magnitudes of the higher derivatives are for those. We also use one or two more complicated statistical functions.
Paul Reiners
A: 

Given the information in your message, I'm not sure it can be done at all...

Consider:

  • numbers = {1....100}
  • m = 1 (keep it simple)
  • F1 = Average
  • L1 = 10
  • U1 = 50

Now, how many subset of {1...100} can you come up with that produces an average between 10 & 50?

James Curran
How quickly it runs will depend on the density of solutions within the space of all possible solutions (if this is 0.0, then the algorithm can run forever).
Paul Reiners
A: 

This looks like a very hard problem. For the simplest case with linear functions you could take a look at linear programming.

starblue
Unfortunately, they're not linear. And, yes, I agree that it's very hard!
Paul Reiners