views:

130

answers:

2

Hello, I'm facing a parameter selection problem, which I would like to solve using Genetic Algorithm (GA). I'm supposed to select not more than 4 parameters out of 3000 possible ones. Using the binary chromosome representation seems like a natural choice. The evaluation function punishes too many "selected" attributes and if the number of attributes is acceptable, it then evaluates the selection.

The problem is that in these sparse conditions the GA can hardly improve the population. Neither the average fitness cost, nor the fitness of the "worst" individual improves over the generations. All I see is slight (even tiny) improvement in the score of the best individual, which, I suppose, is a result of random sampling.

Encoding the problem using indices of the parameters doesn't work either. This is most probably, due to the fact that the chromosomes are directional, while the selection problem isn't (i.e. chromosomes [1, 2, 3, 4]; [4, 3, 2, 1]; [3, 2, 4, 1] etc. are identical)

What problem representation would you suggest?

P.S If this matters, I use PyEvolve.

+2  A: 

I'm not familiar with with PyEvolve, but from what I can recall about Genetic algorithms, you are concerned with 4 steps,

  1. Chormosome Evalutation (you probably already have this figured out)
  2. Chromosome Initialization
  3. Chromosome Crossover
  4. Chromosome Mutation

I think you can do this with lists just fine. You'll need to overload some operators, but it looks like PyEvolve lets you do this A simple thing would be to keep the list representation, just sort them numerically before you return the chromosome.

I would need to know more about your problem, but here are my suggestions. Since you have a variable number of parameters, you need to come up with some sort of probability distribution for number of parameters in a chromosome. I'll assume a uniform random on 1,2,3,4 here, but you can try something else if you like that better. I'm going to call this distribution P_n.

  1. Initialization. Seed your population with (at least) 3000 chromosomes. Call these c_1,...,c_3000. Draw n_j from P_n. put j into c_j. Choose the remaining n_j - 1 parameters with a uniform random distribution from the remaining parameters.
  2. Crossover. Lets suppose that we have two chromosomes. C_1 and C_2. We're going to create (and return) chromosome C_3. Choose n_3 from {n_1, n_2} each with probability 1/2. Now put the parameters of C_1 and C_2 into one list (and unique them, so if both C_1 and C_2 contain parameter 1, it is only in the list once). Draw n_3 parameters from the joint list and put them into chromosome C_3.
  3. Mutation. Given Chromosome C_1, draw n_1* from P_n. if n_1* is < n_1, randomly delete elements from C_1 until has n_1* elements. If n_1* = n_1 randomly choose 1 element from C_1 and replace it with a parameter randomly chosen from those not in C_1. If n_1* > n_1 randomly add elements to C_1 until is has size n_1*.

Now, there are many ways to go about this, so do what makes the most sense for your problem.

leif
Please correct:"Now put the parameters of C_1 and C_3 into one list" to :Now put the parameters of C_1 and C_2 into one list"(note C_2 instead of C_3)
bgbg
Right. Does it seem like you are getting decent exploration of the parameter space this way?
leif
Thank you, leif. Your suggestion worked like a charm for me. Is this method described (published) anywhere or is it your development?
bgbg
I made this up, but it's certainly based similar kinds of problems.
leif
can you please contact me: lists_boris< A T >gorelik.net
bgbg
A: 

I think a singular value decomposition (http://en.wikipedia.org/wiki/Singular_value_decomposition) might be more appropriate here on account of the limit on the number of parameters.

Dave