Hi, I would like to implement a GA-based feature selection program for an SGI UV system (512 cores, 4 TB shared memory), operating as described below. (Feature selection is the process of finding the smallest subset of the original features that discriminates between the output classes at least as well as the full feature set does on the given data. For example, if the data consists of {Atmospheric Pressure, Temperature, myShoeSize} as independent variables and the output is Rainfall, a possible outcome of feature selection would be {Atmospheric Pressure, Temperature}.)
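To make the encoding concrete, here is a minimal sketch of how a feature subset could be represented as a 0/1 mask, one bit per original feature. The helper names `decode` and `penalised_fitness`, and the size penalty `alpha`, are assumptions for illustration only and do not come from any particular GA library.

```python
FEATURES = ["Atmospheric Pressure", "Temperature", "myShoeSize"]


def decode(mask, names=FEATURES):
    """Return the names of the features switched on by a 0/1 mask."""
    return [name for bit, name in zip(mask, names) if bit]


def penalised_fitness(accuracy, mask, alpha=0.03):
    """Hypothetical fitness: classifier accuracy minus a small penalty
    proportional to the fraction of features kept, so that of two
    equally accurate subsets the smaller one scores higher."""
    return accuracy - alpha * sum(mask) / len(mask)


print(decode([1, 1, 0]))  # ['Atmospheric Pressure', 'Temperature']
```

With this encoding the GA's chromosomes are just bit strings, so standard bit-flip mutation and crossover apply directly.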

The GA keeps a pool of parents, each of which represents a different feature subset. These parents need to be evaluated using a Support Vector Machine or any other machine learning method (neural network etc.). I want each parent to be sent to the next available CPU core, evaluated there by any program, and its fitness sent back to the GA. The GA will therefore be responsible for everything except the fitness evaluation of each parent: it sends the parent to an available core and waits for the fitness result. This is where the distributed part of the method lies (so I don't want several GAs running on different cores, just one GA running on one core and spawning fitness evaluators on other cores).
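The single-GA, many-evaluators layout described above could be sketched roughly as follows with Python's standard `concurrent.futures`. The `evaluate` function is a toy stand-in (an assumption) for training and scoring an SVM on the selected columns, and `evaluate_population` is a hypothetical helper name.

```python
from concurrent.futures import Executor


def evaluate(mask):
    """Toy stand-in fitness (assumption): features 0 and 1 are useful,
    every selected feature costs 0.1. A real evaluator would train and
    cross-validate an SVM on the columns selected by the mask."""
    return mask[0] + mask[1] - 0.1 * sum(mask)


def evaluate_population(population, executor: Executor):
    """Send one evaluation per parent to the executor's workers and
    return the fitness values in the same order as the population."""
    return list(executor.map(evaluate, population))
```

The function accepts any `Executor`. For CPU-bound evaluation in Python itself, a `ProcessPoolExecutor(max_workers=...)` created under an `if __name__ == "__main__":` guard spreads the work over cores; if each evaluation merely launches an external program, a `ThreadPoolExecutor` whose threads block on that program would also do.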

In order to take advantage of the distributed-computing features of my hardware, I would like the GA to operate in an asynchronous mode in which there are two groups of parents: those whose fitness has been evaluated and those still waiting for it. When there is a free core, the GA takes a parent from the un-evaluated pool and sends it to that core. In the meantime, the GA takes parents from the evaluated pool, applies mutation, crossover etc. to them, and sends the children to the un-evaluated pool, and so on.
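A rough sketch of this asynchronous scheme, under some simplifying assumptions: children are bred on demand whenever a worker is free, so the un-evaluated pool is just the set of in-flight jobs; the fitness function is a toy stand-in for the SVM; and the names `async_ga`, `make_child`, `USEFUL`, `budget` and `pool_size` are all hypothetical.

```python
import random
from concurrent.futures import FIRST_COMPLETED, wait

N_FEATURES = 8
USEFUL = (0, 1)  # assumption: pretend only these features carry signal


def fitness(mask):
    """Toy stand-in for an SVM cross-validation score."""
    return sum(mask[i] for i in USEFUL) - 0.1 * sum(mask)


def make_child(evaluated, rng):
    """Binary tournament selection, uniform crossover, bit-flip mutation."""
    def pick():
        a, b = rng.choice(evaluated), rng.choice(evaluated)
        return a[1] if a[0] >= b[0] else b[1]
    p1, p2 = pick(), pick()
    child = [rng.choice(pair) for pair in zip(p1, p2)]
    return tuple(bit ^ (rng.random() < 1.0 / len(child)) for bit in child)


def async_ga(executor, workers, budget, pool_size=20, seed=0):
    """Run `budget` evaluations, keeping up to `workers` jobs in flight.
    Returns the best (fitness, mask) pair found."""
    rng = random.Random(seed)
    evaluated = []  # evaluated pool: (fitness, mask) pairs
    pending = {}    # in-flight jobs: future -> mask
    submitted = 0
    while submitted < budget or pending:
        # Top up the workers; breed from whatever has been evaluated so far.
        while submitted < budget and len(pending) < workers:
            if len(evaluated) < 2:
                mask = tuple(rng.randint(0, 1) for _ in range(N_FEATURES))
            else:
                mask = make_child(evaluated, rng)
            pending[executor.submit(fitness, mask)] = mask
            submitted += 1
        # Block only until the first job finishes, not the whole batch.
        done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
        for fut in done:
            evaluated.append((fut.result(), pending.pop(fut)))
        evaluated.sort(reverse=True)
        del evaluated[pool_size:]  # drop the worst to bound the pool
    return evaluated[0]
```

The GA never waits for a full generation: as soon as any evaluation returns, its result joins the evaluated pool and a new child is dispatched. As in the previous sketch, `executor` can be any `concurrent.futures.Executor`, with `workers` matching its size.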

So, my idea is to get an open source GA library and modify its evaluation function a little. If the library already offers this 'asynchronous' mode, so much the better. In addition, I would like the library to offer a lot of features, e.g. a cellular GA. Whatever comes out will be open source too.

Does anyone have any suggestions? Btw, does anyone know of any publications about this 'asynchronous' mode, or do you see any disadvantages with it?

Thanks for your time and answers,

Bliako

A: 

Hi.

Try JGAP. It is more about genetic programming, but it has GA support, and it is open source, so you can modify it. It also has distributed computing support.

yoosiba