views:

373

answers:

7

Hi,

I'm trying to develop a genetic algorithm that will find the most efficient way to connect a given number of nodes at specified locations.

All the nodes on the network must be able to connect to the server node and there must be no cycles within the network. It's basically a tree.

I have a function that can measure the "fitness" of any given network layout. What's stopping me is that I can't think of a crossover function that would take 2 network structures (parents) and somehow mix them to create offspring that would meet the above conditions.

Any ideas?

Clarification: The nodes each have a fixed x,y coordiante position. Only the routes between them can be altered.

+1  A: 

Sounds like you need to create a Minimum Spanning Tree network. I know that this doesn't really answer your genetic algorithm question, but this is quite a well understood problem. Two classical methods are Prim and Krustal. Maybe these algorithms and the methods they use to select edges to connect might give you some clues. Maybe the genes don't describe the network but instead the likelihood of connecting nodes via particular edges? Or a way of picking a node to connect in next?

Or just check out someone who's done this before, or this.

Trevor Tippins
+1  A: 

Let me begin by answering your question with a question :

How does the fitness function behave if you create a network layout that violates the 'no cycle' rule and the 'connect to server' rule?

If it simply punishes the given network layout via a poor fitness score, you don't need to do anything special except take two network layouts and cross them over, 1/2 from layout A, 1/2 from layout B. That's a very basic cross over function, and it should work.

If however, you are responsible for constructing a valid layout and cannot rely on invalid layouts simply being weeded out, you'll need to do more work.

Amir Afghani
+3  A: 

Amir- I think the idea is that every generated tree will contain the same set of nodes, arranged in a different order.

Perhaps rather than using a crossover-based genetic algorithm, you'd be better off using a less biologically inspired hill-climbing algorithm? Define a set of swaps (trading children between nodes, for example) to act as possible mutations, and then iteratively mutate and check against your fitness function. As is the case with all searches of this kind, you're vulnerable to getting stuck in local maxima, so many runs from different starting positions is a good idea.

JohnE
I assumed the nodes were always the same, the way they were connected varied. I can see now that this was probably the wrong way to think about it.
Amir Afghani
Also, you should use the comments for the first part of your answer :)
Amir Afghani
Haha- I would've, but I still need more "reputation points" to comment on other people's solutions. I will in the future. :)
JohnE
Well in that case I'll up your answer ;)
Amir Afghani
A: 

An idea to try: encode nodes' positions in a metric space (e.g. 3-dimensional euclidean space). There are no "incorrect" assignments, so crossover is never destructive. From such an assignment you can always build a nearest neighbor tree, a minimum spanning tree or similar.

This is just example of a more general idea: do not encode the tree directly, encode some information from which a tree can always be constructed. The tricky part is to do it in such a way that the child trees keep important properties of the parents.

Rafał Dowgird
A: 

The purpose of crossover in genetic algorithms is to potentially mix good partial solutions from one parent with those of another. One way to think about partial solutions in this case may be subtrees of closely connected nodes. If your fitness function is fairly smooth in regards to small changes to localized parts of the overall tree, this may be a useful way to think about crossover.

Given this, one possible form of crossover would be the following:

Start with two parent trees, P1, and P2. Select two nodes randomly (possibly with some kind of enforcement on minimum distance between the nodes), N1 and N2.

On a node-by-node basis, "grow" a tree C1 outwards from N1 according to the linkages in P1, while simultaneously growing another tree outwards from N2 starting with P2. Do not add the same node to both trees - keep the sets of nodes entirely disjoint. Continue until all nodes have been added to either C1 or C2. This gives us the "traits" from each parent to recombine, in a form guaranteed to be acyclic.

Recombination is accomplished by adding an additional link, from C1 to C2, to create the new child C. As for which link to choose, this can be randomly selected (either uniformly or according to some distribution), or it could be selected by a greedy algorithm (based on some heuristic, or the overall fitness of the new tree C).

Justin W
There were a lot of good possible solutions here, I wish we'd been given more feedback from the OP.
Justin W
A: 

You could check on the crossover operators, which make sure that you have no repeating nodes in the child chromosomes. Couple of those crossover operators are the Order Crossover (OX) and the Edge Crossover operators. Such crossover operators are also helpful in solving TSP using GAs.

And after that you will have to check if you are getting any cycles or not. If yes, generate a new pair of chromosome. This is brute force of course.

Amit
A: 

There was a paper in one of the early conferences that proposed the following algorithm for Traveling Salesman, which I have adapted for several graph problems with success:

Across the entire POPULATION, calculate and sort the nodes by descending number of connections (in other words, if N0 is connected in some individuals to N1, N2, N3, it has 3 connections, if N1 is always connected to N4, it has only 1).

Initially, take the node with the highest count. Call this the current_gene_node. (Say, N0)

LOOP: Add current_gene_node to your offspring.

Remove that node from the lists of connections. (No cycles, so remove N0 from further consideration.)

If current_gene_node has no connections in the population, choose a random unchosen node in the population (mutation)

Else, from the list of connections for that node, do a lottery selection based on the prevalence of connections across the population (If current_gene_node = N0, and connections N0 are, say, N1 = 50%, N2 = 30%, N3 = 20% -- N1 has a 50% chance of being next current_gene_node).

Go to LOOP until all nodes connected

It's not really genetic in the sense of choosing directly from 2 parents, but it follows the same mathematical pressure of selecting based on population prevalence. So it's "genetic enough" for me and for me it's worked pretty well :-)

Larry OBrien