views:

352

answers:

2

I'm trying to code a simple genetic programming utility in python. But right now I'm stuck at the crossover/mate function for my trees. The trees are built by nested lists and look something like this:

# f = internal node (a function), c = leaf node (a constant)
tree1 = [f, [f, [f, c, c], [f, c, c]], [f, [f, c, c], [f, c, c]]]
tree2 = [f, [f, [f, c, c], c], [f, [f, c, c], c]]

I want to randomly select a point in each tree to split at and then I want one part from each tree to be combined into a new tree. There is also a max depth that shouldn't be exceeded so the selects can't really take place anywhere in the tree as it might create a too large tree. Below is an example on how it should work:

# f:n, where n is the number of arguments the function take
#               + split here  
tree1 = [f:2, [f:3, a, a, a], a]
#                            + split here
tree2 = [f:2, [f:2, a, a], [f:1, a]

tree_child1 = [f:2, [f:1, a], a]
tree_child2 = [f:2, [f:2, a, a], [f:3, a, a, a]]

I have no idea (at the moment) on how to solve this. Any tips or solutions are more than welcome!

(Added my parse function as it might help someone to understand the structure better.)

# My recursive code to parse the tree.
def parse(self, node=None):
    if not node:
        node = self.root

    if isinstance(node, list):
        function = node[0]
        res = []
        for child in node[1:function.arity+1]:
            res.append(self.parse(child))
        value = function.parse(*res) # function
    else:
        value = node.parse() # constant
    return value
A: 

If you store in each internal node a count of the children in each branch, then you could pick a split point by generating a random number from 0 to 1+total children. If the answer is 1, split at that node, otherwise use the number to figure out which subtree to descend to, and repeat the process.

John Fouhy
+1  A: 

I ended up implementing most of this as an exercise.

First, find the number of possible locations to split: the number of non-function nodes.

def count(obj):
    total = 0
    for o in obj[1:]:
        # Add the node itself.
        total += 1

        if isinstance(o, list):
            total += count(o)
    return total

Then, a helper: given an index in the above range, figure out where it is.

def find_idx(tree, idx):
    """
    Return the node containing the idx'th function parameter, and the index of that
    parameter.  If the tree contains fewer than idx parameters, return (None, None).
    """
    if not isinstance(idx, list):
        # Stash this in a list, so recursive calls share the same value.
        idx = [idx]

    for i, o in enumerate(tree):
        # Skip the function itself.
        if i == 0:
            continue

        if idx[0] == 0:
            return tree, i

        idx[0] -= 1
        if isinstance(o, list):
            container, result_index = find_idx(o, idx)
            if container is not None:
                return container, result_index

    return None, None

Doing the swap is pretty simple now:

def random_swap(tree1, tree2):
    from random import randrange
    pos_in_1 = randrange(0, count(tree1))
    pos_in_2 = randrange(0, count(tree2))

    parent1, idx1 = find_idx(tree1, pos_in_1)
    parent2, idx2 = find_idx(tree2, pos_in_2)

    # Swap:
    parent1[idx1], parent2[idx2] = parent2[idx2], parent1[idx1]

c = 1
tree1 = ["f:2", c, ["f:1", c]]
tree2 = ["f:2", ["f:2", ["f:2", c, c], ["f:2", c, c]], ["f:3", ["f:4", c, c, c, c], ["f:2", c, c], c]]

while True:
    random_swap(tree1, tree2)
    print tree1
    print tree2

This doesn't implement a max depth, but it's a start.

This will also never replace the root node, where a node in tree1 becomes the new tree2 and all of tree2 becomes a node in tree1. A workaround would be to wrap the whole thing in eg. [lambda a: a, tree], so editable nodes always have a parent node.

This isn't very efficient. Maintaining node counts could make it faster, but then you'd need to store a reference to the parent, too, in order to update the counts efficiently. If you go that route, you'll really want to find or implement a real tree class.

Glenn Maynard