tags:

views:

106

answers:

3

I'm writing a program in C++ that uses genetic techniques to optimize an expression tree.

I'm trying to write a class Tree which has as a data member Node root. The node constructor generates a random tree of nodes with +,-,*,/ as nodes and the integers as leaves.

I've been working on this awhile, and I'm not yet clear on the best structure. Because I need to access any node in the tree in order to mutate or crossbreed the tree, I need to keep a dicionary of the Nodes. An array would do, but it seems that vector is the recommended container.

vector<Node> dict;

So the Tree class would contain a vector dict with all the nodes of the tree (or pointers to same), the root node of the tree, and a variable to hold a fitness measure for the tree.

class Tree
    {
    public:
        typedef vector<Node>dict;
        dict v;
        Node *root;
        float fitness;

        Tree(void);
        ~Tree();
    };

class Node
    {
    public:
        char *cargo;
        Node *parent;
        Node *left;
        Node *right;
        bool entry;
        dict v;
        Node(bool entry, int a_depth, dict v, Node *pparent = 0);
                };

Tree::Tree()
    {
     Node root(true,  tree_depth, v);
    };

There seems to be no good place to put typedef vector<Node>dict;, because if it goes in the definition of Tree, it doesn't know about Node, and will give an error saying so. I havn't been able to find a place to typedef it.

But I'm not even sure if a vector is the best container. The Nodes just need to be indexed sequentally. The container would need to grow as there could be 200 to 500 Nodes.

A: 

You may implement a list over nodes. Then, each node will have two additional pointers inside:

class Node{
...
Node* sequentialPrevious;
Node* sequentialNext;
...
}

And so will the tree:

class Tree{
...
Node* sequentialFirst;
Node* sequentialLast;
...
}

Than you will be albe to move bidirectionally over nodes just by jumping to sequentialFirst or sequentialLast and then iteratively to sequentialNext or sequentialPrevious. Of course, Node constructor and destructor must be properly implemented to keep those pointers up to date.

mbq
+1  A: 

I don't think the Node or the Tree are the first classes to write.

I'd start with Expression. In your case you need at least a BinaryExpression, as well as an expression with no subnodes (constants or variables). Each Binary expression should contain auto_ptr<Expression> lhs and auto_ptr<Expression> rhs.

You could then easily write a function to enumerate through the expression tree's members. If performance turns out to be relevant, you can cache the list of expressions in the tree, and invalidate it manually when you change the expression. Anything more advanced is likely to be slower and more error prone.

I don't see why an expression needs to know it's parent expression. It only makes life harder when you start editing expressions.

jdv
@jdv a node needs to know it's parent for crossbreeding when subtrees from two randomly selected nodes are swapped.
Peter Stewart
@Peter, what do you do with the node's parent when you're doing crossbreeding?
Lirik
@Lirik I got a great reply from alex martelli in this post when I was first figuring out the trees in python: http://stackoverflow.com/questions/1386493/python-manipulating-sub-treesThe node in question would be the root of one of two subtrees. So the parent would be changed to point to the other subtree and the other subtree's parent would be changed to point to the node in question. yiggidda.
Peter Stewart
@Peter, it seems a bit complicated... a simpler solution would be to select one node X, then select another node Y, then replace Y's left or right child with X. There is no need to know about a parent or maintain links back to the parent (it's an unnecessary complication and more code which needs to be executed).
Lirik
What a good idea. Selecting the parents of the subtree nodes, thanks.
Peter Stewart
+1  A: 

I think a standard Binary Tree should do... here is an example of a (binary) expression tree node:

const int NUMBER = 0,    // Values representing two kinds of nodes.
          OPERATOR = 1;

struct ExpNode {  // A node in an expression tree.

    int kind;        // Which type of node is this?
                     //   (Value is NUMBER or OPERATOR.)
    double number;   // The value in a node of type NUMBER.
    char op;         // The operator in a node of type OPERATOR.
    ExpNode *left;   // Pointers to subtrees,
    ExpNode *right;  //     in a node of type OPERATOR.

    ExpNode( double val ) {
          // Constructor for making a node of type NUMBER.
       kind = NUMBER;
       number = val;
    }

    ExpNode( char op, ExpNode *left, ExpNode *right ) {
          // Constructor for making a node of type OPERATOR.
       kind = OPERATOR;
       this->op = op;
       this->left = left;
       this->right = right;
    }

}; // end ExpNode

So when you're doing crossover or mutation and you want to select a random node you just do the following:

  1. Count the number of nodes in the tree (only need to do this ones in the constructor).
  2. Select a random index from 0 to the size of the tree.
  3. Visit each node and subtract 1 from the random index until you reach zero.
  4. Return the node when the index is 0.

In this case you don't need to know anything about the parent of the node. So mating/mutation should look like this:

select nodeX
select nodeY
    if( Rand(0,1) == 1 )
        nodeY->left = nodeX;
    else
        nodeY->right = nodeX;

And that should be it...

Lirik