views:

177

answers:

3

I have a very simple binary tree structure, something like:

struct nmbintree_s {
    unsigned int size;
    int (*cmp)(const void *e1, const void *e2);
    void (*destructor)(void *data);
    nmbintree_node *root;
};

struct nmbintree_node_s {
    void *data;
    struct nmbintree_node_s *right;
    struct nmbintree_node_s *left;
};

Sometimes i need to extract a 'tree' from another and i need to get the size to the 'extracted tree' in order to update the size of the initial 'tree' .

I was thinking on two approaches:

1) Using a recursive function, something like:

unsigned int nmbintree_size(struct nmbintree_node* node) {
  if (node==NULL) {
    return(0);
  } 
  return( nmbintree_size(node->left) + nmbintree_size(node->right) + 1 );
} 

2) A preorder / inorder / postorder traversal done in an iterative way (using stack / queue) + counting the nodes.

What approach do you think is more 'memory failure proof' / performant ?

Any other suggestions / tips ?

NOTE: I am probably going to use this implementation in the future for small projects of mine. So I don't want to unexpectedly fail :).

+5  A: 

Just use a recursive function. It's simple to implement this way and there is no need to make it more compilcated.

If you were doing it "manually" you'd basically end up implementing the same thing, just that you wouldn't use the system call stack for temporary variables but your own stack. Usually this won't have any advantages outweighing the more complicated code.

If you later find out that a substantial time in your program is spend calculating the sizes of trees (which probably won't happen) you can still start to profile things and try how a manual implementation performs. But then it might also better to do algorithmic improvements like already keeping track of the changes in size during the extraction process.

sth
I cannot keep track of changes on extractions. :( The problem is I want to make things right from the beginning. Actually I want to learn 'to squeeze the code for performance', think of this as an exercise.
Andrei Ciobanu
Agreed, keep the algorithms simple. "KISS" principle.
Paul Nathan
+2  A: 

If your "very simple" binary tree isn't balanced, then the recursive option is scary, because of the unconstrained recursion depth. The iterative traversals have the same time problem, but at least the stack/queue is under your control, so you needn't crash. In fact, with flags and an extra pointer in each node and exclusive access, you can iterate over your tree without any stack/queue at all.

Another option is for each node to store the size of the sub-tree below it. This means that whenever you add or remove something, you have to track all the way up to the root updating all the sizes. So again if the tree isn't balanced that's a hefty operation.

If the tree is balanced, though, then it isn't very deep. All options are failure-proof, and performance is estimated by measurement :-) But based on your tree node struct, either it's not balanced or else you're playing silly games with flags in the least significant bits of pointers...

There might not be much point being very clever with this. For many practical uses of a binary tree (in particular if it's a binary search tree), you realise sooner rather than later that you want it to be balanced. So save your energy for when you reach that point :-)

Steve Jessop
"In fact, with flags in the node and exclusive access, you can iterate over your tree without any stack/queue at all." Can you please be more specific.
Andrei Ciobanu
Sure, put two flags in each node, one meaning "I have checked the left subtree", and one meaning "I have checked the right subtree". Initialise both to false when nodes are created. Starting at the root, for each node: If "left" is false, set to true and go left; else if "right" is false, set to true and go right; else both are true, so set them both false and go up to the parent. Sorry, just realised your node struct doesn't have a pointer-to-parent. You need one of those too.
Steve Jessop
Thank you for your suggestion.
Andrei Ciobanu
+1  A: 

How big is this tree, and how often do you need to know its size? As sth said, the recursive function is the simplest and probably the fastest.

If the tree is like 10^3 nodes, and you change it 10^3 times per second, then you could just keep an external count, which you decrement when you remove a node, and increment when you add one. Other than that, simple is best.

Personally, I don't like any solution that requires decorating the nodes with extra information like counts and "up" pointers (although sometimes I do it). Any extra data like that makes the structure denormalized, so changing it involves extra code and extra chances for errors.

Mike Dunlavey