views:

79

answers:

1

I'd like to estimate the number of leaves in a large tree structure for which I can't visit every node exhaustively. Is this algorithm appropriate? Does it have a name? Also, please pedant if I am using any terms improperly.

sum_trials = 0
num_trials = 0
WHILE time_is_not_up
    bits = 0
    ptr = tree.root
    WHILE count(ptr.children) > 0
         bits += log2(count(ptr.children))
         ptr = ptr.children[rand()%count(ptr.children)]
    sum_trials += bits
    num_trials++
estimated_tree_size = 2^(sum_trials/num_trials)
+3  A: 

This might be possible if you can make some safe assumptions about your tree (such as: is it balanced?) and its usage (is there a safe assumption about how many leaves will be children of the same node?). Better yet would be if you maintain a running tally (counter) everytime you add/remove a leaf node. Then you just access your counter variable in a single operation.

FrustratedWithFormsDesigner
I cannot assume the tree is balanced. But I can put an upper bound on depth. Does this help? This tree will be huge, for example representing all moves in a perfect information game.
Full Decent
Hm that might give you a worst-case estimate for number of leaves, but to get closer, you need to know more. Is there a way to know/estimate how many branches actually reach maximum depth?
FrustratedWithFormsDesigner
I don't know how to estimate how many branches reach maximum depth, but those are actually the only ones I'm interested in counting. I'm going to continue this topic with additional questions.
Full Decent