ansaurus

Question

Answer 1

+5 A:

Unless your tree is really large or you have really high (real) requirements for speed, I would choose the recursive method. Easier to read, easier to code.

rebra 2008-11-26 08:32:05

I am actually traversing a scene graph of a 3D application, so the speed is an issue, but also these type of generators makes it easier to get the code in one place for different kind of objects, the code gets messier when you need filtering.

M. Utku ALTINKAYA 2008-11-26 08:41:12

Answer 2

+3 A:

If you have a lot of RAM and the tree doesn't change often, you can cache the result of the call:

def children(self):
    if self._children_cache is not None:
        return self._children_cache
    # Put your code into collectChildren()
    self._children_cache = self.collectChildren()
    return self._children_cache

Whenever the tree changes, set the cache to None. In this case, using recursive calls might be more effective since the results will accumulate faster.

Aaron Digulla 2008-11-26 08:33:07

That is a good idea, also the generator can be changed to use the cache. if e has a children_cache than you iterate over the nodes instad of pushing it into stack. :)

M. Utku ALTINKAYA 2008-11-26 08:46:43

Answer 3

+4 A:

I'm not sure if you can reduce the overhead much on a full in-order traversal of a tree, if you use recursion the call stack will grow some, otherwise you must manually use a stack to push references of the children while visiting each node. Which way is fastest and uses less memory, depends on the expensiveness of the call stack vs. a normal stack. (I would guess the callstack is faster since it should be optimized for its use, and recursion is much easier to implement)

If you don't care about the order you visit the nodes, some implementations of trees is actually stored in a dynamic array or linked list or stack wich you can traverse linearly if you don't care about the order it's traversed.

But why is it important to have a fast traversal anyway? Trees are good for searching, arrays/linked lists is good for full traversal. If you often need full in-order traversal but few searches and insertions/deletions, an ordered linked list might be best, if searching is what you do most you use a tree. If the data is really massive, so that memory overhead may render recursion impossible, you should use a database.

Stein G. Strindhaug 2008-11-26 09:01:37

Answer 4

+1 A:

I've written iterative tree-traversal code in the past: it's very ugly, and not fast, unless you know exactly how many children not only each subtree will have, but how many levels there are.

warren 2008-11-26 09:03:35

Answer 5

A:

I don't know too much about Python internals of function calls, but I really can't imagine that your code snippet is faster than recursively traversing the tree.

The call stack (used for function calls, including recursive ones) is typically very fast. Going to the next object will only cost you a single function call. But in your snippet - where you use a stack object, going to the next object will cost you a stack.append (possibly allocating memory on heap), a stack.push (possibly freeing memory from heap), and a yield.

The main problem with recursive calls is that you might blow the stack if your tree gets too deep. This isn't likely to happen.

cadabra 2008-11-26 09:06:34

I have posted some simple test data.

M. Utku ALTINKAYA 2008-11-26 09:14:22

Function calls are actually pretty expensive. Each call must allocate a frame object (also on the heap), populate the parameters and then invoke it. Compare doing that N times for N children vs a single append of N items and you'll see why the stack method is faster.

Brian 2008-11-26 10:12:06

(continued) Yields are also much quicker than function calls, as they don't create new frames - the existing frame object is reused.

Brian 2008-11-26 10:13:08

Answer 6

+4 A:

Recursive function calls are not incredibly inefficient, that is an old programming myth. (If they're badly implemented, they may incur a larger overhead than necessary, but calling them "incredibly inefficient" is plain wrong.)

Remember: don't optimize prematurely, and never optimize without benchmarking first.

JesperE 2008-11-26 10:31:15

Answer 7

+5 A:

I can't think of any big algorithmic improvements, but a simple microoptimisation you can make is to bind frequently called methods (such as stack.append / stack.pop) to locals (this saves a dictionary lookup)

def children(self):
    stack = [self.entities]
    push = stack.append
    pop = stack.pop
    while stack: 
        for e in pop():
            yield e
            if e.entities:
                push(e.entities)

This gives a small (~15%) speedup by my tests (using 100 traversals of an 8-deep tree with 4 children at each node gives me the below timings:)

children     :  5.53942348004
children_bind:  4.77636131253

Not huge, but worth doing if speed is important.

Brian 2008-11-26 10:33:04

I have to admid my former in application profiling results were sharper, maybe becouse of the tree structure or the methods called. But it is not insignificant either, also the ability to cache can lead to pretty interesting optimizations.

M. Utku ALTINKAYA 2008-11-26 10:42:59

Answer 8

A:

Here's a pair of small corrections.

def children(self):
    stack = [self.entities]
    for e in stack:
        yield e
        if e.entities:
            stack.extend(e.entities)

I actually think the generator, using append, isn't visiting all the nodes. I think you mean to extend the stack with all entities, not append a simple list of entities to the stack.

Also, when the for loop terminates, the while loop in your original example will also terminate because there's no change to the empty stack after the for loop.

S.Lott 2008-11-26 11:08:40

Well, I did used a list of lists to prevent updating the list object I am interating over, I thought it is undefined to do that ?

M. Utku ALTINKAYA 2008-11-26 15:31:36

You might be thinking of Java, where messing with the structure confuses the iterator.

S.Lott 2008-11-26 16:24:07

ok, I have tested, it works, but it seems a little slower, becouse of the extend I suppose. recursive: 1.96200013161generator: 1.68400001526generator_new: 1.74000000954

M. Utku ALTINKAYA 2008-11-26 18:05:49

At this point, it seems like you should close this question and start again with the corrected algorithms and current timing. These times are more what I'd expect -- relatively minor differences.

S.Lott 2008-11-26 18:41:38

Hello, as I mentioned in my comments, It changes bu the tree structure, I have used low max_depth in these tests, so difference is smaller. Besides, the topic also stands for caching, which is not possible using recursive way, we also did not discuss searching.

M. Utku ALTINKAYA 2008-11-27 03:37:20

While I do not want to get into other sub topics, I think this question is still helpful to the others.

M. Utku ALTINKAYA 2008-11-27 03:38:21

I'm not sure it's useful because your initial code sample (with append) and your initial times (based on append) are actually wrong. That algorithm doesn't really work correctly. I think posting a wrong algorithm like that is confusing and maybe a bad policy.

S.Lott 2008-11-27 15:38:58

ansaurus

tags:

views:

answers:

How do you iterate over a tree ?

related questions