What is the best way to visit all the nodes of a linked tree (all nodes have references to parent and all children, root nodes have null as parent), so that no node is visited before any of its ancestors? Brownie points for non-recursive.
Build up a list of nodes at the root (level 0), iterate over each node in turn and look for direct children (whose parent node is the node we are currently looking from) (level 1), when finished with level 0 move on to iterating level 1, and so on until you have no remaining unvisited nodes.
Use a set of nodes. Put the root in the set to start. Then in a loop, pull a node out of the set, visit it, then put its children in the set. When the set is empty, you are done.
In pseudocode:
currentList = list( root )
nextList = list()
while currentList.count > 0:
foreach node in currentList:
nextList.add(node.children)
currentList = nextList
You're looking for a preorder traversal. I think you can do it non-recursively with a queue:. In pseudocode:
Create an empty queue, then push the root node.
while nonempty(q)
node = pop(q)
visit(node)
foreach child(node)
push(q,child)
Pseudo code:
NodesToVisit = some stack or some list
NodesToVisit.Push(RootNode)
While NodesToVisit.Length > 0
{
CurNode = NodesToVisit.Pop()
For each CHild C in CurNode
NodesToVIsit.Push(C)
Visit CurNode (i.e. do whatever it needs to have done
}
Edit: Recursive or not?
To be technically correct, and as pointed out by AndreyT and others in this post, this approach is a form of a recursive algorithm, whereby an explicitly managed stack is used in lieu of the CPU stack and where the recursion takes place at the level of the While loop. This said, it differs from a recursive implementation per se in a couple of subtle yet significant ways:
- Only the "variables" are pushed onto the stack; there is no "stack frame" and associated return address on the stack, the only "return address" is implicit to the while loop, and there is but one instance of it.
- The "stack" could be used as a list whereby the next "frame" could be taken anywhere in the list, without braking the logic in any way.
If you start at the root node, and only visit the parents/children of nodes you have already visited, there is no way to traverse the tree such that you visit a node before visiting its ancestors.
Any sort of traversal, depth first (recursive/stack based), breadth first (queue based), depth-limited, or just pulling them out of an unordered set, will work.
The "best" method depends on the tree. Breadth first would work well for a very tall tree with few branches. Depth first would work well for trees with many branches.
Since the nodes actually have pointers to their parents, there is also a constant-memory algorithm, but it is much slower.
If you have links to all children and to the parent as well, then non-recursive algorithm is rather trivial. Just forget that you have a tree. Think of it is a labirynth where each parent-child link is an ordinary bi-directional corridor from one junction to the other. All you need to do to walk the entire labyrinth is to turn into the next corridor on the left at each junction. (Alternatively, think of it as walking the labyrinth with your left hand always touching the wall on the left side). If you start from the root junction (and move in any direction), you'll walk the entire tree always visiting parents before children. Each "corridor" in this case will be travelled twice (in one direction and in the other), and each "junction" (node) will be visited as many times as many "corridors" join it.
I would disagree with breadth first search as space complexity is often the bane of that specific search algorithm. Possibly using the iterative deepening algorithm is a better alternative for this type of use, and it covers the same type of traversal as breadth first search. There are minor differences in dealing with the fringe from breadth-first search, it shouldn't be too hard to (pseudo) code out though.
Reference: http://en.wikipedia.org/wiki/Iterative%5Fdeepening
Here's a truly non-recursive approach: no stack, constant space. This Python code assumes that each node contains a list of children, and that the node objects do not define equality, so that the 'index' function is comparing identities:
def walkTree(root, visit_func):
cur = root
nextChildIndex = 0
while True:
visit_func(cur)
while nextChildIndex >= len(cur.children) and cur is not root:
nextChildIndex = cur.parent.children.index(cur) + 1
cur = cur.parent
if nextChildIndex >= len(cur.children):
break
cur = cur.children[nextChildIndex]
nextChildIndex = 0
I'm sure it could be polished up a bit, made more concise and easier to read, but that's the gist.