views:

87

answers:

4

I have a generator that yields nodes from a Directed Acyclic Graph (DAG), depth first:

def depth_first_search(self):
    yield self, 0 # root
    for child in self.get_child_nodes():
        for node, depth in child.depth_first_search():
            yield node, depth+1

I can iterate over the nodes like this

for node, depth in graph.depth_first_search():
    # do something

I would like to be able to tell the generator, from the for loop, to stop from going deeper in the graph if some condition is met.

I came up with the following solution, that use an external function.

def depth_first_search(self, stop_crit=lambda n,d: False):
    yield self, 0 # root
    for child in self.get_child_nodes():
        for node, depth in child.depth_first_search():
            yield node, depth+1
            if stop_crit(node, depth): break

This solution forces me to declare variables I need before stop_crit is defined so they can be accessed from it.

In Ruby, yield returns the last expression from the block so this could conveniently be used to tell the generator to continue or stop.

What is the best way to achieve this functionality in Python?

+2  A: 

Checkout PEP 342

bassfriend
+1  A: 

Coroutines (bassfriend mentioned them) are tricky for the uninitiated, so here is one. I added some test code so you can see how this really works.

class Node(object):
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

    # the producing coroutine, it sends data to the consumer
    def depth_first_search(self, consumer, depth=0):
        """ `consumer` is a started coroutine that yields True to continue a branch """
        if consumer.send((self, depth)): # continue this branch?
            for child in self.get_child_nodes():
                child.depth_first_search(consumer, depth+1)

    def get_child_nodes(self):
        for node in (self.left, self.right):
            if node is not None:
                yield node

    def __repr__(self):
        return "Node(val=%d)" % self.val

def coroutine(func):
    """ decorator that declares `func` as a coroutine and starts it """
    def starter(*args, **kwargs):
        co = func(*args, **kwargs)
        next(co) # corotines need to be started/advanced to the first yield
        return co
    return starter

# the consumer takes data and yields if it wants to continue
@coroutine
def consumer( continue_branch=lambda n,d:True ):
    node, depth = (yield True) # first node
    while True:
        print node, depth # do stuff
        node, depth = (yield continue_branch(node, depth))


# testing
tree = Node(5, Node(2, Node(3), Node(4)), Node(6, Node(7), Node(8))) # 
cons = consumer()
tree.depth_first_search(cons)# yields all

print
stopper = consumer(lambda n,d: n.val > 2) # skips the children of Node 2
tree.depth_first_search(stopper)

The trick is that if you keep the roles of your functions, where depth_first_search yields nodes, you end up with a horrible mess ... instead, the nodes are produced and sent to the consumer.

Python's support for coroutines is a bit awkward (@coroutine to the rescue). There is a pretty nice tutorial for Python and plenty of resources for languages that depend on coroutines, such as Lua. In any way, it's a very cool concept worth exploring :-)

THC4k
+1  A: 

Normally you don't tell your iterable to check conditions, you do that in the body of your loop:

for node, depth in graph.depth_first_search():
    if node meets condition:
        # do something with node 
        break
# do something with node, its still referencing what you breaked on

This code has the advantage of not surprising nor confusing anyone.

Will
+1  A: 

Naive solution:

def depth_first_search(self):
    yield self, 0 # root
    for child in self.get_child_nodes():
        for node, depth in child.depth_first_search():
            if(yield node, depth+1):
                yield None # for .send
                return

You can call it normally still, but you have to save the iterable to abort:

it = graph.depth_first_search()
for node, depth in it: #this is why there should be pronouns for loop iterables
    stuff(node,depth)
    if quit: it.send(1) 
    # it.next() should raise StopIteration on the next for iteration

I think this works right now.

David X
This is actually one common way to mess it up: `it.send(1)` sends the 1, as intended, but also continues `it` to the next item - which you would just ignore. Basically you'd have to duplicate the whole loop again in the quit block, again and again. In fact you can never mix sending and iterating over the same thing (you'd treat the thing as a generator and a coroutine at the same time, which makes no sense at all).
THC4k
@THC4k: thanks, missed the `StopIteration`, fixing...
David X
Nicely fixed, but you want `break` instead of `return` in `depth_first_search`
THC4k
If we want to abort the generator completly (which was what I was coding for) we need `return` since there are 2 `for` loops.
David X
Thank you very much for your answers, David X and THC4k. Coroutines are indeed what I needed. Since it.send(1) is waiting for an answer, an additional yield is indeed needed... This solution does a it.next() (implicit in the for loop) and a it.send(1) when the condition is met. It would be nice to use it.send(1) or it.send(0) instead but unfortunately you loose the ability to use a for loop and are forced to use a while loop. (Note: a break is indeed needed instead of a return if you want to explore the remaining nodes)
Mathieu
@Mathieu, oh, so thats what the `break` is for; makes sense now, thanks. (Also `it.next()` is equivalent to `it.send(None)` if memory serves.)
David X