Can anyone point me to some resources on how to do this? I'm using networkx as my python library.

Thanks!

+2  A: 

This is based on Alex Martelli's answer, and it should work. It depends on the expression `source_node.children` yielding an iterable over all the children of `source_node`. It also relies on the `==` operator being able to compare two nodes to tell whether they are the same; using `is` may be a better choice. Apparently, in the library you're using, the syntax for getting an iterable over all the children is `graph[source_node]`, so you will need to adjust the code accordingly.

def allpaths(source_node, sink_node):
    if source_node == sink_node: # Handle trivial case
        return frozenset([(source_node,)])
    else:
        result = set()
        for new_source in source_node.children:
            paths = allpaths(new_source, sink_node)
            for path in paths:
                path = (source_node,) + path
                result.add(path)
        result = frozenset(result)
        return result

My main concern is that this does a depth-first search, so it will waste effort when there are several paths from the source to a node that is a descendant of the source (a grandchild, great-grandchild, etc.) but not necessarily a parent of the sink. If it memoized the answer for a given source and sink node, it would be possible to avoid the extra effort.

Here is an example of how that would work:

def allpaths(source_node, sink_node, memo_dict=None):
    if memo_dict is None:
        # Avoid a shared mutable default; pass your own dict to reuse results
        memo_dict = {}
    if source_node == sink_node: # Don't memoize trivial case
        return frozenset([(source_node,)])
    else:
        pair = (source_node, sink_node)
        if pair in memo_dict: # Is answer memoized already?
            return memo_dict[pair]
        else:
            result = set()
            for new_source in source_node.children:
                paths = allpaths(new_source, sink_node, memo_dict)
                for path in paths:
                    path = (source_node,) + path
                    result.add(path)
            result = frozenset(result)
            # Memoize answer
            memo_dict[pair] = result
            return result

This also allows you to save the memoization dictionary between invocations, so if you need to compute the answer for multiple source and sink nodes you can avoid a lot of extra effort.
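For what it's worth, here is a minimal sketch of the same memoized idea adapted to the `graph[source_node]` style of lookup. It is only an illustration under some assumptions: the graph is acyclic, indexing it by a node yields that node's children, and the plain dict `graph` and its node names are made up for the example (a networkx graph indexed the same way should behave similarly, but I haven't run it against one here).

```python
def allpaths(graph, source_node, sink_node, memo_dict=None):
    """Return a frozenset of tuples, one per path from source_node to sink_node.

    Assumes graph is acyclic and graph[node] iterates over node's children.
    """
    if memo_dict is None:
        memo_dict = {}
    if source_node == sink_node:  # trivial case, not memoized
        return frozenset([(source_node,)])
    pair = (source_node, sink_node)
    if pair not in memo_dict:
        result = set()
        for child in graph[source_node]:
            for path in allpaths(graph, child, sink_node, memo_dict):
                result.add((source_node,) + path)
        memo_dict[pair] = frozenset(result)
    return memo_dict[pair]


# Hypothetical example graph: a diamond a -> {b, c} -> d, followed by d -> e.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
memo = {}  # kept by the caller so later queries can reuse earlier answers
paths = allpaths(graph, "a", "e", memo)
for path in sorted(paths):
    print(path)
```

Because `d` is reachable through both `b` and `c`, the paths from `d` to `e` are computed once and then looked up in `memo` the second time around.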

Omnifarious
Cool. Can you explain to me how you did this?
Tyler
@Tyler, yeah, I'm trying to remember exactly how it went. :-)
Omnifarious
Awesome, this would be really useful.
Tyler
@Tyler - I realized that the algorithm I wrote only bears a passing resemblance to what you want. I'm going to think about this really hard for a bit, but in the meantime I'm deleting my answer.
Omnifarious
I'm getting `TypeError: can only concatenate tuple (not "instance") to tuple` with `path = (source_node,) + path`
Tyler
@Tyler - Fixed that. :-)
Omnifarious
A: 

I'm not sure if there are special optimizations available. Before looking for any of them, I'd do a simple recursive solution, something like the following. (Of networkx, it uses only the feature that indexing a graph by a node gives an iterable yielding its neighbor nodes; that's a dict in networkx's case, but I'm not making use of that in particular.)

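def allpaths(G, source_nodes, set_of_sink_nodes, path_prefix=()):
  # Assumes G is acyclic: a cycle reachable without passing through a sink
  # would cause unbounded recursion.
  set_of_result_paths = set()
  for n in source_nodes:
    next_from_n = []  # neighbors of n that are not sinks, to explore further
    for an in G[n]:
      if an in set_of_sink_nodes:
        # Reached a sink: record the complete path and stop extending it.
        set_of_result_paths.add(path_prefix + (n, an))
      else:
        next_from_n.append(an)
    if next_from_n:
      # Recurse from the non-sink neighbors, extending the prefix with n.
      set_of_result_paths.update(
          allpaths(G, next_from_n, set_of_sink_nodes, path_prefix + (n,)))
  return set_of_result_paths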

This should be provably correct (but I'm not going to do the proof because it's very late and I'm tired and fuzzy-headed;-) and usable to verify any further optimizations;-).

First optimization I'd try would be some kind of simple memoizing: if I've already computed the set of paths from some node N to any goal node (whatever the prefix leading to N was when I did that computation), I can stash that away in a dict under key N and avoid further recomputations if and when I get to N again by a different route;-).
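A rough sketch of that memoizing idea, for illustration only: instead of carrying a prefix down the recursion, it computes the path suffixes from each node N to the first sink reached and stashes them in a dict under key N. The names `paths_from` and `allpaths_memoized`, and the small dict-based graph, are made up for the example; it assumes an acyclic graph where `G[n]` iterates over the neighbors of `n`.

```python
def paths_from(G, n, set_of_sink_nodes, memo):
    """Return the frozenset of paths (tuples starting at n) ending at the first sink reached."""
    if n in memo:
        return memo[n]  # already computed when n was reached by another route
    result = set()
    for an in G[n]:
        if an in set_of_sink_nodes:
            result.add((n, an))
        else:
            for suffix in paths_from(G, an, set_of_sink_nodes, memo):
                result.add((n,) + suffix)
    memo[n] = frozenset(result)
    return memo[n]


def allpaths_memoized(G, source_nodes, set_of_sink_nodes):
    memo = {}  # node -> paths from that node to any sink
    set_of_result_paths = set()
    for n in source_nodes:
        set_of_result_paths.update(paths_from(G, n, set_of_sink_nodes, memo))
    return set_of_result_paths


# Hypothetical example: s -> {a, b} -> c -> {t1, t2}, with t1 and t2 as sinks.
G = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["t1", "t2"], "t1": [], "t2": []}
result = allpaths_memoized(G, ["s"], {"t1", "t2"})
for path in sorted(result):
    print(path)
```

Here `c` is reached via both `a` and `b`, so its suffixes are computed once and reused. The memo is only valid for one graph and one set of sink nodes, which is why it is created fresh inside `allpaths_memoized`.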

Alex Martelli
I would up-vote your post for its coolness, but I'd feel an obligation to verify its correctness first.
Eric W.
also, `set_of_results_paths.update(allpaths(` should be `set_of_result_paths.update(allpaths(`
Tyler
@Tyler, tx for spotting the typo, fixed.
Alex Martelli