Can anyone point me to some resources on how to do this? I'm using networkx
as my python library.
Thanks!
This is based on Alex Martelli's answer and should work. It depends on the expression source_node.children yielding an iterable over all the children of source_node. It also relies on the == operator being able to compare two nodes to see whether they are the same; using the is operator may be a better choice. Apparently, in the library you're using, the syntax for getting an iterable over all the children is graph[source_node], so you will need to adjust the code accordingly.
def allpaths(source_node, sink_node):
    if source_node == sink_node:  # Handle trivial case
        return frozenset([(source_node,)])
    else:
        result = set()
        for new_source in source_node.children:
            # Recurse from each child; every path found gets source_node prepended
            paths = allpaths(new_source, sink_node)
            for path in paths:
                path = (source_node,) + path
                result.add(path)
        result = frozenset(result)
        return result
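As a rough sketch of the adjustment mentioned above, here is the same function taking the graph as an extra argument and using graph[source_node] to get the children. The graph parameter and example_graph are my own additions for illustration; a plain dict of lists stands in for the library's graph object, since both can be indexed by node. It also assumes the graph has no cycles, otherwise the recursion never terminates.

```python
def allpaths(graph, source_node, sink_node):
    """Return every path from source_node to sink_node as a frozenset of tuples.

    Assumes graph[node] yields the children of node and that the graph
    is acyclic (a cycle would make the recursion run forever).
    """
    if source_node == sink_node:  # Handle trivial case
        return frozenset([(source_node,)])
    result = set()
    for new_source in graph[source_node]:
        for path in allpaths(graph, new_source, sink_node):
            result.add((source_node,) + path)
    return frozenset(result)


# Hypothetical example graph: a plain dict standing in for the real graph object.
example_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
paths = allpaths(example_graph, "a", "d")
```

Here paths should contain the two tuples ("a", "b", "d") and ("a", "c", "d").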
My main concern is that this does a depth-first search, so it wastes effort when there are several paths from the source to a node that is a grandchild, great-grandchild, etc. of the source (and not necessarily a parent of the sink): everything below that node gets recomputed once per path leading to it. If it memoized the answer for a given source and sink node, it would be possible to avoid the extra effort.
Here is an example of how that would work:
def allpaths(source_node, sink_node, memo_dict=None):
    if memo_dict is None:
        # Avoid a shared mutable default; pass your own dict to keep the cache
        memo_dict = {}
    if source_node == sink_node:  # Don't memoize trivial case
        return frozenset([(source_node,)])
    else:
        pair = (source_node, sink_node)
        if pair in memo_dict:  # Is answer memoized already?
            return memo_dict[pair]
        else:
            result = set()
            for new_source in source_node.children:
                paths = allpaths(new_source, sink_node, memo_dict)
                for path in paths:
                    path = (source_node,) + path
                    result.add(path)
            result = frozenset(result)
            # Memoize answer
            memo_dict[pair] = result
            return result
This also allows you to save the memoization dictionary between invocations so if you need to compute the answer for multiple source and sink nodes you can avoid a lot of extra effort.
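To make the reuse concrete, here is a small sketch of sharing one dictionary across two calls. As assumptions of mine, the function takes the graph as a parameter and indexes it with graph[node], the dict example_graph stands in for a real graph object, and the graph is acyclic. The cached answers are only valid as long as the graph does not change between calls.

```python
def allpaths(graph, source_node, sink_node, memo_dict):
    """Memoized path enumeration; memo_dict maps (source, sink) to a frozenset of paths."""
    if source_node == sink_node:  # Don't memoize trivial case
        return frozenset([(source_node,)])
    pair = (source_node, sink_node)
    if pair not in memo_dict:
        result = set()
        for new_source in graph[source_node]:
            for path in allpaths(graph, new_source, sink_node, memo_dict):
                result.add((source_node,) + path)
        memo_dict[pair] = frozenset(result)
    return memo_dict[pair]


# Hypothetical example graph (acyclic); a plain dict stands in for the graph object.
example_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}

shared_memo = {}
from_a = allpaths(example_graph, "a", "e", shared_memo)
# The first call has already cached the answer for ("b", "e"),
# so this second call is a single dictionary lookup.
from_b = allpaths(example_graph, "b", "e", shared_memo)
```

If the graph is modified, throw the dictionary away and start with a fresh one.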
I'm not sure if there are special optimizations available. Before looking for any of them, I'd do a simple recursive solution, something like the following (the only networkx feature it uses is that indexing a graph by a node gives an iterable yielding its neighbor nodes; that's a dict in networkx's case, but I'm not relying on that in particular):
def allpaths(G, source_nodes, set_of_sink_nodes, path_prefix=()):
    set_of_result_paths = set()
    for n in source_nodes:
        next_from_n = []
        for an in G[n]:
            if an in set_of_sink_nodes:
                set_of_result_paths.add(path_prefix + (n, an))
            else:
                next_from_n.append(an)
        if next_from_n:
            set_of_result_paths.update(
                allpaths(G, next_from_n, set_of_sink_nodes, path_prefix + (n,)))
    return set_of_result_paths
This should be provably correct (but I'm not going to do the proof because it's very late and I'm tired and fuzzy-headed;-) and usable to verify any further optimizations;-).
First optimization I'd try would be some kind of simple memoizing: if I've already computed the set of paths from some node N to any goal node (whatever the prefix leading to N was when I did that computation), I can stash that away in a dict under key N and avoid further recomputations if and when I get to N again by a different route;-).
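One way that memoizing might look, as an untested-in-anger sketch rather than a finished implementation: compute, for each node N, the set of paths from N to any goal node without the prefix, store it in a dict under key N, and glue the prefix on afterwards. The names paths_to_sinks and allpaths_memoized are made up for this example, and it assumes the graph is acyclic and only needs G[n] to iterate over the neighbors of n.

```python
def paths_to_sinks(G, n, set_of_sink_nodes, memo):
    """Return the paths (tuples starting at n) from n to any sink node.

    memo maps a node to its already-computed frozenset of paths, so a node
    reached by several routes is only expanded once. Assumes G is acyclic.
    """
    if n in memo:
        return memo[n]
    result = set()
    for an in G[n]:
        if an in set_of_sink_nodes:
            result.add((n, an))  # path ends at the first sink reached
        else:
            for tail in paths_to_sinks(G, an, set_of_sink_nodes, memo):
                result.add((n,) + tail)
    memo[n] = frozenset(result)
    return memo[n]


def allpaths_memoized(G, source_nodes, set_of_sink_nodes):
    """Union of the memoized per-node results over all source nodes."""
    memo = {}
    all_paths = set()
    for n in source_nodes:
        all_paths.update(paths_to_sinks(G, n, set_of_sink_nodes, memo))
    return all_paths


# Hypothetical example graph; node "c" is reachable via both "a" and "b".
example_G = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["t"], "t": []}
found = allpaths_memoized(example_G, ["s"], {"t"})
```

In this example the paths from "c" are computed once and reused for both the route through "a" and the route through "b". The memo stores whole path tuples, so it trades memory for time.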