I have a data set which is a large unweighted cyclic graph The cycles occur in loops of about 5-6 paths. It consists of about 8000 nodes and each node has from 1-6 (usually about 4-5) connections. I'm doing single pair shortest path calculations and have implemented the following code to do a breadth-first search.
from Queue import Queue
q = Queue()
parent = {}
fromNode = 'E1123'
toNode = 'A3455'
# path finding
q.put(fromNode)
parent[fromNode] = 'Root'
while not q.empty():
# get the next node and add its neighbours to queue
current = q.get()
for i in getNeighbours(current):
# note parent and only continue if not already visited
if i[0] not in parent:
parent[i[0]] = current
q.put(i[0])
# check if destination
if current == toNode:
print 'arrived at', toNode
break
The above code uses the Python 2.6 Queue module and getNeighbours() is simply a subroutine that makes a single MySQL call and returns the neighbours as a list of tuples e.g. (('foo',),('bar',)). The SQL call is quick.
The code works ok however testing to down to depths of about 7 layers takes about 20 seconds to run (2.5GHz Intel 4GB RAM OS X 10.6)
I'd welcome any comments about how to improve the run time of this code.