views:

321

answers:

4

I have received the task to make a social graph, where, with one user in the center, it shows the connections he has. But before we can reach that, our focus is how we can determine the shortest path between 2 users. I found some algorithm to do it, but it seems that it takes a lot of time, and because it's about social links, we are looking for one that is the fastest because we will need to run it on a regular basis to keep up with the updates in friends.

So, do you know which would be the fastest way to determine the shortest path between two users?

PS: If you know an example in PHP & MySQL, I will give you a virtual beer (or coke). :D

+1  A: 

It seems to me that if you're going to draw the whole graph anyway, the easiest would be to keep track of each person's path, as they are added to the graph. So for the person's friends, the path is just "main person -> friend". Then, as you add each of THEIR friends to the graph, store the path "main person -> friend 1 -> friend 2" etc.

If the picture in my mind is accurate, that seems like the easiest way to do it, but I might be misunderstanding somewhat.

MatrixFrog
+7  A: 

Dijkstra's Algorithm finds the shortest path between two nodes on a graph.

Jason Punyon
+1  A: 

what you want is an all-pairs shortest path algorithm; if you have to generate the pairs globally for the graph it is quicker than running a shortest path algorithm for every pair. keeping this updated is another problem - note that you'll have to do it every time you add a connection to the graph, not just every time you add a person. if this is for a production site, it might be worth maintaining the graph generation as an offline task written in a language faster than php, and writing its results back to the db. you'll probably find existing c++ implementations out there.

Martin DeMello
A: 

Dijsktra's algorithm works well on weighted graphs. In a social graph all edges have the same weight. So Dijkstra's algorithm becomes BFS. However on a dense graph the list of nodes examined at each level will be huge. One optimization you can do is, to start the search from both ends (A and B).