I have a huge graph that I would like to process using many machines.

I would like to compute whether the graph's diameter is greater than 50.

How would I split the data, and how would I write a parallel algorithm that can calculate it? (The return value is a boolean.)

The graph diameter is the greatest shortest-path distance between any pair of vertices.

+1  A: 

The standard way to figure this out would be an all-pairs shortest path algorithm -- the Floyd-Warshall algorithm is a good place to start. Another option, using Hadoop, is located here.
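To make the suggestion concrete, here is a minimal single-machine sketch of Floyd-Warshall answering the boolean question from the post. The `graph` input format (a dict-of-dicts adjacency map with edge weights) and the function name are illustrative assumptions, not from the answer:

```python
# Minimal Floyd-Warshall sketch for "is the diameter > limit?"
# Assumes `graph` is a dict mapping each vertex to a dict of
# {neighbor: edge_weight} -- an illustrative format, not from the post.

INF = float("inf")

def diameter_exceeds(graph, limit=50):
    nodes = list(graph)
    # dist[u][v] starts as the direct edge weight, INF if no edge
    dist = {u: {v: INF for v in nodes} for u in nodes}
    for u in nodes:
        dist[u][u] = 0
        for v, w in graph[u].items():
            dist[u][v] = w
    # Standard triple loop: relax every pair through intermediate k
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    # Diameter = greatest finite shortest-path distance
    diam = max(d for row in dist.values() for d in row.values() if d < INF)
    return diam > limit
```

This is O(n^3) on one machine; the rest of the thread discusses how to distribute the work.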

Joel
How would you parallelize the Floyd-Warshall algorithm?
MrOhad
@MrOhad You can find the source for a parallel Floyd-Warshall here: http://pcl.cs.ucla.edu/projects/maisie/tutorial/programming/samples/apsp.m and an explanation here: http://pcl.cs.ucla.edu/projects/maisie/tutorial/programming/
belisarius
Actually, he doesn't want a parallel algorithm, he wants a distributed one. Hence the Hadoop link.
Joel
I understood the Floyd-Warshall algorithm, which is an awesome one! :-) The question is, how do I split it across many computers? It's a DP algorithm; all it does is fill a matrix row by row (n times)... I would love to hear your opinion. Moreover, this is a nice way to find the shortest paths between all pairs of vertices, but I am actually looking for the max() over that list -- is it really the best way to do so? Thanks!
MrOhad
Each row can be computed on a separate computer, if I remember correctly, with some message passing between them. Sadly, since you are interested in weighted graphs, there isn't really much that can make it faster than the all-pairs computation -- at least not that I've seen in the literature. What you could do is bail out after you hit 50, though, which should put an upper limit on the time, at least.
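The row-partitioned scheme described here can be sketched as follows, simulated sequentially for clarity. In a real cluster each block of rows would live on a different machine, and at step k the owner of row k would broadcast it to the others; the `owner` assignment and function name below are illustrative assumptions:

```python
# Sketch of row-partitioned Floyd-Warshall, simulated on one machine.
# In a real deployment, each worker holds a block of rows of `dist`,
# and at step k the machine owning row k broadcasts it (message passing).

INF = float("inf")

def partitioned_floyd_warshall(dist, num_workers):
    n = len(dist)
    # Block-partition rows across workers: owner[i] is the worker
    # responsible for updating row i (illustrative bookkeeping).
    owner = [i * num_workers // n for i in range(n)]
    for k in range(n):
        row_k = dist[k]  # in a cluster: broadcast from machine owner[k]
        for i in range(n):
            # in a cluster: only worker owner[i] performs this update,
            # touching nothing but its own rows
            for j in range(n):
                if dist[i][k] + row_k[j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + row_k[j]
    return dist
```

Note that each step k only needs row k plus a worker's own rows, which is what keeps the per-step communication down to a single row broadcast.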
Joel
@Joel, how would you solve it without weights?
MrOhad
The other answer addresses that.
Joel
http://www.mcs.anl.gov/~itf/dbpp/text/node35.html here are a few ways to parallelize F-W
MrOhad