I'm trying to figure out the best practice for choosing a data distribution model when designing a parallel algorithm. What are the pros and cons of block distribution versus cyclic distribution of data in memory? Any help would be appreciated.
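
For concreteness, here is a minimal C sketch of the two index-to-owner mappings in question; N, P, the variable names, and the assumption that P divides N evenly are all illustrative, not part of the question:

    #include <stdio.h>

    #define N 16   /* total elements (illustrative) */
    #define P 4    /* number of processes (illustrative) */

    int main(void) {
        for (int i = 0; i < N; i++) {
            /* Block: each process owns one contiguous chunk (assumes P divides N). */
            int block_owner  = i / (N / P);
            /* Cyclic: elements are dealt out round-robin. */
            int cyclic_owner = i % P;
            printf("element %2d -> block: P%d  cyclic: P%d\n",
                   i, block_owner, cyclic_owner);
        }
        return 0;
    }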

+1  A: 

Block distribution in shared memory is best suited for algorithms that break their work up into chunks that require little (or no) synchronisation while the algorithm runs.

Parallel algorithm design should focus on minimising synchronisation bottlenecks between processes. An example would be a Gauss-Seidel relaxation method on a 2-D grid, where the grid is split into strips (one per processor) and no synchronisation is performed between strips. Each processor calculates an independent convergence value and terminates when that figure is reached.
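
A rough OpenMP sketch of that strip scheme follows; the grid size, tolerance, and boundary setup are made-up details, and the deliberate lack of synchronisation at strip boundaries (a data race) mirrors the description above rather than recommended practice:

    #include <math.h>
    #include <omp.h>

    #define NX  256
    #define NY  256
    #define TOL 1e-6

    static double grid[NY][NX];   /* boundary values assumed set by the caller */

    void relax_strips(void) {
        #pragma omp parallel
        {
            int t  = omp_get_thread_num();
            int nt = omp_get_num_threads();
            /* Contiguous strip of interior rows for this thread;
               the last thread absorbs any remainder. */
            int rows = (NY - 2) / nt;
            int lo = 1 + t * rows;
            int hi = (t == nt - 1) ? NY - 1 : lo + rows;

            double diff;
            do {
                diff = 0.0;
                /* Gauss-Seidel sweep inside the strip; neighbouring strips'
                   boundary rows are read unsynchronised, so between strips
                   this degrades to a Jacobi-style update (see Jed's comment
                   below). */
                for (int j = lo; j < hi; j++) {
                    for (int i = 1; i < NX - 1; i++) {
                        double old = grid[j][i];
                        grid[j][i] = 0.25 * (grid[j-1][i] + grid[j+1][i] +
                                             grid[j][i-1] + grid[j][i+1]);
                        diff = fmax(diff, fabs(grid[j][i] - old));
                    }
                }
            } while (diff > TOL);  /* each thread tests its own convergence */
        }
    }

Compile with the usual -fopenmp (and -lm for fmax/fabs).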

You should also take locality of reference into account; it can have a marked effect on both parallel and sequential algorithms.
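
As a small, self-contained illustration (the array size is arbitrary): both functions below do identical arithmetic, but the row-major traversal walks memory sequentially while the column-major one jumps N doubles per access, which typically makes it several times slower on cached hardware:

    #include <stdio.h>

    #define N 1024
    static double a[N][N];

    /* Row-major traversal: consecutive accesses are adjacent in memory. */
    static double sum_row_major(void) {
        double s = 0.0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                s += a[i][j];
        return s;
    }

    /* Column-major traversal: each access strides N doubles, defeating the cache. */
    static double sum_col_major(void) {
        double s = 0.0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                s += a[i][j];
        return s;
    }

    int main(void) {
        printf("%f %f\n", sum_row_major(), sum_col_major());
        return 0;
    }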

Mitch Wheat
"Block distribution in shared memory is best suited for algorithms that break their work up into chunks that require little (or no) synchronisation while the algorithm runs." -- This statement is not necessary a true, since I can find cyclic decomposition which will give me a set of independent task, while in block I will be unable to do it.
Artem Barger
@Artem Barger: you are correct. I should have mentioned the underlying topology, e.g. grid, torus, etc.
Mitch Wheat
Gauss-Seidel is an almost pathologically bad algorithm to "do" in parallel. What you are suggesting is line-Jacobi with Gauss-Seidel within each strip, an algorithm that converges extremely slowly (compared to proper Gauss-Seidel, which is already terrible).
Jed
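
One concrete trade-off behind the comment thread above: cyclic distribution is the usual answer when per-element work is non-uniform, since block distribution then concentrates the expensive elements on a few processes. A minimal sketch, assuming a made-up triangular workload (N, P, and work() are illustrative):

    #include <stdio.h>

    #define N 16
    #define P 4

    /* Work per element grows with its index, as in a triangular loop nest. */
    static int work(int i) { return i + 1; }

    int main(void) {
        int block_load[P] = {0}, cyclic_load[P] = {0};
        for (int i = 0; i < N; i++) {
            block_load[i / (N / P)] += work(i);  /* contiguous chunks */
            cyclic_load[i % P]      += work(i);  /* round-robin */
        }
        for (int p = 0; p < P; p++)
            printf("P%d: block=%3d  cyclic=%3d\n", p, block_load[p], cyclic_load[p]);
        return 0;
    }

Here the last block owner does 58 units of work against the first owner's 10, while the cyclic loads stay within 28-40.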
+1  A: 

Quinn's "Parallel Programming in C with MPI and OpenMP" offers lots of examples of different ways to distribute data in parallel programming. There's even a decision tree that helps you figure out which approach is most convenient, depending on your needs.

nairdaen