tags:

views:

73

answers:

1

I recenty asked a question about parallel programming algorithms which was closed quite fast due to my bad ability to communicate my intent:

http://stackoverflow.com/questions/2407631/what-is-the-most-useful-parallel-programming-algorithm-closed

I had also recently asked another question, specifically:

http://stackoverflow.com/questions/2407493/is-mapreduce-such-a-generalisation-of-another-programming-principle/2407570#2407570

The other question was specifically about map reduce and to see if mapreduce was a more specific version of some other concept in parallel programming. This question (about a useful parallel programming algorithm) is more about the whole series of algorithms for parallel programming. You will have to excuse me though as I am quite new to parallel programming, so maybe MapReduce or something that is a more general form of mapreduce is the "only" parallel programming construct which is available, in which case I apologise for my ignorance.

A: 

There's probably two "main" parallel programming constructs.

Map/Reduce is one. At a high, ultra-generic level, it's just parallel divide-and-conquer. Send out the individual bits to parallel handlers, and combine the results when they arrive.

The other main parallel programming construct is the pipeline... pieces of work go through a series of stages, each of which can be run in a parallel thread.

I think that just about any parallelization algorithm is going to boil down to one of those two. I could be wrong, of course.

kyoryu
Isn't mapreduce just a the same as the pipeline algorithm, using two pipes, one to send the computation and one to get the answer?
Zubair
@Zubair: Not really... map reduce sends things to multiple 'pipes' in parallel, while a pipeline will send everything down the same sequential series of pipes. So, map/reduce looks like A->(B,C,D,E)->F (B,C,D,E get the request in parallel, and their answer goes straight to F), while a pipeline looks like A->B->C->D->E->F - each element sends its data to the next element, so everything that gets to F has gone through A through E already. IOW, in a pipeline, each step takes the output from the previous step.
kyoryu
@kyoryu. Thanks for the clarification. If everything in a pipeline has to be processed in a serial manner then where is the parallelism?
Zubair
@Zubair: Let's say that the job is broken into chunks. First, element "A" handles the first chunk. When it is done, it passes the modified chunk to element "B", which handles it on a different thread/in a different process. While element "B" is handling the first chunk, "A" starts on the second. Once they're done, they each pass down to the next element... so "C" is handling the first chunk, "B" the second, and "A" the third. So even though *each chunk* is handled sequentially, the *steps* in the pipeline are done in parallel.
kyoryu