views:

37

answers:

1

Now I have a serial solver in C++ for solving optimization problems and I am supposed to parallelize my solver with different parameters to see whether it can help improve the performance of the solver. Now I am not sure whther I should use TBB or MPI. From a TBB book I read, I feel TBB is more suitable for looping or fine-grained code. Since I do not have much experience with TBB, I feel it is difficult to divide my code to small parts in order to realize the parallelization. In addition, from the literature, I find many authors used MPI to parallel several solvers and make it cooperate. I guess maybe MPI fits my need more. Since I do not have much knowledge on either TBB or MPI. Anyone can tell me whether my feeling is right? Will MPI fit me better? If so, what material is good for start learning MPI. I have no experience with MPI and I use Windows system and c++. Thanks a lot.

+1  A: 

The basic thing you need to have in mind is to choose between shared-memory and distributed-memory. Shared-memory is when you have more than one process (normally more than one thread within a process) that can access a common memory. This can be quite fine-grained and it is normally simpler to adapt a single-threaded program to have several threads. You will need to design the program in a way that the threads work most of the time in separate parts of the memory (exploit data parallelism) and that the shared part is protected against concurrent accesses using locks. Distributed-memory means that you have different processes that might be executed in one or several distributed computers but these process have together a common goal and share data through message-passing (data communication). There is no common memory space and all the data one process need from another process will require communication. It is a more general approach but, because of communication requirements, it requires coarse grains. TBB is a library support for thread-based shared-memory parallelism while MPI is a library for distributed-memory parallelism (it has simple primitives for communication and also scripts for several processes in different nodes execution).

The most important thing is for you to identify the parallelisms within your solver and then choose the best solution. Do you have data parallelism (different thread/processes could be working in parallel in different chunks of data without the need of communication or sharing parts of this data)? Task parallelism (different threads/processes could be performing a different transformation to your data or a different step in the data processing in a pipeline or graph fashion)?

Edu