Suppose I have a single server with two process types: A (many processes, each with many threads) and B (a single process with n threads, where n is the number of CPUs), and I want to send a large volume of one-way messages from A to B. Is MPI a better fit for this than a custom implementation using:

  1. Unix Domain Sockets
  2. Windows Named Pipes
  3. Shared Memory

I was thinking of writing my own library based on 1 and 2, and I am also wondering whether 3 would be better, given that shared memory requires locking.

Process A provides external services, so B's resource usage and the message passing in general need to consume as few resources as possible, and A could send its messages in either a blocking or a non-blocking fashion. The resource usage of B and of the message passing needs to scale linearly with A's load.

I will eventually need broadcast capability between machines as well, probably for process B.

My parting question is: is MPI (Open MPI in particular) a good library for this, and does it use the most efficient kernel primitives on the various operating systems?

+1  A: 

MPI is pretty efficient; it was built for high-performance applications.
You can even use it very well for communication between CPUs on the same motherboard.
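To give a rough idea of the programming model, point-to-point messaging in MPI looks something like the sketch below (rank numbers, tag, and buffer size are just placeholders for your A/B setup):

    /* Minimal sketch: rank 0 plays the sender (your A), rank 1 the receiver (B).
     * Compile with mpicc and run with mpirun -np 2. */
    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        int rank;
        char buf[64];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {                      /* "A": send one one-way message */
            strcpy(buf, "hello from A");
            MPI_Send(buf, (int)strlen(buf) + 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {               /* "B": receive it */
            MPI_Recv(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("B got: %s\n", buf);
        }

        MPI_Finalize();
        return 0;
    }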

I'm not sure about broadcasts; the system I used years ago didn't support them, but I can't remember if that was a limitation of our interconnect or of MPICH.

P.S. We used MPICH because at the time it worked best on Windows and we needed that flexibility; I haven't used MPICH2 or OpenMPI.

Martin Beckett
MPICH - nice tip. There seems to be lots of native documentation for it as well -- i.e., I don't have to go read the mpi-forum spec documentation.
Hassan Syed
If you want to use MPICH, I strongly recommend looking at MPICH2: http://www.mcs.anl.gov/research/projects/mpi/mpich2/ The original MPICH is fairly out of date at this point. OpenMPI, in my experience, is nicer and performs better than MPICH, though, especially if you use its shared memory communication channels on a single machine.
Reed Copsey
+1  A: 

MPI will most likely work well for this, provided you are willing to rework your architecture to fit its message-passing infrastructure.

Theoretically, at least when hosted on a single server, you may be able to do something faster if you roll your own library, just because you won't have to transition into and out of the MPI message structures. That being said, MPI is very efficient (especially MPI-2, which Open MPI supports) and very, very robust. You'd have a difficult time getting the same flexibility, configurability, and robustness out of your own library.
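Since you mentioned that A could send either blocking or non-blocking, note that MPI gives you both; a non-blocking send looks roughly like the sketch below (destination rank, tag, and payload are arbitrary placeholders):

    /* Non-blocking variant: the sender posts the message, can do other work,
     * and completes the send later with MPI_Wait. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank;
        char msg[] = "one-way message";       /* placeholder payload */
        char buf[32];
        MPI_Request req;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            MPI_Isend(msg, (int)sizeof msg, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &req);
            /* ... overlap other work here while the message is in flight ... */
            MPI_Wait(&req, MPI_STATUS_IGNORE);   /* msg is reusable after this */
        } else if (rank == 1) {
            MPI_Recv(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }

        MPI_Finalize();
        return 0;
    }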

If you're going to be broadcasting between multiple machines, MPI is probably a better approach than trying to roll your own method.
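For the broadcast part, MPI already has a collective operation for it; a minimal sketch (root rank and payload are arbitrary here):

    /* Broadcast sketch: rank 0 sends the same payload to every rank,
     * whether the ranks live on one machine or several. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, payload = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0)
            payload = 42;                     /* root fills in the data */

        MPI_Bcast(&payload, 1, MPI_INT, 0, MPI_COMM_WORLD);
        printf("rank %d sees payload = %d\n", rank, payload);

        MPI_Finalize();
        return 0;
    }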

Also, MPI supports quite a few modes of communication. It supports shared memory for very fast single-machine communication, as well as TCP for inter-machine communication (plus some commercial, faster options).
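For what it's worth, Open MPI lets you choose transports at run time; on a single box, something like "mpirun --mca btl self,sm,tcp -np 4 ./your_app" should favor the shared-memory path between local ranks (the exact component names can vary between versions, so check ompi_info for what your install provides).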

Reed Copsey
Can shared memory still be better in an "x senders - one receiver" scenario? I assume that with a single named interprocess lock (IP-lock), contention will become a problem. This is my naive view of things, of course. I guess what I am asking is: do Open MPI or MPICH have clever data structures that spread the cost of IP-locking so that it scales better than named pipes and Unix domain sockets?
Hassan Syed
Shared memory can still be better in this situation. However, when you're doing things like this, there is no substitute for profiling. Unfortunately, highly scalable, parallelized apps like this tend to require profiling, often on the actual hardware being used, since the answer may change dramatically based on the specific hardware.
Reed Copsey