+3  Q: 

MPI or Sockets

I'm working on a loosely coupled cluster for some data processing. The network code and processing code are in place, but we are evaluating different methodologies in our approach. Right now, as we should be, we are I/O bound on performance, and we're trying to decrease that bottleneck. Obviously, faster switches like InfiniBand would be awesome, but we can't afford the luxury of just throwing out what we have and getting new equipment.

My question is this: traditional, serious HPC applications on clusters are typically implemented with message passing rather than by sending over sockets directly. What are the performance benefits of this? Should we see a speedup if we switched from sockets to MPI?

A: 

MPI uses sockets underneath, so really the only difference should be the API your code interfaces with. You could fine-tune the protocol if you are using sockets directly, but that's about it. What exactly are you doing with the data?

Greg Rogers
A: 

MPI uses sockets, and if you know what you are doing you can probably get more bandwidth out of sockets because you need not send as much metadata.

But you have to know what you are doing, and it's likely to be more error prone. Essentially, you'd be replacing MPI with your own messaging protocol.
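
For a sense of what that entails, a hand-rolled protocol usually starts with something like the length-prefixed framing below (a minimal sketch, not a complete protocol; sockfd is assumed to be a connected TCP socket):

    #include <stdint.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <arpa/inet.h>

    /* Minimal sketch of a hand-rolled message protocol: a 4-byte
       length header followed by the payload. Everything MPI would
       add (tags, ranks, datatypes) is gone - and so is everything
       it would handle for you (partial sends, errors, endianness
       of the payload itself). */
    int send_message(int sockfd, const void *buf, uint32_t len)
    {
        uint32_t header = htonl(len);   /* length in network byte order */
        if (send(sockfd, &header, sizeof(header), 0) != sizeof(header))
            return -1;

        /* send() may write fewer than len bytes; a real protocol
           must loop until the whole payload is out. */
        size_t sent = 0;
        while (sent < len) {
            ssize_t n = send(sockfd, (const char *)buf + sent, len - sent, 0);
            if (n <= 0)
                return -1;
            sent += (size_t)n;
        }
        return 0;
    }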

Omar Kooheji
+7  A: 

I would recommend using MPI instead of rolling your own, unless you are very good at that sort of thing. Having written some distributed-computing-esque applications using my own protocols, I always find myself reproducing (and poorly reproducing) features found within MPI.

Performance-wise, I would not expect MPI to give you any tangible network speedups - it uses sockets just like you do. MPI will, however, provide you with much of the functionality you would need for managing many nodes, e.g. synchronisation between nodes.
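
For example, a node rendezvous that you would otherwise have to build by hand is a single call in MPI (a minimal sketch):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* ... each node does its share of the work ... */

        /* Block until every node reaches this point. With raw
           sockets you would have to implement this rendezvous
           (connection tracking, acks, timeouts) yourself. */
        MPI_Barrier(MPI_COMM_WORLD);

        if (rank == 0)
            printf("all nodes synchronised\n");

        MPI_Finalize();
        return 0;
    }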

freespace
A: 

For high-volume, low-overhead business messaging you might want to check out AMQP, which has several product implementations. The open source variant OpenAMQ supposedly runs the trading at JP Morgan, so it should be reliable, shouldn't it?

pklausner
+11  A: 

MPI MIGHT use sockets. But there are also MPI implementations for use with SANs (system area networks) that use direct distributed shared memory. That is, of course, only if you have the hardware for it. So MPI allows you to make use of such resources in the future. In that case you can gain massive performance improvements (in my experience with clusters back at university, you can reach gains of a few orders of magnitude). So if you are writing code that can be ported to higher-end clusters, using MPI is a very good idea.

Even discarding performance issues, using MPI can save you a lot of time, which you can use to improve performance of other parts of your system, or simply to save your sanity.

OldMan
+1  A: 

I have not used MPI, but I have used sockets quite a bit. There are a few things to consider for high-performance sockets. Are you sending many small packets, or large ones? If you are sending many small packets, consider turning off the Nagle algorithm for faster response:

    int flag = 1;  /* disable Nagle so small packets go out immediately */
    setsockopt(m_socket, IPPROTO_TCP, TCP_NODELAY, (char *)&flag, sizeof(flag));

Also, using signals can actually be much slower when trying to get a high volume of data through. Long ago I made a test program where the reader would wait for a signal and then read a packet - it got about 100 packets/sec. Then I just did blocking reads, and got 10,000 reads/sec.
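
The blocking-read variant amounts to a tight loop along these lines (a sketch; sockfd is assumed to be a connected socket):

    #include <sys/types.h>
    #include <sys/socket.h>

    /* Tight blocking-read loop: the kernel wakes the process as soon
       as data arrives, with none of the signal-delivery overhead. */
    void read_loop(int sockfd)
    {
        char packet[1024];
        for (;;) {
            ssize_t n = recv(sockfd, packet, sizeof(packet), 0);
            if (n <= 0)
                break;          /* peer closed the connection, or error */
            /* ... process n bytes ... */
        }
    }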

The point is: look at all these options, and actually test them out. Different conditions will make different techniques faster or slower. It's important not just to gather opinions, but to put them to the test. Steve Maguire talks about this in "Writing Solid Code". He uses many examples that are counter-intuitive, and tests them to find out what makes better/faster code.

Dan Hewett
+1  A: 

I'll have to agree with OldMan and freespace. Unless you know of a specific improvement to some useful metric (performance, maintainability, etc.) over MPI, why reinvent the wheel? MPI represents a large amount of shared knowledge regarding the problem you are trying to solve.

There are a huge number of issues you need to address that go beyond just sending data. Connection setup and maintenance will all become your responsibility. If MPI is the exact abstraction you need (and it sounds like it is), use it.

At the very least, using MPI and later refactoring it out in favor of your own system is a good approach, costing you only the installation of and dependency on MPI.

I especially like OldMan's point that MPI gives you much more than simple socket communication. You get a slew of parallel and distributed computing implementations behind a transparent abstraction.

Stephen Pellicer
+1  A: 

Message passing is a paradigm, not a technology. In the most general installation, MPI will use sockets to communicate. You could see a speedup by switching to MPI, but only insofar as you haven't optimized your socket communication.

How is your application I/O bound? Is it bound on transferring the data blocks to the work nodes, or is it bound because of communication during computation?

If the answer is "because of communication" then the problem is you are writing a tightly-coupled application and trying to run it on a cluster designed for loosely coupled tasks. The only way to gain performance will be to get better hardware (faster switches, infiniband, etc. )... maybe you could borrow time on someone else's HPC?

If the answer is "data block" transfers then consider assigning workers multiple data blocks (so they stay busy longer) & compress the data blocks before transfer. This is a strategy that can help in a loosely coupled application.

ceretullis
The I/O is sending job data before a run, and sending results afterwards.
Nicholas Mancuso
+2  A: 

Performance is not the only consideration in this case, even on high performance clusters. MPI offers a standard API, and is "portable." It is relatively trivial to switch an application between the different versions of MPI.

Most MPI implementations use sockets for TCP-based communication. Odds are good that any given MPI implementation will be better optimized and provide faster message passing than a home-grown application using sockets directly.

In addition, should you ever get a chance to run your code on a cluster that has InfiniBand, the MPI layer will abstract away any of those code changes. This is not a trivial advantage - coding an application to directly use an OFED (or other IB verbs) implementation is very difficult.

Most MPI implementations also include small test apps that can be used to verify the correctness of the networking setup independently of your application. This is a major advantage when it comes time to debug your application. The MPI standard includes the "PMPI" interfaces for profiling MPI calls. This interface also allows you to easily add checksums or other data verification to all the message passing routines.
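
As an example of the PMPI mechanism, a wrapper that intercepts MPI_Send to add logging or checksums looks roughly like this (a sketch):

    #include <mpi.h>
    #include <stdio.h>

    /* PMPI profiling interface: define MPI_Send yourself, then call
       the real implementation through its PMPI_Send alias. Linking
       this file into the application intercepts every send without
       touching the application's source. */
    int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
                 int dest, int tag, MPI_Comm comm)
    {
        /* add checksums, byte counts, timing, etc. here */
        fprintf(stderr, "MPI_Send: %d elements to rank %d (tag %d)\n",
                count, dest, tag);
        return PMPI_Send(buf, count, datatype, dest, tag, comm);
    }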

semiuseless
+1  A: 

MPI has the benefit that you can do collective communications. Doing broadcasts/reductions in O(log p) /* p is your number of processors */ instead of O(p) is a big advantage.
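
For example, a broadcast that would naively take p-1 point-to-point sends over raw sockets is a single tree-based call (a minimal sketch):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, value = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0)
            value = 42;     /* the root's data */

        /* Internally distributed along a tree: O(log p) steps,
           versus O(p) sequential sends over raw sockets. */
        MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

        printf("rank %d has value %d\n", rank, value);
        MPI_Finalize();
        return 0;
    }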

Chad Brewbaker