I understand the differences between a multithreaded program and a program relying on inter-machine communication. My problem is that I have a nice multithreaded program written in 'C' that works and runs really well on an 8-core machine. There is now opportunity to port this program to a cluster to gain access to more cores. Is it worth the effort to rip out the pthread stuff and retrofit MPI (which I've never used) or are we better off recoding the whole thing (or most of it) from scratch? Assume we are "stuck" with C so a wholesale change of language isn't an option.
Depending on how your software is written, there may or may not be advantages to going to MPI over keeping your pthread implementation.
Unfortunately (or fortunately), message passing is a very different beast than pthreading - the basic assumption is quite different. I love this quote from Joshua Phillips of the Maestro team: "The difference between message-passing and shared-state communication is equivalent to the difference between sending a colleague an e-mail requesting her to complete a task and opening up her organizer to write down the task directly in her to-do list. More than just being rude, the latter is likely to confuse her – she might erase it, not notice it, or accidentally prioritize it incorrectly."
Unfortunately, the way you share data is very different. There is no direct access to data in other threads (since it can be on other machines), so it can be a very daunting task to migrate from pthreads to MPI. On the other hand, if the code is written so each thread is isolated, it can be an easy task, and definitely worthwhile.
In order to determine how useful this will be, you'll need to understand the code, and what you hope to achieve by switching. It can be worthwhile as a learning experience (you learn a LOT about synchronization and threading by working in MPI), but may not be practical if the gains will be minor.
Re. your comment to Reed -- this sounds like an easy, low-overhead conversion to MPI. Just be careful: not all MPI APIs support dynamic creation of processes, i.e., you start your program with N processes (specified at startup) and you're stuck with N processes throughout the life-time of the program.