views:

504

answers:

7

Hi guys,

So it looks like multicore and all its associated complications are here to stay. I am planning a software project that will definitely benefit from parallelism. The problem is that I have very little experience writing concurrent software. I studied it at university and understand the concepts and theory very well, but I have had zero hands-on experience building software to run on multiple processors since school.

So my question is: what is the best way to get started with multiprocessor programming? I am mostly familiar with Linux development in C/C++, plus Obj-C on Mac OS X, and I have almost zero Windows experience. My planned project will also require FFTs and probably floating-point comparisons across a lot of data.

There is OpenCL, OpenMP, MPI, POSIX threads, etc... What technologies should I start with?

Here are a couple of stack options I am considering, but I'm not sure whether they will let me experiment in a way that works toward my goal:

  • Should I get Snow Leopard and try to get OpenCL Obj-C programs to execute on the ATI X1600 GPU on my laptop? or
  • Should I get a PlayStation 3 and try writing C code to throw across its six available Cell SPE cores? or
  • Should I build out a Linux box with an Nvidia card and try working with CUDA?

Thanks in advance for your help.

-Talesh.

+1  A: 

You should Learn You Some Erlang. For great good.

Warren Young
+1  A: 

If you're interested in parallelism in OS X, make sure to check out Grand Central Dispatch, especially since the tech has been open-sourced and may soon see much wider adoption.

phoebus
Do you know if GCD will allow me to run FFTs on an X1600 GPU?
Talesh
You want to be looking at OpenCL in that case, not GCD. GCD is for CPUs, OpenCL is for GPUs.
Warren Young
A: 

You don't need special hardware like graphics cards and Cells to do parallel programming; your plain multi-core CPU will also profit from it. Since you have experience with C/C++ and Objective-C, start with one of those and learn to use threads. Begin with simple examples like matrix multiplication or maze solving, and you'll learn about those pesky problems (parallel software is non-deterministic and full of Heisenbugs).

If you want to go into massively parallel programming, I'd choose OpenCL as it's the most portable option. CUDA still has a larger community, more documentation and examples, and is a bit easier, but you'd need an Nvidia card.

MattW.
I am thinking that CUDA may well be my best bet, since the community is a little more developed, Nvidia seems committed to HPC, and well... it's C and Linux. I still don't know if GPUs will be able to run stuff like FFTs, or even if that question makes sense at all!
Talesh
FFT makes a lot of sense. See for example http://www.macresearch.org/cuda-quick-look-and-comparison-fft-performance
MattW.
+1  A: 

Hi

I'd suggest going with OpenMP and MPI initially; I'm not sure it matters which you choose first, but you definitely ought (in my opinion :-) ) to learn both the shared- and distributed-memory approaches to parallel computing.

I suggest avoiding OpenCL, CUDA, POSIX threads, at first: get a good grounding in the basics of parallel applications before you start to wrestle with the sub-structure. For example, it's much easier to learn to use broadcast communications in MPI than it is to program them in threads.

I'd stick with C/C++ on your Mac since you are already familiar with them, and there are good open-source OpenMP and MPI libraries for that platform and those languages.

And, for some of us it's a big plus: whatever you learn about C/C++ and MPI (and to a lesser extent OpenMP too) will serve you well when you graduate to real supercomputers.

All subjective and argumentative, so ignore this if you wish.

Regards

Mark

High Performance Mark
Hi Mark, do you have a couple of links you can send me to get started with OpenMP and MPI? -Talesh
Talesh
www.google.com -- material on these topics is very easy to find and the top hits are the most useful sites
High Performance Mark
+1  A: 

The traditional and imperative 'shared state with locks' approach isn't your only choice. Rich Hickey, the creator of Clojure, a Lisp-1 for the JVM, makes a very compelling argument against shared mutable state: he basically argues that it's almost impossible to get right. You may want to read up on message passing à la Erlang actors, or on STM libraries.

arcticpenguin
A: 

Maybe your problem is suitable for the MapReduce paradigm. It automatically takes care of load balancing and concurrency issues, and the research paper from Google is already a classic. There is a single-machine implementation called Mars that runs on GPUs, which may work well for you, and also Phoenix, which runs MapReduce on multicore and symmetric multiprocessor machines.

leinz
A: 

I would start with MPI, as you learn how to deal with distributed memory. Pacheco's book is an oldie but a goodie, and MPI runs fine out of the box on OS X now, giving pretty good multicore performance.

Chad Brewbaker
