I was reading this blog post, http://www.cilk.com/multicore-blog/bid/8097/Don-t-get-caught-with-your-multicore-pants-down, and it got me asking this question.

4-core or 8-core machines will be common in 12-24 months' time, and I got a chill realizing that I don't have an answer for that yet.

+3  A: 

I use C++ with multiple worker threads as required. I have a number of algorithms, e.g. matrix and surface modelling, that would benefit from less granular multi-threading if I had lots more cores, but 2-4 cores aren't going to give enough of a boost to go down this road yet.

MS are also putting much more support for parallel processing into VS10; see the following video. It appears they won't be supporting the OpenMP 3.0 specification.

Shane MacLaughlin
+2  A: 

Note that programming for multi-core processors is no different from programming for traditional SMP systems (except maybe for some very low-level optimizations), so any advice you can find for multi-processor systems applies just as well to multi-core processors.

See also:

Joachim Sauer
-1: Multicores are different from traditional SMP. Communication is much faster and main memory bandwidth is much more precious. Consequently, many parallel algorithms published in the 1990s do not scale well on multicores.
Jon Harrop
@Jon: interesting. Got any sources for further information?
Joachim Sauer
@Joachim: Not directly but I highly recommend MIT's work on cache oblivious algorithms. They really pay off on multicores because cache misses destroy scalability (but they weren't as useful on SMP machines where memory bandwidth was more plentiful). I'm writing an article and book on it now... :-)
Jon Harrop
Incidentally, it is really tricky to interpret benchmark results correctly in this context, even though the performance benefits are significant. For example, I just benchmarked cache-unaware, cache-aware and cache-oblivious matrix multiplies written in F#, running (redundantly) on all 8 of my cores, and got a 2.7x slowdown for the cache-unaware algorithm. Repeat the same experiment on a single core and you don't see any difference: it only affects scalability!
Jon Harrop
+4  A: 

Plain old C# and the Thread Pool for me - although very keen to start looking at F#

Paul Nearney
+1  A: 

The OS will schedule your threads; just pick a language/library that makes using threads easy and safe (via concurrency primitives: locks, mutexes, latches, etc.). Many languages support this, and even some compilers do (see OpenMP).

The other part is that you need to be AWARE that you're on a system with multiple logical processors and break up your tasks accordingly. Example: if you know you have multiple cores, you can split a long-running calculation over an array across several threads, each handling one piece of the array (with the number of pieces based on the number of CPUs).
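To make the array example concrete, here is a minimal sketch in Java. It is only an illustration, not something from the answer itself: the class name ChunkedSum and the method parallelSum are made up, and summing is just a stand-in for whatever the long-running calculation is. It splits the array into one contiguous slice per worker thread and sizes the pool from Runtime.availableProcessors().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper (not from the answer): sums an array by giving each
// worker thread one contiguous slice of it.
final class ChunkedSum {

    static long parallelSum(final long[] data, int workers) {
        if (workers < 1) {
            throw new IllegalArgumentException("workers must be >= 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            // Ceiling division: every slice has `chunk` elements except
            // possibly a shorter last one.
            int chunk = Math.max(1, (data.length + workers - 1) / workers);
            List<Future<Long>> parts = new ArrayList<Future<Long>>();
            for (int start = 0; start < data.length; start += chunk) {
                final int from = start;
                final int to = Math.min(data.length, start + chunk);
                parts.add(pool.submit(new Callable<Long>() {
                    public Long call() {
                        long sum = 0;
                        for (int i = from; i < to; i++) {
                            sum += data[i];
                        }
                        return sum;
                    }
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // blocks until that slice is finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while summing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("worker failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] data = new long[1000000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        // One worker per logical processor reported by the JVM.
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println(parallelSum(data, cpus));
    }
}
```

Each slice writes only to its own local sum, so no locking is needed; the only synchronization point is Future.get() when the partial results are combined. Whether this actually runs faster than a plain loop depends on how expensive the per-element work is.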

basszero
+6  A: 

We are using Java 5 and 6.

Our application lends itself well to splitting into multiple threaded queues. The book Java Concurrency in Practice by Brian Goetz is very good for explaining this.
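As a rough idea of what a threaded queue can look like, here is a minimal sketch using java.util.concurrent (available since Java 5). It is not our actual code: the class QueueWorkers, the method sumOfSquares and the STOP sentinel are invented for illustration, and squaring numbers stands in for real work. One BlockingQueue feeds several worker threads, and each worker exits when it takes the sentinel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: one shared queue feeds several worker threads.
final class QueueWorkers {
    // Sentinel telling a worker to exit; real inputs are assumed non-negative.
    private static final int STOP = -1;

    static long sumOfSquares(int[] inputs, int workers) {
        if (workers < 1) {
            throw new IllegalArgumentException("workers must be >= 1");
        }
        final BlockingQueue<Integer> queue = new LinkedBlockingQueue<Integer>();
        // One result slot per worker, so workers never write to the same cell.
        final long[] partial = new long[workers];
        List<Thread> threads = new ArrayList<Thread>();
        for (int w = 0; w < workers; w++) {
            final int id = w;
            Thread t = new Thread(new Runnable() {
                public void run() {
                    try {
                        while (true) {
                            int n = queue.take(); // blocks until work arrives
                            if (n == STOP) {
                                return;
                            }
                            partial[id] += (long) n * n;
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            t.start();
            threads.add(t);
        }
        try {
            for (int n : inputs) {
                queue.put(n);
            }
            // One sentinel per worker so that every worker shuts down.
            for (int w = 0; w < workers; w++) {
                queue.put(STOP);
            }
            // join() also makes each worker's writes visible to this thread.
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        long total = 0;
        for (long p : partial) {
            total += p;
        }
        return total;
    }
}
```

For example, QueueWorkers.sumOfSquares(new int[] {1, 2, 3, 4}, 3) returns 30. The queue is the only shared mutable structure, which is a large part of why this style tends to be easier to reason about than ad hoc locking.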

We have also found that splitting large applications into smaller processes works well too. The OS (Linux) does a good job of scheduling processes to make better use of the multiple cpus/cores. Separate processes are also easier to write than highly threaded code.

Fortyrunner
+1, separate processes also give you the chance to scale to multiple machines
orip
A: 

Today is a good day, so in C and a tiny bit of assembly I'm building a lock-free allocator for a parallel traits based language. But then I'm between contracts at the moment, so can afford to play with things I can only understand on a good day.

Professionally I use C++ or Java, depending mostly on what the rest of the team I'm with is happiest with, and appropriate libraries.

This is the one area in which garbage collection significantly helps. When moving an existing system to exploit multiple cores, a system written in a garbage-collected language is much easier to adapt than one using RAII or simple reference-counting implementations (without CAS), if the ownership of objects can move between threads.

Pete Kirkham
A: 

I am using C# with the ThreadPool, BackgroundWorker and Thread instances. I didn't know where to start, so 7 days ago I bought the e-book version of "C# 2008 and 2005 threaded programming" by Gaston C. Hillar (Packt Publishing, http://www.packtpub.com/beginners-guide-for-C-sharp-2008-and-2005-threaded-programming/book). I bought the e-book from the publisher, but the book is now available at Amazon.com too.

I downloaded the code and began following the exercises. The book is a nice guide with a lot of code to practice on. I have read the first 6 chapters. It tells stories while it explains the most difficult concepts, which makes it nice to read. I could see my Core 2 Quad Q6700 reach 98% CPU usage running C# code with 4 concurrent threads! It is easier than I thought, and I am impressed with the results you can achieve using many cores at the same time. I recommend the book to C# programmers who are interested in getting started with multicore or threaded programming.

+1  A: 

Scala with its actor support.

A: 

PARLANSE. See http://www.semdesigns.com/Products/Parlanse/index.html

This language runs on SMP x86 systems with 1-32 CPUs. It offers "fine grain" parallelism based primarily on static partial orders and a compiler that synthesizes as much of the scheduling code for a grain switch as it can, to keep overhead low.

PARLANSE has been used to implement DMS, a program analysis and transformation system of several million lines of code.

Ira Baxter
A: 

As far as the popular dynamic languages go, Perl has extensive support for concurrency via interpreter threads or forks and IPC. Unlike many of the other thread implementations in dynamic languages (including most flavors of Python and Ruby which suffer from global interpreter locks), Perl's threads are real OS level threads that will automatically be farmed out to multiple cores by the OS. This of course means you are left with all of the data sharing issues surrounding real threads, but there are many modules on CPAN to help with that.

Anecdotally, I recently converted a small CPU-bound number-crunching Perl script (around 50 lines, with the inner loop written in C) to use threads. In all, the threading code added only about 10 lines, and the script now runs at least 4 times faster on my i7 system.

Eric Strom
A: 

F# with the new task parallelism infrastructure in .NET 4.

I'm currently studying all of the major classes of parallel algorithms and how they are efficiently parallelized for multicores in order to factor out reusable design patterns that will let me write multicore-capable code much more easily in the future. This seems incredibly important to me but, surprisingly, nobody seems to be working on it.

Jon Harrop