views:

606

answers:

7

The OCaml GC imposes a global lock that prevents mutators (threads) from running in parallel although they can run concurrently (interleaved). I believe the same is true of SML/NJ and MLton but not PolyML, GHC, F#, Clojure and Scala.

What other functional language implementations allow threads to run in parallel?

+6  A: 

Haskell supports parallel threads via Data Parallel Haskell

Brian Agnew
Data Parallel Haskell is still quite experimental, but the parallel thread support in your first link is much more mature. DPH is implemented with parallel threads, not vice versa.
Ganesh Sittampalam
Thanks for clarifying that
Brian Agnew
+5  A: 

I'm happy to tell you that you're right and that F#, being based on the CLR, doesn't suffer from that limitation at all, and instead benefits from multithreading specific features including async workflows, the mailboxprocessor, and the wonderful upcoming (.NET 4.0) Task Parallel Library.

emaster70
+5  A: 

Scala and Clojure are both running on the JVM, which allows real concurrency without any single point of contention bottlenecks.

Kevin Peterson
Can you recommend a link to anything about parallel programming in Scala?
Jon Harrop
Scala supports Actors http://www.scala-lang.org/node/242 if you like that model, and you can use java.util.concurrency just as you would in Java. I suppose given the question is about functional programming, Actors is more appropriate.
Kevin Peterson
+3  A: 

In addition to Haskell, you can run processes concurrently in Erlang (Concurrency-Oriented Programming) and you can also do so in F# using .NET Parallel Extensions and Asynchronous Workflows.

Ben Griswold
+6  A: 

Erlang implements its own processes and process schedule and allows thousands, tens of thousands and even millions of Erlang processes (inside a single Operating System process).

In SMP and multi-core machines the Erlang Virtual Machine will allocate as many OS threads and OS processes to its process scheduler and process queue to maximise its use of underlying concurrent operations in the hardware architecture.

The concurrency paradigm exposed to the applications remains the same, of course.

Gordon Guthrie
+10  A: 

There are a number of good implementations out there. At the moment, the Haskell people seem to be getting the best results (see ICFP 2009 paper by Simon Marlow and others as well as Haskell Symposium 2009 paper by Donnie Jones and others). Erlang is quite close behind, especially if you want to go distributed.

In six to twelve months the answers may have changed :-)

Norman Ramsey
Those Haskell results are worthless: their parallel implementations are often orders of magnitude slower than most serial implementations.
Jon Harrop
@Jon: where's your evidence?
Norman Ramsey
Compare the performance measurements quoted in the papers you cited with equivalent programs written in other languages. For example, the second paper gives performance results for parallel quicksort (~14s to sort only 100k ints) that are 100x slower than a decent solution in most other languages. You can also benchmark their parallel programs yourself. I did an exhaustive study of Saynte's naive parallelization applied to four of Lennart's Haskell implementations of my ray tracer and found that one stopped scaling at 6 cores and all of the others stopped scaling at only 5 cores.
Jon Harrop
In comparison, I have written parallel linear algebra codes in F# that outperform the vendor-tuned Fortran in the Intel MKL when running on Intel hardware.
Jon Harrop
A: 

python is not a particularly functional language, and with the GIL, it's not very parallel, either, but combined with the multiprocessing module (standard since 2.6), you get both, but it's not quite as elegant as pure functional languages.

Brief example:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)              # start 4 worker processes
    result = pool.apply_async(f, [10])     # evaluate "f(10)" asynchronously
    print result.get(timeout=1)           # prints "100" unless your computer is *very* slow
    print pool.map(f, range(10))          # prints "[0, 1, 4,..., 81]"
TokenMacGuy
I didn't expect this response to be terribly popular, but please explain a bit what you find so objectionable about this technique?
TokenMacGuy
My question was specifically about threads and not processes.
Jon Harrop
CPython has an issue involving concurrent kernel threads, preventing it's full use. the multiprocessing module allows you to use a nearly identical interface with processes instead of threads to achieve nearly identical results. This issue is not present in other flavours of python because they lack the GIL.
TokenMacGuy
I know. OCaml has the same issue. That is exactly what I'm trying to get away from! :-)
Jon Harrop