views: 1259
answers: 9

I've read somewhere that functional programming is well suited to taking advantage of the multi-core trend in computing, but I didn't really get the idea. Is it related to the lambda calculus and the von Neumann architecture?

+8  A: 

Functional programming minimizes or eliminates side effects and is thus better suited to distributed programming, i.e. multicore processing.

In other words, lots of pieces of the puzzle can be solved independently on separate cores simultaneously without having to worry about one operation affecting another nearly as much as you would in other programming styles.

Chris Ballance
Got any concrete, benchmarked examples?
ja
@ja: I criticized some of their recent benchmark results here: http://flyingfrogblog.blogspot.com/2010/06/regular-shape-polymorphic-parallel.html
Jon Harrop
+4  A: 

The basic argument is that it is difficult to automatically parallelize languages like C and C++, because functions can set global variables. Consider two function calls:

a = foo(b, c);
d = bar(e, f);

Though foo and bar have no arguments in common and one does not depend on the return value of the other, they nonetheless might have dependencies, because foo might set a global variable (or produce some other side effect) that bar depends upon.

Functional languages guarantee that foo and bar are independent: there are no globals and no side effects. Therefore foo and bar can safely be run on different cores, automatically, without programmer intervention.
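
In Haskell, for instance, this independence can be exploited explicitly with the par and pseq combinators from the parallel package. This is only a minimal sketch, with foo and bar invented as pure stand-ins for the calls above (build with ghc -threaded and run with +RTS -N):

    import Control.Parallel (par, pseq)

    -- Hypothetical pure stand-ins for the foo and bar above:
    -- no globals, no side effects, so they cannot interfere.
    foo :: Int -> Int -> Int
    foo b c = b * b + c

    bar :: Int -> Int -> Int
    bar e f = e + f * f

    main :: IO ()
    main =
      let a = foo 3 4
          d = bar 5 6
      -- 'par' sparks evaluation of 'a' on another core while 'd' is
      -- evaluated here; purity guarantees the same result either way.
      in a `par` (d `pseq` print (a + d))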

DGentry
A: 

Omitting any technical/scientific terms, the reason is that a functional program doesn't share data: data is copied and transferred among functions, so there is no shared state in the application.

And shared data is what causes half the headaches with multithreading.
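
Whether the runtime actually copies the data or shares it behind the scenes (as the comment below points out, it usually shares), nothing can mutate it, so no locks are needed. A minimal Haskell sketch of the idea, with the workload and thread structure invented for illustration:

    import Control.Concurrent (forkIO)
    import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

    main :: IO ()
    main = do
      -- One immutable list handed to two threads. Since nothing can
      -- modify it, neither thread needs a lock to read it safely.
      let xs = [1 .. 1000000] :: [Int]
      r1 <- newEmptyMVar
      r2 <- newEmptyMVar
      _ <- forkIO (putMVar r1 (sum xs))
      _ <- forkIO (putMVar r2 (maximum xs))
      total <- takeMVar r1
      top   <- takeMVar r2
      print (total, top)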

Robert Gould
Functional programs almost always share data.
Jon Harrop
+1  A: 

This is a little bit of a vague question. One perk of multi-core CPUs is that you can run a functional program and let it plug away serially without worrying about it affecting any other computation the machine is carrying out.

The difference between a multi-U server and a multi-core CPU in a server or PC is the speed gain you get from having everything on the same bus, allowing better and faster communication among the cores.

edit: I should probably qualify this post by saying that in most of the scripting I do, with or without multiple cores, I rarely have a problem getting my data through hackish parallelizing, such as running multiple small scripts at once from my main script so I'm not slowed down by things like waiting for URLs to load and whatnot.
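
That kind of hackish parallelizing is easy to sketch in Haskell with the async library. Here fetch is a made-up stand-in that just sleeps instead of doing a real download:

    import Control.Concurrent (threadDelay)
    import Control.Concurrent.Async (mapConcurrently)

    -- Made-up stand-in for downloading a URL: sleep one second,
    -- then return a fake result.
    fetch :: String -> IO String
    fetch url = threadDelay 1000000 >> return ("contents of " ++ url)

    main :: IO ()
    main = do
      -- The three "downloads" overlap, so this takes about one
      -- second rather than three.
      pages <- mapConcurrently fetch ["url1", "url2", "url3"]
      mapM_ putStrLn pages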

double edit: Furthermore, a lot of functional programming languages have had forked parallel variants for decades. These utilize parallel computation better, with some speed improvement, but they never really caught on.

Robert Elwell
+5  A: 

One of the hardest things about dealing with parallel processing is locking data structures to prevent corruption. If two threads were to mutate a data structure at once without having it locked perfectly, anything from invalid data to a deadlock could result.

In contrast, functional programming languages tend to emphasize immutable data. Any state is kept separate from the logic, and once a data structure is created it cannot be modified. The need for locking is greatly reduced.

Another benefit is that some processes that parallelize very easily, like iteration, are abstracted into functions. In C++, you might have a for loop that runs some data processing over each item in a list. But the compiler has no way of knowing whether those operations may safely be run in parallel -- maybe the result of one depends on the one before it. When a function like map() or reduce() is used, the compiler can know that there is no dependency between calls. Multiple items can thus be processed at the same time.
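
Haskell's Strategies library makes this concrete: parMap is an ordinary map whose per-item results are evaluated in parallel. A minimal sketch, with the process workload invented for illustration:

    import Control.Parallel.Strategies (parMap, rdeepseq)

    -- Arbitrary per-item work with no dependencies between items.
    process :: Int -> Int
    process n = sum [1 .. n]

    main :: IO ()
    main =
      -- Because 'process' is pure, no call can depend on another,
      -- so parMap may evaluate every element on a different core.
      print (sum (parMap rdeepseq process [10000, 20000 .. 1000000]))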

John Millikin
+5  A: 

When there are no side effects the order of evaluation does not matter. It is then possible to evaluate expressions in parallel.
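
Haskell's Eval monad (from the parallel package) expresses exactly this: rpar says an expression may be evaluated in parallel, and because there are no side effects the result is the same whatever order the runtime picks. A minimal sketch with made-up workloads:

    import Control.Parallel.Strategies (runEval, rpar, rseq)

    main :: IO ()
    main = print $ runEval $ do
      -- Spark both sums in parallel; with no side effects,
      -- evaluation order cannot change the answer.
      x <- rpar (sum [1 .. 10000000] :: Int)
      y <- rpar (sum [1 .. 20000000] :: Int)
      _ <- rseq x   -- wait for both results
      _ <- rseq y
      return (x + y)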

Jonas
+3  A: 

All the answers above come down to the same key idea: "no shared mutable storage" is a crucial enabler for executing pieces of a program in parallel. It does not really solve the equally hard problem of finding things to execute in parallel, but the typically clearer expression of functionality in functional languages does make it theoretically easier to extract parallelism from a sequential expression.

In practice, I think the "no shared mutable storage" property of languages based on garbage collection and copy-on-change semantics makes them easier to add threads to. The best example is probably Erlang, which combines near-functional semantics with explicit threads.

jakobengblom2
A: 

The book Programming Erlang: Software for a Concurrent World by Joe Armstrong (the creator of Erlang) talks quite a bit about using Erlang for multicore (and multiprocessor) systems. As the Wikipedia article states:

Creating and managing processes is trivial in Erlang, whereas threads are considered a complicated and error-prone topic in most languages. Though all concurrency is explicit in Erlang, processes communicate using message passing instead of shared variables, which removes the need for locks.
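
Erlang syntax aside, the same process-and-message-passing model can be sketched in Haskell with lightweight threads and channels; the workload here is invented for illustration:

    import Control.Concurrent (forkIO)
    import Control.Concurrent.Chan (newChan, readChan, writeChan)

    main :: IO ()
    main = do
      inbox <- newChan
      -- "Spawn" a worker; its only connection to the parent is the
      -- channel, so there are no shared variables and no locks.
      _ <- forkIO $ writeChan inbox (sum [1 .. 1000000 :: Int])
      result <- readChan inbox
      print result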

KeyserSoze
A: 

I've read somewhere that functional programming is well suited to taking advantage of the multi-core trend in computing... I didn't really get the idea. Is it related to the lambda calculus and the von Neumann architecture?

The argument behind the belief you quoted is that purely functional programming controls side effects, which makes it much easier and safer to introduce parallelism, and that purely functional programming languages should therefore be advantageous in the context of multicore computers.

Unfortunately, this belief has long since been disproven, for several reasons:

  • The absolute performance of purely functional data structures is poor. So purely functional programming is a big initial step in the wrong direction in the context of performance (which is the sole purpose of parallel programming).

  • Purely functional data structures scale badly because they stress shared resources including the allocator/GC and main memory bandwidth. So parallelized purely functional programs often obtain poor speedups as the number of cores increases.

  • Purely functional programming renders performance unpredictable. So real purely functional programs often see performance degradation when parallelized because granularity is effectively random.

For example, the bastardized two-line quicksort often cited by the Haskell community typically runs thousands of times slower than a real in-place quicksort written in a more conventional language like F#. Moreover, although you can easily parallelize the elegant Haskell program, you are unlikely to see any performance improvement whatsoever because all of the unnecessary copying makes a single core saturate the entire main memory bandwidth of a multicore machine, rendering parallelism worthless. In fact, nobody has ever managed to write any kind of generic parallel sort in Haskell that is competitively performant. The state-of-the-art sorts provided by Haskell's standard library are typically hundreds of times slower than conventional alternatives.
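
For reference, the two-line quicksort in question is usually written along these lines; it is short and clearly correct, but it allocates fresh lists at every level instead of partitioning in place, which is the copying cost described above:

    -- The oft-quoted two-line Haskell "quicksort": elegant, but it
    -- rebuilds the list at every recursion instead of swapping in place.
    qsort :: Ord a => [a] -> [a]
    qsort []     = []
    qsort (p:xs) = qsort [x | x <- xs, x < p] ++ [p] ++ qsort [x | x <- xs, x >= p]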

However, the more common definition of functional programming as a style that emphasizes the use of first-class functions does actually turn out to be very useful in the context of multicore programming because this paradigm is ideal for factoring parallel programs. For example, see the new higher-order Parallel.For function from the System.Threading.Tasks namespace in .NET 4.

Jon Harrop
conventional language like F#?
Surya
Yes. "Conventional" meaning "not purely functional" in that context.
Jon Harrop