views:

173

answers:

2

Functional programming provides:

  • a stateless way of programming that frees the user from side effects;
  • a declarative way of programming in which the user describes the problems the computer should solve, not the way those problems are solved.

Combining those features, users can program at a higher level, where expressions can be swapped in terms of execution order.

I mean, if we have expressions A and B, running

A
B

gives the same results as running

B
A

And A is decomposed into many instructions to be run on one thread, and the same is true of B.
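To illustrate the order-independence claim, here is a minimal Python sketch (the functions `a` and `b` are hypothetical stand-ins for the expressions A and B, not anything from the question itself):

```python
# Two pure "expressions": each result depends only on the expression itself,
# with no shared or mutable state involved.
def a():
    return 2 + 3

def b():
    return 4 * 5

# Evaluate in both orders; pure expressions give identical results.
first = (a(), b())
y, x = b(), a()   # reversed evaluation order
second = (x, y)
print(first == second)  # True
```

If `a` or `b` wrote to a shared variable, the two orders could diverge, which is exactly what the answers below discuss.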

  • Is my understanding correct?
  • Are there any features that make functional languages the choice for parallel programming/multi-core programming?
  • (If my understanding is correct) How can the compiler map an expression to a thread?
+1  A: 

First, I must say that I have only recently got into functional programming. I am by no means an expert and am grateful for any feedback, in case I've got something wrong. I am hoping that the following is helpful in some way.

Features that are an advantage in concurrent programming:

Flexible execution order: The execution environment might be better able than the programmer to determine in what order expressions ought to be evaluated, so that they can be distributed over several threads or CPUs.

However, note that even in functional programming languages, the execution order cannot be shuffled randomly. Take for example this code:

let c = someFunction(a, b)
let d = anotherFunction(c)

The second expression depends on the result of the first expression (c). Therefore the first expression will have to be evaluated first.

What really hinders re-arranging execution order is functions that have side effects, i.e. functions that do not simply return a value based on their arguments, but also modify program state in some way (via global or static variables, through file or network I/O, etc.). If a function can mutate a program's state, and if functions depend not only on (i.e. read) their own arguments but also on that state, then execution order suddenly becomes very important: a function may execute in a different environment or context every time it is called, and may therefore also return a different value every time. With such functions, it is the programmer's responsibility to manage program state by prescribing the exact order in which the state-modifying functions are called.
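A small Python sketch of the problem (the `next_label` function and its global counter are hypothetical examples, not from the original answer):

```python
# A hypothetical impure function: its result depends on hidden program state.
counter = 0

def next_label(prefix):
    global counter
    counter += 1              # side effect: mutates global state
    return f"{prefix}-{counter}"

# Two calls with identical arguments yield different values, so these
# calls cannot be reordered or run concurrently without changing behaviour.
print(next_label("job"))  # job-1
print(next_label("job"))  # job-2
```

Swapping the two calls would swap the outputs, which is exactly why side-effecting code pins down execution order.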

In functional programming, immutability of data is therefore another very important, related issue. Ideally, there is no mutable program state at all. If all values are immutable, i.e. if there are no variables, you'll have a hard time changing a program's state! This, together with side-effect-free functions, more or less guarantees that a function call always evaluates to the same value given the same input, and therefore it doesn't matter when the function is called (making execution order less important). It doesn't even matter which program thread calls the function, which helps a lot in concurrent programming.
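One way to sketch immutability in Python is with a frozen dataclass; the `Account` type and `deposit` function below are made-up illustrations, not part of the original answer:

```python
from dataclasses import dataclass

# A hypothetical immutable record: "updates" produce new values instead of
# mutating existing ones, so no thread ever observes a half-changed object.
@dataclass(frozen=True)
class Account:
    owner: str
    balance: int

def deposit(acct, amount):
    # Pure: returns a new Account and leaves its argument untouched.
    return Account(acct.owner, acct.balance + amount)

a1 = Account("alice", 100)
a2 = deposit(a1, 50)
print(a1.balance, a2.balance)  # 100 150
```

Because `a1` can never change, any thread holding a reference to it can safely read it at any time, in any order.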


Contrast this with "regular" imperative programs that have state and mutable variables, and perhaps even Singleton objects. Singletons are in some ways the epitome of mutable program state, and they are often said to be very problematic in concurrent programming: because several threads might operate on and mutate a singleton object, you have to work really hard to ensure that it always functions as expected and cannot get corrupted by simultaneous writes.
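To make the shared-mutable-state hazard concrete, here is a minimal Python sketch (a plain global counter stands in for singleton state; without the lock, the read-modify-write `count += 1` could interleave across threads and lose updates):

```python
import threading

# Shared mutable state, a stand-in for the kind of singleton the answer
# warns about. The lock serialises the read-modify-write.
count = 0
lock = threading.Lock()

def worker():
    global count
    for _ in range(10_000):
        with lock:
            count += 1

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(count)  # 40000
```

All of this locking machinery exists only because the state is mutable and shared; with immutable data there would be nothing to guard.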


How a compiler assigns single expressions to different threads

... or even CPUs, I can only guess:

  • The compiler may attempt to figure out by means of some code analysis how functions depend on one another (not an easy task, since functions can be passed around as values);

  • or the programmer is allowed to give hints to the execution environment, either through specialized language constructs (async blocks?) or through library calls;

  • or it is known that certain operations on certain data structures (such as applying a map/projection function to an array) can almost always be performed in parallel.
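The last point can be sketched in Python with the standard library's thread pool (the `project` function is a hypothetical example of such a map/projection):

```python
from concurrent.futures import ThreadPoolExecutor

# Applying a pure projection function to every element is "embarrassingly
# parallel": no element's result depends on any other element's result.
def project(n):
    return n * n

data = range(8)

with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(project, data))

# Same result as a sequential map, regardless of which thread ran what.
print(parallel == [project(n) for n in data])  # True
```

`pool.map` preserves input order in its results, so the parallel version is a drop-in replacement for the sequential one as long as `project` stays pure.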

stakx
+1  A: 

You are describing purely functional programming which was touted as a panacea for parallel programming but has since failed to produce any compelling results. See my criticism of a recent research paper for this here and watch my video lecture here that walks through a flawed example written in a purely functional style and explains why it does not work as researchers had expected.

The advantages of purely functional programming in this context were supposed to be:

  • Purely functional style leads to programs with far more latent parallelism that should be easier to break up and parallelize.

  • Effects in the type system give the compiler knowledge of all dependencies between expressions and, therefore, pave the way for automatic parallelization by the compiler.

  • Pure functions do not explicitly compete for shared resources like locks so they were expected to parallelize perfectly.

The disadvantages turned out to be:

  • Although it is easy to write an effective garbage collector for an impure language (e.g. see HLVM), purely functional data structures require massive allocation rates and that, in turn, requires incredible efficiency in the garbage collector that is extremely difficult to attain (read more). For example, the Glasgow Haskell Compiler suspends all threads at every minor collection which quickly destroys the scalability of code that uses it, which is almost all purely functional code.

  • Effective parallelization requires a granularity that is not too fine (or the overhead of administrating work items dominates) and not too coarse (or work distribution is poor) but declarative style deliberately abstracts away the method of execution and, therefore, renders performance wildly unpredictable. This makes it extremely difficult to choose an appropriate granularity.

  • Unlike supercomputers, multicore programming requires much more attention to caches and locality. Today, this can only be accomplished by carefully mutating data structures in-place.

  • Pure functions do actually compete for hidden shared resources: the allocator/GC and main memory bandwidth. So much so, in fact, that they often see no speedup for parallelization at all.

In theory, some people still hope for a sufficiently smart compiler that magically addresses all of these problems. In practice, even state-of-the-art compilers fail to make the idiomatic purely functional "quicksort" run within 1,000× the performance of a conventional in-place solution. So we are nowhere near the required level of sophistication.

In reality, impure functional style works well for parallel programs because first-class lexical closures are an excellent abstraction for conveying portions of code to be executed in different ways. For example, .NET 4 provides a parallel for loop in the form of a higher-order Parallel.For function that accepts a first-class function as an argument.
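A rough Python analogue of that idea (the `parallel_for` combinator below is a hypothetical sketch in the spirit of .NET's `Parallel.For`, not a library API):

```python
from concurrent.futures import ThreadPoolExecutor

# A Parallel.For-style combinator: a higher-order function that takes the
# loop body as a first-class closure and farms iterations out to a pool.
def parallel_for(start, stop, body, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consume the iterator so all iterations complete before returning.
        list(pool.map(body, range(start, stop)))

results = [0] * 8

def body(i):
    # Impure, but safe: each iteration writes to a distinct slot,
    # so there is no contention between threads.
    results[i] = i * i

parallel_for(0, 8, body)
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The closure `body` captures `results` from its environment, which is exactly the "conveying portions of code" role the answer describes.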

Jon Harrop
OTOH, using ideas from functional programming can give good results, see MapReduce.
ninjalj
SQL is a great example of a pure-functional (on the statement level) language whose implicit parallelism makes it easy to go faster than equivalent procedural code. The SQL engine can figure out the right parallel granularity, make sure that cache lines aren't shared, manage memory/disk bandwidth properly, and so on.
Gabe
@ninjalj: Yes, exactly.
Jon Harrop
@Gabe: A *lot* of people seem to disagree: "In fact, it is not uncommon to see SQL Server pick a parallel plan which is considerably slower than a non-parallel plan" - Erland Sommarskog, SQL Server MVP http://bytes.com/topic/sql-server/answers/144731-sql-2000-parallelism-worth
Jon Harrop
@Gabe: Why did Oracle add explicit parallelism hints if implicit parallelism works well?
Jon Harrop
Jon: Think of the SQL query optimizer like an optimizing compiler. Sometimes they make the code go slower, and sometimes they need hints (e.g. inline, register) to make it go faster. We still use them because it's a whole lot easier than writing it all in assembly!
Gabe