I'm sorry that I couldn't figure out a way to express the question more clearly in the headline, but essentially, it's this: almost all functional languages have constructs that allow you to process a variable list of arguments through tail recursion, as in this Erlang-ish pseudocode that sums up a list of numbers:

sumup([], A) -> A.
sumup([H|T], A) -> sumup(T, A + H).

However, one of the big appeals of functional languages to me is their inherent parallelism. Even though a problem such as summing a list of numbers is obviously quite parallelizable, and would almost certainly be handled most efficiently by divide-and-conquer, I'm not aware of language features that make this a natural way to program. In fact, unless the language has features for reading the number of arguments passed to a function and retrieving arguments by index, I don't see how one could do it. Do any functional languages have features that encourage divide-and-conquer programming?

A: 

This sort of thing is fairly easy to express in Ruby. In this example, we split a range into slices of three, sum each slice, and then sum the resulting partial sums. You could quite easily extend this to make it multi-threaded.

(1..10).each_slice(3).map{ |x| x.inject :+ }.inject(:+)

This example is a little different from yours, but shows the principle.
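The same chunk-and-combine pattern can be sketched in Haskell (a minimal sequential sketch; `chunksOf` is defined inline here rather than taken from a library):

```haskell
-- Split a list into fixed-size chunks, sum each chunk, then
-- combine the partial sums. Each chunk's sum is independent of
-- the others, so the map step is the natural place to introduce
-- parallelism later.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = take n xs : chunksOf n (drop n xs)

main :: IO ()
main = print (sum (map sum (chunksOf 3 [1..10])))  -- prints 55
```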

Peter
+1  A: 

I'm not familiar with any languages with divide-and-conquer type patterns. As you say, it's hard to imagine how you'd specify something like that.

Without wildly new notation, I think classic functions like partition are the best we can do.

David Seiler
How would I find out more about that function? I've been trying to research it but can't seem to narrow down my search enough.
partition just takes a list and divides it into two sublists based on some predicate. If you look up functional implementations of quicksort, you'll see it everywhere.
David Seiler
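To illustrate the comment above (a textbook sketch, not part of the original answer): `partition` from Haskell's `Data.List` splits a list by a predicate, and the classic functional quicksort uses it directly as its divide step.

```haskell
import Data.List (partition)

-- partition p xs returns (elements satisfying p, the rest).
-- Quicksort recursively partitions around a pivot: a textbook
-- divide-and-conquer use of the function.
qsort :: Ord a => [a] -> [a]
qsort [] = []
qsort (p:xs) = qsort smaller ++ [p] ++ qsort larger
  where (smaller, larger) = partition (< p) xs

main :: IO ()
main = print (qsort [3,1,4,1,5,9,2,6])  -- prints [1,1,2,3,4,5,6,9]
```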
+3  A: 

Automatic parallelism is not as easy as it might seem. If partitioning is done automatically, there's a risk of overpartitioning (too many partitions, adding too much overhead) or underpartitioning (failing to take proper advantage of all the cores in your CPUs). Figuring this out statically (i.e., at compile time) is quite difficult, which is why it's usually left to the developer to annotate where to parallelize.

Examples:

Haskell has the par combinator, which serves as an annotation to create a spark, a computation that is turned into a thread when a CPU core becomes available.
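A minimal sketch of `par` at work (conventionally `par` and `pseq` live in `Control.Parallel` from the `parallel` package; GHC's base library also exposes them via `GHC.Conc`, which is used here so the sketch compiles without extra packages):

```haskell
import GHC.Conc (par, pseq)

-- par sparks its first argument for possible parallel evaluation;
-- pseq forces the right half first so the spark has time to run
-- on another core before the combining (+) needs its result.
parSum :: [Int] -> Int
parSum xs = l `par` (r `pseq` (l + r))
  where
    (ls, rs) = splitAt (length xs `div` 2) xs
    l = sum ls
    r = sum rs

main :: IO ()
main = print (parSum [1..100])  -- prints 5050
```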

Data Parallel Haskell: defines a parallel array data type to allow a more implicit style of parallelization, but it seems to come at the cost of some limitations, and it is still experimental code.

(disclaimer: I'm not a Haskell developer)

The Task Parallel Library in .NET: can do automatic partitioning of data, or you can implement your own Partitioner. You still need to understand how the parallelization works, or you'll end up with over- or underpartitioning. Reed Copsey has a great series of articles on the TPL and PLINQ.

DryadLINQ builds on PLINQ and adds automatic distributed computing.

None of these are really native to the language, but they're tightly integrated. There's even a PLINQ integration module for F#.

Mauricio Scheffer
+4  A: 

Do any functional languages have features to encourage divide-and-conquer programming?

Yes: the ability to create new higher-order functions in libraries.

One of the most important such functions, on lists anyway, is foldr, which can be parallelized in principle when applied to an associative operator, although this is rarely done in practice. Why? Because foldr is designed around sequential data flow.
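To make the contrast concrete, here is a sketch (my illustration, not code from the talk) of a tree-shaped reduction for an associative operator, the divide-and-conquer alternative to foldr's left-to-right data flow:

```haskell
-- Tree-shaped reduction: split the list, reduce both halves,
-- combine. Correct whenever f is associative with identity z;
-- the two recursive calls are independent of each other, so
-- they could be evaluated in parallel.
reduceDC :: (a -> a -> a) -> a -> [a] -> a
reduceDC _ z []  = z
reduceDC _ _ [x] = x
reduceDC f z xs  = f (reduceDC f z l) (reduceDC f z r)
  where (l, r) = splitAt (length xs `div` 2) xs

main :: IO ()
main = print (reduceDC (+) 0 [1..10])  -- prints 55
```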

The beauty of functional languages is that once this problem is recognized, we can address the problem not by introducing new language features, but by making more intelligent use of the features we already have. To see how, look at Guy Steele's talk from August 2009, where he explains why foldr is not the right library function for parallel functional programming, and he proposes

  • A new programming style
  • A new representation of lists
  • New higher-order functions for libraries

that are all designed to support divide-and-conquer programming.

What I found so exciting about this talk is that it is not necessary to introduce new language features to support divide-and-conquer programming "natively". It is enough to take the primitives we already have and to use them to design better libraries.

If you have access to the ACM Digital Library you can see video of Guy's talk, Organizing Functional Code for Parallel Execution, or, as Ben Karel points out, you can see video taken by Malcolm Wallace on Vimeo.

Norman Ramsey
Guy's talk is available on Vimeo: http://www.vimeo.com/6624203. Also, the Project Fortress team has other slides and papers online that afeldspar would probably find interesting: http://research.sun.com/projects/plrg/Publications/index.html
Ben Karel
So is this what you got from his talk? I thought his talk was about how what language-level features we currently have are not enough to design better libraries. To each their own, eh? (caveat: I sincerely, no bullshit believe that a truly revolutionary idea is one that several people can interpret in completely opposite ways)
Pascal Cuoq
@Pascal: Of course he was pushing Fortress, but I definitely saw Guy's talk as all about libraries. (Come to think of it, this is also how I see Fortress.) If we want this model in Haskell, we already have `par`, so all we have to do is throw off the chains imposed by our cons cells! No compilers need be harmed, but we would have to rewrite all the libraries...
Norman Ramsey
A: 

Have a look at Manticore, its predecessor NESL, and its sibling ZPL. All of these are at least partially functional languages with parallel constructs for operating on the entire contents of data structures at a time.

Novelocrat