ansaurus

Question

How do I write a parallel reduction using strategies in Haskell?

Answer 1

+1 A:

This seems like a good start:

parFold :: (a -> a -> a) -> [a] -> a
parFold f = go
  where
  strategy = parList rseq

  go [x] = x
  go xs = go (reduce xs `using` strategy)

  reduce (x:y:xs) = f x y : reduce xs
  reduce list     = list   -- empty or singleton list

It works, but parallelism is not so great. Replacing parList with something like parListChunks 1000 helps a bit, but speedup is still limited to under 1.5x on an 8-core machine.

Chad Scherrer 2010-10-26 21:42:51

Answer 2

+4 A:

I don't think a list is the right data type for this. Because it's just a linked list, the data will necessarily be accessed sequentially. Although you can evaluate the items in parallel, you won't gain much in the reduction step. If you really need a List, I think the best function would be just

parFold f = foldl1' f . withStrategy (parList rseq)

or maybe

parFold f = foldl1' f . withStrategy (parBuffer 5 rseq)

If the reduction step is complex, you might get a gain by subdividing the list like this:

parReduce f = foldl' f mempty . reducedList . chunkList . withStrategy (parList rseq)
 where
  chunkList list = let (l,ls) = splitAt 1000 list in l : chunkList ls
  reducedList = parMap rseq (foldl' f mempty)

I've taken the liberty of assuming your data is a Monoid for mempty, if this isn't possible you can either replace mempty with your own empty type, or worse case use foldl1'.

There are two operators from Control.Parallel.Strategies in use here. The parList evaluates all items of the list in parallel. After that, the chunkList divides the list into chunks of 1000 elements. Each of those chunks is then reduced in parallel by the parMap.

You might also try

parReduce2 f = foldl' f mempty . reducedList . chunkList
 where
  chunkList list = let (l,ls) = splitAt 1000 list in l : chunkList ls
  reducedList = parMap rseq (foldl' f mempty)

Depending on exactly how the work is distributed, one of these may be more efficient than the others.

If you can use a data structure that has good support for indexing though (Array, Vector, Map, etc.), then you can do binary subdivisions for the reduction step, which will probably be better overall.

John 2010-10-26 22:32:01

Thanks, John. I like the idea of using foldl' over chunks. But after each chunk is reduced, the outer foldl' is sequential, and its input could be very large. What's the best way to express the recursion? The input may or may not be a list, but this should be expressible using strategies.

Chad Scherrer 2010-10-27 18:11:19

The `parMap` function in `reducedList` will evaluate all of the chunks in parallel. But if your input is so large that you don't want to load it all in memory at once, then you can use laziness and parBuffer. I've had very good success with `parBuffer` because it lets you exploit parallelism and laziness. I think it will work if you use `reducedList = withStrategy (parBuffer 10 rseq) . map (foldl' f mempty)`. I think this is better than recursion for Lists because you avoid multiple traversals.

John 2010-10-27 19:01:32

ansaurus

tags:

views:

answers:

How do I write a parallel reduction using strategies in Haskell?

related questions