Update: I just found this documentation page. I wish there were a link to it from the documentation that I'd been using, which seemed to be the definitive API doc. But maybe it's a new, unreleased work.

Update 2: This documentation has given me a much better idea of how to use the Control.Parallel.Strategies module. However, I haven't quite solved the problem... see the end of the question.

I've been trying to use parListChunk or some other parallel control features in Haskell. But I can't figure out how to use them. Warning: I'm a Haskell noob. I learned some things about functional programming with Scheme about 20 years ago (!).

Here's my non-parallelized function:

possibKs n r = [ (k, (hanoiRCountK n k r)) | k <- [1 .. n-1] ]

I want to parallelize it, something like this naive attempt:

possibKs n r 
    | n < parCutoff  = results
    | otherwise      = parListChunk parChunkSize results
    where results = [ (k, (hanoiRCountK n k r)) | k <- [1 .. n-1] ]

But that structure isn't right for parListChunk. The docs say:

parListChunk :: Int -> Strategy a -> Strategy [a]

parListChunk sequentially applies a strategy to chunks (sub-sequences) of a list in parallel. Useful to increase grain size

Good, that's what I want. But how to use it? I haven't been able to find any examples of this. If I'm understanding the type declaration, parListChunk is a function that takes an Int and a Strategy<a> (borrowing C++ parametrized type notation to help check that I really am understanding this right), and returns a Strategy<[a]>. In my case I'm dealing with Int for a so parListChunk will need an Int argument and a Strategy<Int>. So what is a Strategy and how do I produce one? And once I have successfully used parListChunk, what do I do with the Strategy it spits out?

The Strategy type is defined like this:

type Strategy a = a -> Done

(And that is all the documentation for Strategy.) So a Strategy<Int> is a function that takes a parameter of type Int and returns Done. Apparently it causes its argument to get evaluated at a certain time or something. Where do I get one, and what kind should I use?

The following functions appear to return Strategies:

sPar :: a -> Strategy b
sSeq :: a -> Strategy b
r0 :: Strategy a
rwhnf :: Strategy a

But none of them let you determine the type parameter -- they produce a Strategy<b> when you give parameter a, or else you don't get to supply parameter a! What's up with that?? Beyond that, I have no idea what these mean.
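On the type-parameter puzzle: a value such as r0 :: Strategy a is polymorphic, so the caller picks a through inference or an explicit annotation. The sketch below is only an illustration and assumes version 3.x of the parallel package, where the weak-head-normal-form strategy is called rseq (the 2.x name, per the answer further down, is rwhnf). The names intStrategy, pairStrategy, lazyStrategy and evaluatedInt are made up for this example.

```haskell
import Control.Parallel.Strategies (Strategy, r0, rseq, using)

-- rseq is polymorphic; a type annotation pins it to a concrete element type.
intStrategy :: Strategy Int
intStrategy = rseq

-- The same rseq, specialised to the pair type used in the question.
pairStrategy :: Strategy (Int, Integer)
pairStrategy = rseq

-- r0 performs no evaluation at all, but it is specialised the same way.
lazyStrategy :: Strategy [Int]
lazyStrategy = r0

-- `using` applies a strategy to a value and returns that same value.
evaluatedInt :: Int
evaluatedInt = (20 + 22) `using` intStrategy
```

In practice the annotations are rarely needed: writing xs `using` rseq lets the compiler infer the type from xs.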

I did find one example of the similar function parList being used on SO:

return . maximum $ map optimize xs `using` parList

It uses this funky using function, which is declared:

using :: a -> Strategy a -> a

Fair enough... in my case I probably want a to be [Int], so it takes a list of Ints and a Strategy<[Int]> and (does something? applies the strategy to the list? and) returns a list of Ints. So I tried to follow the parList example and changed my otherwise guard to:

| otherwise      = results `using` parListChunk parChunkSize

but I must admit I'm still shooting in the dark... I can't quite follow the type signatures. So it's not too surprising that the above gives an error:

Couldn't match expected type `[(Int, Integer)]'
       against inferred type `a -> Eval a'
Probable cause: `parListChunk' is applied to too few arguments
In the second argument of `using', namely
    `parListChunk parChunkSize'
In the expression: results `using` parListChunk parChunkSize

Can someone tell me what to use for the Strategy a argument to parListChunk, and how to use the Strategy [a] returned by parListChunk?
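For reference, here is one way the types could line up, written out step by step. This is a sketch under assumptions, not a definitive answer: it assumes parallel 3.x (hence rseq), and the chunk size of 8 and the names elementStrategy, listStrategy and evaluate are invented for illustration. The element type is the (Int, Integer) pair from the error message above.

```haskell
import Control.Parallel.Strategies (Strategy, parListChunk, rseq, using)

-- Step 1: a strategy for one list element (here a pair, as in the error message).
elementStrategy :: Strategy (Int, Integer)
elementStrategy = rseq

-- Step 2: parListChunk takes a chunk size AND an element strategy,
-- and only then yields a strategy for the whole list.
listStrategy :: Strategy [(Int, Integer)]
listStrategy = parListChunk 8 elementStrategy

-- Step 3: `using` applies the list strategy and returns the same list.
evaluate :: [(Int, Integer)] -> [(Int, Integer)]
evaluate results = results `using` listStrategy
```

Seen this way, the "applied to too few arguments" hint in the error is pointing at step 2: parListChunk parChunkSize on its own is still a function waiting for the element strategy.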

New part

Looking at Basic Strategies, I think I need to use the rseq strategy. Probably. So I try

| otherwise      = results `using` (parListChunk parChunkSize rseq)

But GHC says rseq is "not in scope". These API docs say there is no rseq in the package but sSeq seems to have replaced it. OK, so I used sSeq, but it's "not in scope" either. Even though I'm importing Control.Parallel.Strategies.

Any clues? Btw I used to get these messages about loading packages:

Loading package deepseq-1.1.0.0 ... linking ... done.
Loading package parallel-2.2.0.1 ... linking ... done.

So apparently that tells me which version of the parallel package I have: 2.2.0.1. But I don't see information in the API docs about what version is described there. If I shouldn't use rseq or sSeq, what should I use? How come Edward was able to use parList?

A:

OK, I got the code working. I got it to compile by using rwhnf instead of rseq:

| otherwise      = results `using` (parListChunk parChunkSize rwhnf)

According to this source code, rwhnf was renamed to rseq in version 3. So I guess my version of the parallel package is older than the one this documentation describes. :-S

I guess that's part of the price of using "experimental" packages.

Anyway, the code compiles and runs. Whether it's doing anything useful with parallelism is another question...
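One thing worth checking on that last point: rwhnf (rseq in 3.x) evaluates each element only to weak head normal form. For a pair, that is just the tuple constructor, so the hanoiRCountK call inside it can remain an unevaluated thunk and the parallel sparks may end up doing very little. A possible variation, assuming parallel 3.x and its rdeepseq strategy, is sketched below. costFor is a hypothetical stand-in for hanoiRCountK (whose definition isn't shown here), and the parCutoff and parChunkSize values are arbitrary.

```haskell
import Control.Parallel.Strategies (parListChunk, rdeepseq, using)

-- Hypothetical stand-in for hanoiRCountK; any reasonably expensive function will do.
costFor :: Int -> Int -> Int -> Integer
costFor n k r = sum [fromIntegral (i * k + r) | i <- [1 .. n * 1000]]

-- Arbitrary illustrative values; tune them for the real workload.
parCutoff, parChunkSize :: Int
parCutoff = 10
parChunkSize = 4

possibKs :: Int -> Int -> [(Int, Integer)]
possibKs n r
  | n < parCutoff = results
    -- rdeepseq fully evaluates each pair, including the Integer inside it.
  | otherwise     = results `using` parListChunk parChunkSize rdeepseq
  where
    results = [(k, costFor n k r) | k <- [1 .. n - 1]]
```

Whether this actually speeds anything up depends on how expensive each element is relative to the chunk size, so it is worth measuring rather than assuming.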

LarsH
Make sure you run the executable with +RTS -N to use multiple cores. If you enable +RTS -s to show statistics, you can see how efficiently work is being spread across cores. For finer-grained tuning, use ThreadScope (http://research.microsoft.com/en-us/projects/threadscope/).
John
Thanks, @John. In WinGHCi, I have "ghc --interactive -threaded" in the GHCi Startup options. I guess that's not enough? Can I even run multithreaded programs inside WinGHCi?
LarsH
@LarsH, IIRC ghci always uses the threaded runtime (i.e. -threaded is implicit). Just specify the runtime options you'd like in the startup options and you should be fine. If you don't use -N, the default is to stay on one core even with the threaded runtime.
John
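A quick way to check whether the -N setting discussed above has taken effect is to ask the runtime how many capabilities it has. This is a small sketch using getNumCapabilities from Control.Concurrent in base; the helper name capabilityReport is made up for this example.

```haskell
import Control.Concurrent (getNumCapabilities)

-- Report how many capabilities (roughly, OS threads that can run
-- Haskell code at the same time) the runtime currently has.
capabilityReport :: IO String
capabilityReport = do
  n <- getNumCapabilities
  return ("RTS capabilities: " ++ show n)
```

Calling capabilityReport >>= putStrLn from GHCi or from main shows the current count. For a compiled program, building with ghc -threaded -rtsopts and running with +RTS -N -s should report more than one capability on a multicore machine; if it reports 1, the parallel strategies have only one core to work with.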
Ha... I earned "Scholar" badge for accepting my own answer.
LarsH