OK, so I'm doing Project Euler Problem #14, and I'm fiddling around with optimizations in order to get a feel for F#.

In the following code:

let evenrule n = n / 2L
let oddrule n = 3L * n + 1L

let applyRule n =
    if n % 2L = 0L then evenrule n
    else oddrule n

let runRules n =
    let rec loop a final =
        if a = 1L then final
        else loop (applyRule a) (final + 1L)
    n, loop (int64 n) 1L


let testlist = seq {for i in 3 .. 2 .. 1000000 do yield i } 

let getAns sq = sq |> Seq.head

let seqfil (a,acc) (b,curr) = if acc = curr then (a,acc) else if acc < curr then (b,curr) else (a,acc)

let pmap f l =
    seq { for a in l do yield async {return f a} }
    |> Seq.map Async.RunSynchronously   // runs each async to completion, one at a time

let pmap2 f l =
    seq { for a in l do yield async {return f a} }
    |> Async.Parallel                   // one async task per element
    |> Async.RunSynchronously

let procseq f l = l
                  |> f runRules
                  |> Seq.reduce seqfil
                  |> fst

let timer = System.Diagnostics.Stopwatch()
timer.Start()
let ans1 = testlist |> procseq Seq.map // 837799    00:00:08.6251990
printfn "%A\t%A" ans1 timer.Elapsed
timer.Reset()

timer.Start()
let ans2 = testlist |> procseq pmap
printfn "%A\t%A" ans2 timer.Elapsed // 837799   00:00:12.3010250
timer.Reset()

timer.Start()
let ans3 = testlist |> procseq pmap2
printfn "%A\t%A" ans3 timer.Elapsed // 837799   00:00:58.2413990
timer.Reset()

Why does the Async.Parallel code run REALLY slowly in comparison to the straight-up map? I know I shouldn't see that much of an effect, since I'm only on a dual-core Mac.

Please note that I do NOT want help solving Problem #14; I just want to know what's up with my parallel code.

+4  A: 

The use of Async.Parallel seems to be correct. The numbers really do look suspicious, but I don't immediately see what the problem might be here.

In any case, asynchronous workflows are really more suitable for computations that involve some asynchronous operation (such as I/O, communication, waiting for events, etc.). For CPU-intensive tasks, it is better to use Parallel Extensions for .NET (which are now a part of .NET 4.0; unfortunately there is no .NET 2.0 version).

To do that from F#, you'll need F# PowerPack and the FSharp.PowerPack.Parallel.Seq.dll assembly, which contains parallel versions of higher-order functions for working with sequences (such as map :-))

These functions return a value of type pseq<'a> (called ParallelQuery<T> in C#), which represents a delayed computation running in parallel (this enables better optimizations when you use multiple operations in a pipeline). There is also a PSeq.reduce function, so you may want to use that in your processing too (aside from PSeq.map).
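For illustration, here is a rough sketch (mine, not part of the original answer) of how the question's pipeline could look with PSeq, reusing runRules and seqfil from above. I'm assuming the PSeq module from FSharp.PowerPack.Parallel.Seq.dll lives under Microsoft.FSharp.Collections; adjust the open if your version puts it elsewhere.

open Microsoft.FSharp.Collections

let ansPSeq =
    testlist
    |> PSeq.map runRules    // parallel map over the candidate starting numbers
    |> PSeq.reduce seqfil   // reduce to the number with the longest chain
    |> fst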

Tomas Petricek
Curses! This limits experiments with Parallel Extensions to my PC. Hurry up, mono! Get us up to 4.0!
JBristow
It looks like Mono folks are certainly interested in Parallel Extensions http://tirania.org/blog/archive/2008/Jul-26-1.html, but I'm not sure about the current status... I wrote an implementation of `PSeq.map` and `PSeq.filter` myself some time ago (http://tomasp.net/blog/fsparallelops.aspx), but the performance is probably not the best...
Tomas Petricek
There is a version of the Parallel Extensions for .NET 2.0. It ships as part of the Reactive Extensions.
forki23
+1  A: 

On my box, it takes about half a second to run the non-async one. Since there are about half a million work items here, that means each one runs in about 1 microsecond on average.

These items are too small to be worth parallelizing individually. The overhead of thread scheduling and synchronization will dominate the actual work. You need to choose a better granularity.

(I'll work on some quick code...)

EDIT:

OK, the code as-is isn't easy to rework to a different batch granularity without some major rewriting, so I don't have new code to share that doesn't give too much away. But I was able to make it run nearly twice as fast on my 2-core box by making batches of 50,000 elements each, doing the "map reduce" on each batch, and doing a "parallel-map reduce" over the 10 or so batches.
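To make the granularity idea concrete, here is a rough sketch of that batching scheme (my sketch, not Brian's code), reusing testlist, runRules, and seqfil from the question. Seq.chunkBySize comes from a newer FSharp.Core than was current at the time, so treat the exact helpers as an assumption.

let batchedAns =
    testlist
    |> Seq.chunkBySize 50000             // ~10 coarse batches instead of ~500,000 tiny tasks
    |> Seq.map (fun batch ->
        async { return batch |> Seq.map runRules |> Seq.reduce seqfil })
    |> Async.Parallel                    // one async per batch
    |> Async.RunSynchronously
    |> Array.reduce seqfil               // combine the per-batch winners
    |> fst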

See also

http://blogs.msdn.com/pfxteam/archive/2008/08/12/8849984.aspx

especially the "task granularity" section.

Brian
This article helped open my eyes a little more. I encourage everyone else to read it too!
JBristow
A: 

I just want to know what's up with my parallel code

Heh, what isn't wrong with your parallel code. ;-)

Here's how I'd solve the problem:

  // A memoized Collatz solver: a.[i] caches the chain-step count for starting value i.
  // 'inside' handles values that fit inside the cache array; 'outside' handles
  // values that have grown beyond it and so cannot be cached.
  let rec inside (a : _ array) n =
    if n <= 1L || a.[int n] > 0s then a.[int n] else
      let p =
        if n &&& 1L = 0L then inside a (n >>> 1) else
          let n = 3L*n + 1L
          if n < int64 a.Length then inside a n else outside a n
      a.[int n] <- 1s + p
      1s + p
  and outside (a : _ array) n =
    let n = if n &&& 1L = 0L then n >>> 1 else 3L*n + 1L
    1s + if n < int64 a.Length then inside a n else outside a n

  let euler14 n =
    let a = Array.create (n+1) 0s
    // Fill the cache in parallel. Two threads may race to compute the same entry,
    // but they always write the same value, so the race is benign.
    let a = Array.Parallel.init (n+1) (fun n -> inside a (int64 n))
    let i = Array.findIndex (Array.reduce max a |> (=)) a
    i, a.[i]

This program uses speculative parallelism with a safe race condition and achieves a modest 2× speedup on my 8 cores.
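For reference, a minimal usage sketch (my addition, not part of the answer): searching starting values up to one million should report 837799 as the starting number, matching the result printed in the question.

  printfn "%A" (euler14 1000000)   // first component of the pair is the starting number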

Jon Harrop