ansaurus

Question

How to improve the performance of this Haskell program?

Answer 1

+13 A:

For testing purposes I set searchTo = 100000. The time taken is 7.34s. A few modifications lead to big improvements:

Use an Integer instead of Int64. This improves the time to 1.75s.

Use an accumulator (you don't need sequenceLength to be lazy, right?). 1.54s.

seqLen2 :: Int -> Integer -> Int
seqLen2 a 1 = a
seqLen2 a n = seqLen2 (a+1) (nextNumber n)


sequenceLength :: Integer -> Int
sequenceLength = seqLen2 1

Rewrite the nextNumber using quotRem, thus avoiding computing the division twice (once in even and once in div). 1.27s.

nextNumber :: Integer -> Integer
nextNumber n 
    | r == 0    = q
    | otherwise = 6*q + 4
    where (q,r) = quotRem n 2

Use a Schwartzian transform instead of maximumBy. The problem with maximumBy . comparing is that the sequenceLength function is called more than once for each value. 0.32s.
```
longestSequence = snd $ maximum [(sequenceLength a, a) | a <- [1..searchTo]]
```

Note:

I measured the time by compiling with ghc -O and running with +RTS -s.
My machine runs Mac OS X 10.6. The GHC version is 6.12.2. The compiled binary targets the i386 architecture.
The C program runs in 0.078s with the corresponding parameter. It is compiled with gcc -O3 -m32.

KennyTM 2010-09-05 14:56:48

OK that's really interesting. I assumed (mistakenly obviously) that the arbitrary-sized Integer type would be slower than a 64-bit Int64 type. Also, I assumed tail-call recursion would be optimized to a loop. Do you have any links for these sorts of hints?

stusmith 2010-09-05 15:04:58

@stusmith: `1 + (sequenceLength next)` is not really tail recursive because `sequenceLength` is not at top level. For optimization hints, see http://book.realworldhaskell.org/read/profiling-and-optimization.html

KennyTM 2010-09-05 15:20:46

@stusmith: if you're on a 64-bit OS, using Int64 may be faster, but the `Integer` type is very heavily optimized to use word-sized data when possible. Since that's true most of the time in this problem, Integer is the faster choice.

John 2010-09-05 15:31:48

@stusmith: This is an example where Lisp-style prefix notation or Forth-style postfix notation is easier to read than mathematical mixfix notation. In Lisp, the last line of `sequenceLength` would be `(+ 1 (sequenceLength next))`; in Forth it would be `next sequenceLength 1 +`. In both cases, it's easy to see that `+` is in the tail position, not `sequenceLength`, ergo the function is *not* tail recursive. You can even see that in Haskell if you write everything in prefix (aka function) notation: `sequenceLength n = (+) 1 (sequenceLength next)`

Jörg W Mittag 2010-09-05 15:47:03
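To make the tail-position point from the comments above concrete, here is a minimal sketch that puts the two shapes side by side. The names `lengthNaive` and `lengthAcc` are made up for this illustration, and `nextNumber` is written in the plain even/div style rather than copied from the question.

```haskell
-- Collatz step, written the straightforward way.
nextNumber :: Integer -> Integer
nextNumber n
    | even n    = n `div` 2
    | otherwise = 3 * n + 1

-- Not tail recursive: the outermost call on the right-hand side is (+),
-- so every step leaves a pending addition behind.
lengthNaive :: Integer -> Int
lengthNaive 1 = 1
lengthNaive n = 1 + lengthNaive (nextNumber n)

-- Tail recursive: the recursive call to go is the outermost expression.
-- seq forces the counter so it does not build up as a chain of thunks.
lengthAcc :: Integer -> Int
lengthAcc = go 1
  where
    go :: Int -> Integer -> Int
    go acc 1 = acc
    go acc n = acc `seq` go (acc + 1) (nextNumber n)
```

Both functions count the terms of the sequence including the starting value and the final 1, so they should agree on every positive input; only the shape of the recursion differs.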

Excellent answer, thank you. The lack of tail recursion now seems obvious! It's a real shame that certain functions need to be avoided for good performance (I still can't see why maximumBy needs to call the comparator function more than once per element). It also seems a shame that everything is allocated on the heap: taking your suggestions I get a timing of just over 10s, i.e. 10x slower than C, and I suspect that's due to Haskell using the heap as opposed to registers.

stusmith 2010-09-05 16:27:29

@stusmith: The comparator function is called once per pair of arguments, but `comparing sequenceLength` as that comparator function calls `sequenceLength` twice. Even worse, the time taken for `sequenceLength` is proportional to its output, and you are finding the maximum...

KennyTM 2010-09-05 16:46:05
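The recomputation described in the comment above follows from how `comparing` is defined in Data.Ord: `comparing f x y = compare (f x) (f y)`. A minimal sketch of the two approaches follows; `collatzLength`, `argMaxRecompute` and `argMaxDecorated` are hypothetical names introduced only for this example.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Hypothetical stand-in for an expensive key function: the number of
-- terms in the Collatz sequence starting at n.
collatzLength :: Int -> Int
collatzLength = go 1
  where
    go acc 1 = acc
    go acc n
        | even n    = go (acc + 1) (n `div` 2)
        | otherwise = go (acc + 1) (3 * n + 1)

-- Each comparison evaluates collatzLength on both of its arguments,
-- so the key of the running maximum is recomputed again and again.
argMaxRecompute :: [Int] -> Int
argMaxRecompute = maximumBy (comparing collatzLength)

-- Decorate each element with its key once, take the maximum of the
-- (key, element) pairs, then drop the key.
argMaxDecorated :: [Int] -> Int
argMaxDecorated xs = snd (maximum [(collatzLength x, x) | x <- xs])
```

On an ascending list such as `[1..1000]` the two functions should pick the same element. On other orderings they can differ when several elements share the maximal key, because the pair comparison breaks ties by the element itself.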

Answer 2

+2 A:

Haskell's lists are heap-based, whereas your C code is exceedingly tight and makes no heap use at all. You need to refactor to remove the dependency on lists.

DeadMG 2010-09-05 15:04:28

Answer 3

+3 A:

The comparing may be recomputing sequenceLength too much. This is my best version:

type I = Integer
data P = P {-# UNPACK #-} !Int {-# UNPACK #-} !I deriving (Eq,Ord,Show)

searchTo = 1000000

nextNumber :: I -> I
nextNumber n = case quotRem n 2 of
                  (n2,0) -> n2
                  _ -> 3*n+1

sequenceLength :: I -> Int
sequenceLength x = count x 1 where
  count 1 acc = acc
  count n acc = count (nextNumber n) (succ acc)

longestSequence = maximum . map (\i -> P (sequenceLength i) i) $ [1..searchTo]

main = putStrLn $ show $ longestSequence

The answer and timing are below. It is slower than C, but it does use arbitrary-precision Integer:

ghc -O2 --make euler14-fgij.hs
time ./euler14-fgij
P 525 837799

real 0m3.235s
user 0m3.184s
sys  0m0.015s

Chris Kuklewicz 2010-09-05 16:00:15

Answer 4

+2 A:

Even if I'm a bit late, here is mine. I removed the dependency on lists, and this solution uses no heap at all either.

{-# LANGUAGE BangPatterns #-}
-- Compiled with ghc -O2 -fvia-C -optc-O3 -Wall euler.hs
module Main (main) where

searchTo :: Int
searchTo = 1000000

nextNumber :: Int -> Int
nextNumber n = case n `divMod` 2 of
   (k,0) -> k
   _     -> 3*n + 1

sequenceLength :: Int -> Int
sequenceLength n = sl 1 n where
  sl k 1 = k
  sl k x = sl (k + 1) (nextNumber x)

longestSequence :: Int
longestSequence = testValues 1 0 0 where
  testValues number !longest !longestNum
    | number > searchTo     = longestNum
    | otherwise            = testValues (number + 1) longest' longestNum' where
    nlength  = sequenceLength number
    (longest',longestNum') = if nlength > longest
      then (nlength,number)
      else (longest,longestNum)

main :: IO ()
main = print longestSequence

I compiled this with ghc -O2 -fvia-C -optc-O3 -Wall euler.hs and it runs in 5 seconds, compared to 80 for the original implementation. It doesn't use Integer, but because I'm on a 64-bit machine, that may count as cheating.

The compiler can unbox all Ints in this case, resulting in really fast code. It runs faster than all the other solutions I've seen so far, but C is still faster.
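Whether a fixed-width Int is safe here depends on how large the intermediate terms get. As a rough sketch, the helper below (the names `peak` and `overflows32` are invented for this illustration) tracks the largest term of a sequence using Integer and compares it against the signed 32-bit limit.

```haskell
import Data.Int (Int32)

-- Largest value reached by the Collatz sequence starting at n,
-- computed with Integer so the check itself cannot overflow.
peak :: Integer -> Integer
peak = go 1
  where
    go best 1 = best
    go best n = go (max best n) (step n)
    step n
        | even n    = n `div` 2
        | otherwise = 3 * n + 1

-- True when some term would not fit in a signed 32-bit integer.
overflows32 :: Integer -> Bool
overflows32 n = peak n > toInteger (maxBound :: Int32)
```

Running something like `filter overflows32 [1..999999]` in GHCi would list the starting values for which a 32-bit Int version could silently wrap around, which is the situation the 64-bit caveat above is about.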

FUZxxl 2010-09-26 07:02:22
