views:

358

answers:

4

I'm working through the problems in Project Euler as a way of learning Haskell, and I find that my programs are a lot slower than a comparable C version, even when compiled. What can I do to speed up my Haskell programs?

For example, my brute-force solution to Problem 14 is:

import Data.Int
import Data.Ord
import Data.List

searchTo = 1000000

nextNumber :: Int64 -> Int64
nextNumber n
    | even n    = n `div` 2
    | otherwise = 3 * n + 1

sequenceLength :: Int64 -> Int
sequenceLength 1 = 1
sequenceLength n = 1 + (sequenceLength next)
    where next = nextNumber n

longestSequence = maximumBy (comparing sequenceLength) [1..searchTo]

main = putStrLn $ show $ longestSequence

Which takes around 220 seconds, while an "equivalent" brute-force C version only takes 1.2 seconds.

#include <stdio.h>

int main(int argc, char **argv)
{
    int longest = 0;
    int terms = 0;
    int i;
    unsigned long j;

    for (i = 1; i <= 1000000; i++)
    {
        j = i;
        int this_terms = 1;

        while (j != 1)
        {
            this_terms++;

            if (this_terms > terms)
            {
                terms = this_terms;
                longest = i;
            }

            if (j % 2 == 0)
                j = j / 2;
            else
                j = 3 * j + 1;
        }
    }

    printf("%d\n", longest);
    return 0;
}

What am I doing wrong? Or am I naive to think that Haskell could even approach C's speed?

(I'm compiling the C version with gcc -O2, and the Haskell version with ghc --make -O).

+13  A: 

For testing purpose I have just set searchTo = 100000. The time taken is 7.34s. A few modification leads to some big improvement:

  1. Use an Integer instead of Int64. This improves the time to 1.75s.

  2. Use an accumulator (you don't need sequenceLength to be lazy right?) 1.54s.

    seqLen2 :: Int -> Integer -> Int
    seqLen2 a 1 = a
    seqLen2 a n = seqLen2 (a+1) (nextNumber n)
    
    
    sequenceLength :: Integer -> Int
    sequenceLength = seqLen2 1
    
  3. Rewrite the nextNumber using quotRem, thus avoiding computing the division twice (once in even and once in div). 1.27s.

    nextNumber :: Integer -> Integer
    nextNumber n 
        | r == 0    = q
        | otherwise = 6*q + 4
        where (q,r) = quotRem n 2 
    
  4. Use Schwartzian transform instead of maximumBy. The problem of maximumBy . comparing is that the sequenceLength function is called more than once for each value. 0.32s.

    longestSequence = snd $ maximum [(sequenceLength a, a) | a <- [1..searchTo]]
    

Note:

  • I check the time by compiling with ghc -O and run with +RTS -s)
  • My machine is running on Mac OS X 10.6. The GHC version is 6.12.2. The compiled file is in i386 architecture.)
  • The C problem runs at 0.078s with the corresponding parameter. It is compiled with gcc -O3 -m32.
KennyTM
OK that's really interesting. I assumed (mistakenly obviously) that the arbitrary-sized Integer type would be slower than a 64-bit Int64 type. Also, I assumed tail-call recursion would be optimized to a loop. Do you have any links for these sorts of hints?
stusmith
@stusmith: `1 + (sequenceLength next)` is not really tail recursive because `sequenceLength` is not at top level. For optimization hints, see http://book.realworldhaskell.org/read/profiling-and-optimization.html
KennyTM
@stusmith: if you're on a 64-bit OS using Int64 may be faster, but the `Integer` type is very heavily optimized to use word-sized data when possible. Since that's true most of time in this problem, Integer is the faster choice.
John
@stusmith: This is an example, where Lisp-style prefix notation or Forth-style postfix notation is easier to read than mathematical mixfix notation. In Lisp, the last line of `sequenceLength` would be `(+ 1 (sequenceLength next))`, in Forth it would be `next sequenceLength 1 +`. In both cases, it's easy to see that `+` is in the tail position, not `sequenceLength`, ergo the function is *not* tail recursive. You can even see that in Haskell, if you write everything in prefix (aka function) notation: `sequenceLength n = (+) 1 (sequenceLength next)`
Jörg W Mittag
Excellent answer, thankyou. The lack of tail-recursion now seems obvious! It's a real shame that certain functions need to be avoided for good performance (I still can't see why maximumBy needs to call the comparator function more than once per element). It also seems a shame that everything is allocated on the heap - taking your suggestions I get a timing of just over 10s, ie 10x slower than C - and I suspect that's due to Haskell using the heap as opposed to registers.
stusmith
@stusmith: The comparator function is called once per pair of argument, but `comparing sequenceLength` as that comparator function calls `sequenceLength` twice. Even worse, the time taken for `sequenceLength` is proportional to its output, and you are finding the maximum...
KennyTM
+2  A: 

Haskell's lists are heap-based, whereas your C code is exceedingly tight and makes no heap use at all. You need to refactor to remove the dependency on lists.

DeadMG
+3  A: 

The comparing may be recomputing sequenceLength too much. This is my best version:

type I = Integer
data P = P {-# UNPACK #-} !Int {-# UNPACK #-} !I deriving (Eq,Ord,Show)

searchTo = 1000000

nextNumber :: I -> I
nextNumber n = case quotRem n 2 of
                  (n2,0) -> n2
                  _ -> 3*n+1

sequenceLength :: I -> Int
sequenceLength x = count x 1 where
  count 1 acc = acc
  count n acc = count (nextNumber n) (succ acc)

longestSequence = maximum . map (\i -> P (sequenceLength i) i) $ [1..searchTo]

main = putStrLn $ show $ longestSequence

The answer and timing are slower than C, but it does use arbitrary precision Integer:

ghc -O2 --make euler14-fgij.hs
time ./euler14-fgij
P 525 837799

real 0m3.235s
user 0m3.184s
sys  0m0.015s
Chris Kuklewicz
+2  A: 

Even if I'm a bit late, here is mine, I removed the dependency on lists and this solution uses no heap at all too.

{-# LANGUAGE BangPatterns #-}
-- Compiled with ghc -O2 -fvia-C -optc-O3 -Wall euler.hs
module Main (main) where

searchTo :: Int
searchTo = 1000000

nextNumber :: Int -> Int
nextNumber n = case n `divMod` 2 of
   (k,0) -> k
   _     -> 3*n + 1

sequenceLength :: Int -> Int
sequenceLength n = sl 1 n where
  sl k 1 = k
  sl k x = sl (k + 1) (nextNumber x)

longestSequence :: Int
longestSequence = testValues 1 0 0 where
  testValues number !longest !longestNum
    | number > searchTo     = longestNum
    | otherwise            = testValues (number + 1) longest' longestNum' where
    nlength  = sequenceLength number
    (longest',longestNum') = if nlength > longest
      then (nlength,number)
      else (longest,longestNum)

main :: IO ()
main = print longestSequence

I compiled this piece with ghc -O2 -fvia-C -optc-O3 -Wall euler.hs and it runs in 5 secs, compared to 80 of the beginning implementation. It doesn't uses Integer, but because I'm on a 64-bit machine, the results may be cheated.

The compiler can unbox all Int's in this case, resulting in really fast code. It runs faster than all other solutions I've seen so far, but C is still faster.

FUZxxl