views:

562

answers:

4

Disclaimer: I'm working on Euler Problem 9.

I'm adding up some pretty large numbers, all the primes from 1 to 2 000 000.

Summing those primes takes forever. I'm using the haskell built in function 'sum'.

as in:

sum listOfPrimes

Are there any other faster options?

--My prime generator was the slow link in my code.

+10  A: 

It sounds like your problem is not summing the numbers, but generating them. What is your implementation of listOfPrimes?

This paper may be of interest: http://lambda-the-ultimate.org/node/3127

Edward Z. Yang
I concur! That one helped me out tremendously!
DoxaLogos
+1  A: 

I wrote a "Sieve of Eratosthenes" here:

import Data.List
import qualified Data.Map as M
primes :: [Integer]
primes = mkPrimes 2 M.empty
  where
    mkPrimes n m = case (M.null m, M.findMin m) of
        (False, (n', skips)) | n == n' ->
            mkPrimes (succ n) (addSkips n (M.deleteMin m) skips)
        _ -> n : mkPrimes (succ n) (addSkip n m n)
    addSkip n m s = M.alter (Just . maybe [s] (s:)) (n+s) m
    addSkips = foldl' . addSkip

Using this, it takes about 25s to print . sum $ takeWhile (<= 20000000) on my desktop. Could be better? Sure, it takes J under 1 second to run

   +/p:i.p:^:_1]20000000
12272577818052

but it has a pretty optimized prime number generator.

ephemient
+5  A: 

I'm hoping you are using ghc -O2 and not ghci, right? Your problem will be in the generation, not the summation.

One faster way is to use stream fusion-based sequences, which optimize better. With regular lists:

import Data.List
import qualified Data.Map as M

primes :: [Integer]
primes = mkPrimes 2 M.empty
  where
    mkPrimes n m = case (M.null m, M.findMin m) of
        (False, (n', skips)) | n == n' ->
            mkPrimes (succ n) (addSkips n (M.deleteMin m) skips)
        _ -> n : mkPrimes (succ n) (addSkip n m n)
    addSkip n m s = M.alter (Just . maybe [s] (s:)) (n+s) m
    addSkips = foldl' . addSkip

-- fuse:
main = print (sum (takeWhile (<= 2000000) primes))

We get,

$ ghc -O2 --make A.hs
$ time ./A           
142913828922
./A  9.99s user 0.17s system 99% cpu 10.166 total

Switching to streams, so sum . takeWhile fuses:

import qualified Data.List.Stream as S
main = print (S.sum (S.takeWhile (<= 2000000) primes))

Saves some small time,

$ time ./A           
142913828922
./A  9.60s user 0.13s system 99% cpu 9.795 total

But your problem will be prime generation, as we can see if we discard the summation altogether, replacing sum with last:

$ time ./A           
1999993
./A  9.65s user 0.12s system 99% cpu 9.768 total

So find a better prime generator. :-)

Finally, there's a library on Hackage for fast prime generators:

http://hackage.haskell.org/packages/archive/primes/0.1.1/doc/html/Data-Numbers-Primes.html

Using it, our time becomes:

$ cabal install primes
$ cabal install stream-fusion

$ cat A.hs
import qualified Data.List.Stream as S
import Data.Numbers.Primes

main = print . S.sum . S.takeWhile (<= 2000000) $ primes

$ ghc -O2 -fvia-C -optc-O3 A.hs --make

$ time ./A
142913828922
./A  0.62s user 0.07s system 99% cpu 0.694 total
Don Stewart
Since you didn't link to it, http://www.cse.unsw.edu.au/~dons/streams.html | Pretty cool stuff, I didn't realize it was nicely packaged up on Hackage.
ephemient
+3  A: 

The slow part of your function is for sure the generating of the primes, not the sum function. A nice way to generate primes would be:

isprime :: (Integral i) => i -> Bool
isprime n = isprime_ n primes
  where isprime_ n (p:ps)
          | p*p > n        = True
          | n `mod` p == 0 = False
          | otherwise      = isprime_ n ps

primes :: (Integral i) => [i]
primes = 2 : filter isprime [3,5..]

I think it's very readable, though maybe a bit surprising that it works at all because it uses recursion and laziness of the primes list. It's also rather fast, though one could do further optimizations at the expense of readability.

sth
Hey, I haven't seen this particular algorithm in Haskell before. Pretty cool, it's simpler and and feels more Haskell-y than mine.
ephemient
This is actually equivalent to the inefficient algorithm. It still tests each number against each prime.
newacct
@newacc: It uses the "inefficient" algorithm, but it's *faster* then the "efficient" version, at least in this case. The "efficient" algorithm needs to do lots of map modifications and constantly iterates over bigger and bigger maps. That's not necessarily better that comparing against a few primes.
sth
Exactly. It feels just as simple and clean as the traditional `primes = sieve [2..] where sieve (p:xs) = p:[x | x <- xs, mod x p /= 0]` -- in fact, it's equivalent -- but operates more efficiently.
ephemient
PS @newacct: Sorry I misspelled your name...
sth