I am trying to write a simple sieve function to calculate prime numbers in Clojure. I've seen this question about writing an efficient sieve function, but I am not at that point yet. Right now I am just trying to write a very simple (and slow) sieve. Here is what I have come up with:

(defn sieve [potentials primes]
  (if-let [p (first potentials)]
    (recur (filter #(not= (mod % p) 0) potentials) (conj primes p))
    primes))

For small ranges it works fine, but causes a stack overflow for large ranges:

user=> (sieve (range 2 30) [])
[2 3 5 7 11 13 17 19 23 29]
user=> (sieve (range 2 15000) [])
java.lang.StackOverflowError (NO_SOURCE_FILE:0)

I thought that by using recur this would be a non-stack-consuming looping construct? What am I missing?

+10  A: 

You're being hit by filter's laziness. Change (filter ...) to (doall (filter ...)) in your recur form and the problem should go away.

A more in-depth explanation:

The call to filter returns a lazy seq, which materialises actual elements of the filtered seq only as they are required. As written, your code stacks filter upon filter upon filter..., adding one more level of filtering at each iteration. Realising an element then has to pass through every one of those nested layers, and once there are enough of them the call stack overflows. The solution is to force the whole result at each iteration with doall, so that the next iteration does its filtering on a fully realised seq and returns a fully realised seq instead of adding an extra layer of lazy seq processing.
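For reference, here is the function from the question with that single change applied. This is only a sketch of the fix described above; it is still the simple, slow sieve, not an efficient one.

```clojure
(defn sieve [potentials primes]
  (if-let [p (first potentials)]
    ;; doall realises the filtered seq immediately, so each iteration
    ;; receives a fully realised seq instead of one more nested lazy filter.
    (recur (doall (filter #(not= (mod % p) 0) potentials))
           (conj primes p))
    primes))

(sieve (range 2 30) [])
;; => [2 3 5 7 11 13 17 19 23 29]
```

With this version, (sieve (range 2 15000) []) should no longer build up a chain of nested lazy seqs, since every pass works on an already realised seq.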

Michał Marczyk
Thanks! This fixed my problem. Excellent explanation.
dbyrne
Any thoughts on how to find this out? Maybe something like macroexpand?
edbond
Have a look at the stack trace, I'd say. A pile of `clojure.lang.LazySeq` method calls would be a good indication that the problem is laziness-related.
Michał Marczyk
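One rough way to act on that stack trace suggestion from the REPL is to count the LazySeq frames in the caught error. The helper name lazy-seq-frames and the name lazy-sieve (the original function from the question, renamed here) are made up for this illustration. Whether the call actually overflows depends on the JVM stack size and Clojure version, so treat this as a sketch, not a guaranteed reproduction.

```clojure
(defn lazy-seq-frames
  "Counts the stack frames of throwable t that belong to clojure.lang.LazySeq."
  [^Throwable t]
  (->> (.getStackTrace t)
       (filter #(= "clojure.lang.LazySeq"
                   (.getClassName ^StackTraceElement %)))
       count))

;; The original, fully lazy sieve from the question (hypothetical name).
(defn lazy-sieve [potentials primes]
  (if-let [p (first potentials)]
    (recur (filter #(not= (mod % p) 0) potentials) (conj primes p))
    primes))

(comment
  ;; Evaluate at the REPL. If the stack overflows, this returns the number
  ;; of LazySeq frames in the trace; otherwise it returns the primes.
  (try
    (lazy-sieve (range 2 15000) [])
    (catch StackOverflowError e
      (lazy-seq-frames e))))
```

A large count relative to the total number of frames points at nested lazy seqs as the cause. Note that the JVM may truncate very deep stack traces, so the count is only an indication.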