views:

150

answers:

1

Here's my implementation of Sieve of Erathosthene in Clojure (based on SICP lesson on streams):

(defn nats-from [n]
  (iterate inc n))

(defn divide? [p q]
  (zero? (rem q p)))

(defn sieve [stream]
  (lazy-seq (cons (first stream)
            (sieve (remove #(divide? (first stream) %) 
               (rest stream))))))

(def primes (sieve (nats-from 2)))

Now, it's all OK when i take first 100 primes:

(take 100 primes)

But, if i try to take first 1000 primes, program breaks because of stack overflow. I'm wondering if is it possible to change somehow function sieve to become tail-recursive and, still, to preserve "streamnes" of algorithm?

Any help???

+3  A: 

Firstly, this is not the Sieve of Eratosthenes... see my comment for details.

Secondly, apologies for the close vote, as your question is not an actual duplicate of the one I pointed to... My bad.

Explanation of what is happening

The difference lies of course in the fact that you are trying to build an incremental sieve, where the range over which the remove call works is infinite and thus it's impossible to just wrap a doall around it. The solution is to implement one of the "real" incremental SoEs from the paper I seem to link to pretty frequently these days -- Melissa E. O'Neill's The Genuine Sieve of Eratosthenes.

A particularly beatiful Clojure sieve implementation of this sort has been written by Christophe Grand and is available here for the admiration of all who might be interested. Highly recommended reading.

As for the source of the issue, the questions I originally thought yours was a duplicate of contain explanations which should be useful to you: see here and here. Once again, sorry for the rash vote to close.

Why tail recursion won't help

Since the question specifically mentions making the sieving function tail-recursive as a possible solution, I thought I would address that here: functions which transform lazy sequences should not, in general, be tail recursive.

This is quite an important point to keep in mind and one which trips up many an unexperienced Clojure (or Haskell) programmer. The reason is that a tail recursive function of necessity only returns its value once it is "ready" -- at the very end of the computation. (An iterative process can, at the end of any particular iteration, either return a value or continue on to the next iteration.) In constrast, a function which generates a lazy sequence should immediately return a lazy sequence object which encapsulates bits of code which can be asked to produce the head or tail of the sequence whenever that's desired.

Thus the answer to the problem of stacking lazy transformations is not to make anything tail recursive, but to merge the transformations. In this particular case, the best performance can be obtained by using a custom scheme to fuse the filtering operations, based on priority queues or maps (see the aforementioned article for details).

Michał Marczyk
As a side note, the SICP sieve blows up at some point too, for the same reasons, although some Scheme implementations might have a deeper stack by default. That's not altogether unavoidable in and of itself, but with the stream operations implemented as they usually are, it's simpler to switch to a different sieving scheme if you might need to generate a lot of primes.
Michał Marczyk