views:

119

answers:

3

I have a long, lazy sequence that I want to reduce and test lazily. As soon as two sequential elements are not = (or some other predicate) to each other, I want to stop consuming the list, which is expensive to produce. Yes, this sounds like take-while, but read further.

I wanted to write something simple and elegant like this (pretending for a minute that every? works like reduce):

(every? = (range 100000000))

But that does not work lazily and so it hangs on infinite seqs. I discovered that this works almost as I wanted:

(apply = (range 100000000))

However, I noticed that sequence chunking was resulting in extra, unnecessary elements being created and tested. At least, this is what I think this is what happening in the following bit of code:

;; Displays chunking behavior in groups of four on my system and prints 1 2 3 4
(apply = (map #(do (println %) %) (iterate inc 1)))

;; This prints 0 to 31
(apply = (map #(do (println %) %) (range)))

I found a workaround using take-while, and count to check the number of elements taken, but that is rather cumbersome.

Should I politely suggest to Rich Hickey that he make some combination of reduce and every? short circuit properly, or am I missing some obvious way that already exists?

EDIT: Two kind people posted solutions for avoiding chunking on the lazy sequences, but how do I avoid chunking when doing the apply, which seems to be consuming in chunked groups of four?

EDIT #2: As Stuart Sierra notes and I discovered independently, this isn't actually chunking. It's just apply acting normally, so I'll call this closed and give him the answer. I included a small function in a separate answer to do the reduce'ing part of the problem, for those who are interested.

+4  A: 

Look at this blog post for a potential solution: http://blog.fogus.me/2010/01/22/de-chunkifying-sequences-in-clojure/

fogus
form the article above: "Clojure will one day provide an official version of the function above," WHEN WHEN WHEN!!!!????
Arthur Ulfeldt
You're asking the wrong fella.
fogus
The seq1 didn't work on Clojure 1.2 without some modification, and even when I got it working I still see the 1 to 4 chunking behavior on the (apply = ..), as with Stuart's solution below. Any other ideas?
ivar
+5  A: 

CORRECTED TWICE: A simpler way to un-chunk a lazy sequence:

(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))

First version omitted (when ... so it returned an infinite seq of nil's after the input sequence ended.

Second version used first instead of seq so it stopped on nil.

RE: your other question, "how do I avoid chunking when doing the apply, which seems to be consuming in chunked groups of four":

This is due to the definition of =, which, when given a sequence of arguments, forces the first 4:

(defn =
  ;; ... other arities ...
  ([x y & more]
   (if (= x y)
 (if (next more)
   (recur y (first more) (next more))
   (= y (first more)))
 false)))
Stuart Sierra
That is better, but I'm still getting chunking (albeit only 4) when I do this:(apply = (map #(do (println %) %) (unchunk (range))))
ivar
This is not chunking. Compare `(first (map #(doto % println) (unchunk (range))))` to `(first (map #(doto % println) (range)))`. This is something with `apply` and/or `=`. Also note: you might want to use `(defn unchunk [s] (when-let [s (seq s)] (cons (first s) (unchunk (rest s)))))`, see `(unchunk [1 2 3])`.
kotarak
kotorak- Fixed the missing `when`. Now your examples work correctly.
Stuart Sierra
Thanks to @chrishouser for correcting my mistakes.
Stuart Sierra
A: 

Looking in clojure.core at the definition of apply makes it obvious as to why it was chunking in groups of four when apply is used with an infinite sequence. Reduce doesn't short circuit, either...so I'm left writing my own solution:

(defn reducep
  "Like reduce, but for use with a predicate. Short-circuits on first false."
  ([p coll]
     (if-let [s (seq coll)]
       (reducep p (first s) (next s))
       (p)))
  ([p val coll]
     (if-let [s (seq coll)]
       (if-let [v (p val (first s))]
         (recur p (first s) (next s))
         false)
       true)))

Then using Stuart's unchunk (with an extra and)

(defn unchunk [s]
  (lazy-seq
   (cons (first s)
         (and (next s)
              (unchunk (next s))))))

I get:

(reducep = (map #(do (print %) %) (unchunk (range)))) ;; Prints 01, returns false
(reducep = (map #(do (print %) %) (repeat 20 1))) ;; returns true
(reducep = (map #(do (print %) %) (unchunk [0 0 2 4 5]))) ;; Returns false
(reducep = (map #(do (print %) %) (unchunk [2 2 2 2 2]))) ;; returns true

If that works for you too, mod this up.

EDIT: Stuart's modified version of unchunk after his edit is probably preferable to the one in this earlier post.

ivar