views: 340
answers: 4
I am writing some signal processing software, and I am starting off by writing out a discrete convolution function.

This works fine for the first ten thousand or so values in the lists, but as they get larger (say, 100k), I begin to get StackOverflow errors, of course.

Unfortunately, I am having a lot of trouble converting the imperative convolution algorithm I have into a recursive & lazy version that is actually fast enough to use, and having at least a modicum of elegance would be nice as well.

I am also not 100% sure I have this function completely right yet, so please let me know if I'm missing something or doing something wrong. I think it's correct.

(defn convolve
  "
    Convolves xs with is.

    This is a discrete convolution.

    'xs  :: list of numbers
    'is  :: list of numbers
  "
  [xs is]
  (loop [xs xs finalacc '() acc '()]
    (if (empty? xs)
      (concat finalacc acc)
      (recur (rest xs)
             (if (empty? acc)
               '()
               (concat finalacc [(first acc)]))
             (if (empty? acc)
               (map  #(* (first xs) %) is)
               (vec-add
                (map  #(* (first xs) %) is)
                (rest acc)))))))
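
A quick check with small inputs does seem to match what I compute by hand (using the vec-add utility defined further down):

(convolve [1 2 3] [4 5 6])
;=> (4 13 28 27 18)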

I'd be much obliged for any sort of help: I'm still getting my bearings in Clojure, and making this both elegant and lazy (and/or recursive) would be wonderful.

I'm a little surprised how difficult it is to express in a Lisp an algorithm that is quite easy to express in an imperative language. But perhaps I'm doing it wrong!

EDIT: Just to show how easy it is to express in an imperative language, and to give people the algorithm that works nicely and is easy to read, here is the Python version. Aside from being shorter, more concise, and far easier to reason about, it executes orders of magnitude faster than the Clojure code: even my imperative Clojure code using Java arrays.

from itertools import repeat

def convolve(ns, ms):
    y = [i for i in repeat(0, len(ns)+len(ms)-1)]
    for n in range(len(ns)):
        for m in range(len(ms)):
            y[n+m] = y[n+m] + ns[n]*ms[m]
    return y

Here, on the other hand, is the imperative Clojure code. It also drops the last, not-fully-immersed values from the convolution. So aside from being slow and ugly, it's not even fully functioning. Nor is it functional.

(defn imp-convolve-1
  [xs is]
  (let [ys (into-array Double (repeat (dec (+ (count xs) (count is))) 0.0))
        xs (vec xs)
        is (vec is)]
     (map #(first %)
          (for [i (range (count xs))]
            (for [j (range (count is))]
              (aset ys (+ i j)
                    (+ (* (nth xs i) (nth is j))
                       (nth ys (+ i j)))))))))

This is so disheartening. Please, someone show me I've just missed something obvious.

EDIT 3:

Here's another version I thought up yesterday, showing how I'd like to be able to express it (though the other solutions here are quite elegant; I'm just putting another one out there!)

(defn convolve-2
  [xs is]
  (reduce #(vec-add %1 (pad-l %2 (inc (count %1))))
         (for [x xs]
           (for [i is]
             (* x i)))))

It uses this utility function vec-add:

(defn vec-add
  ([xs] (vec-add xs []))
  ([xs ys]
     (let [lxs (count xs)
           lys (count ys)
           xs (pad-r xs lys)
           ys (pad-r ys lxs)]
       (vec (map #(+ %1 %2) xs ys))))
  ([xs ys & more]
     (vec (reduce vec-add (vec-add xs ys) more))))
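
(pad-r and pad-l aren't shown here; they just zero-pad a vector on the right and left respectively, roughly like this:)

(defn pad-r
  "Pad xs on the right with zeros up to length n."
  [xs n]
  (vec (concat xs (repeat (max 0 (- n (count xs))) 0))))

(defn pad-l
  "Pad xs on the left with zeros up to length n."
  [xs n]
  (vec (concat (repeat (max 0 (- n (count xs))) 0) xs)))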
+3  A: 

The likely cause of the stack overflow errors is that the lazy thunks are getting too deep. (concat and map are lazy). Try wrapping those calls in doall to force evaluation of their return values.
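
For example, applied to the loop body of your convolve (just a sketch of the idea):

(recur (rest xs)
       (if (empty? acc)
         '()
         (doall (concat finalacc [(first acc)])))
       (if (empty? acc)
         (doall (map #(* (first xs) %) is))
         (vec-add (doall (map #(* (first xs) %) is))
                  (rest acc))))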

As for a more functional solution, try something like this:

(defn circular-convolve
  "Perform a circular convolution of vectors f and g"
  [f g]
  (letfn [(point-mul [m n]
            (* (f m) (g (mod (- n m) (count g)))))
          (value-at [n]
            (reduce + (map #(point-mul % n) (range (count g)))))]
    (map value-at (range (count g)))))

You can use reduce to perform the summation easily, and since map produces a lazy sequence, this function is also lazy.
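
For example, with two equal-length vectors (note this is the circular convolution, so the result has the same length as the inputs):

(circular-convolve [1 2 3] [4 5 6])
;=> (31 31 28)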

Brian
Unfortunately, wrapping my `maps` and `concats` in `doalls` makes it take an unacceptably long time (I'm not sure it will ever finish!) to calculate a convolution with one array having 100,000 points and the other having 2... Your circular convolution doesn't return lists of the correct length, or (I believe) the correct values. Thank you for the help, though!
Isaac Hodes
+2  A: 

Can't help with a high-performance functional version, but you can get a 100-fold speedup for the imperative version by foregoing laziness and adding type hints:

(defn imp-convolve-2 [xs is]
  (let [^doubles xs (into-array Double/TYPE xs)
        ^doubles is (into-array Double/TYPE is)
        ys (double-array (dec (+ (count xs) (count is)))) ]
    (dotimes [i (alength xs)]
      (dotimes [j (alength is)]
        (aset ys (+ i j)
          (+ (* (aget xs i) (aget is j))
            (aget ys (+ i j))))))
    ys))

With xs size 100k and is size 2, your imp-convolve-1 takes ~6,000ms on my machine when wrapped in a doall, while this one takes ~35ms.
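
Timed with something along these lines, in case you want to reproduce it (exact numbers will of course vary by machine):

(let [xs (repeat 100000 1.0)
      is [1.0 2.0]]
  (time (doall (imp-convolve-1 xs is)))
  (time (imp-convolve-2 xs is)))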

Update

Here is a lazy functional version:

(defn convolve 
  ([xs is] (convolve xs is []))
  ([xs is parts]
    (cond
      (and (empty? xs) (empty? parts)) nil
      (empty? xs) (cons
                    (reduce + (map first parts))
                    (convolve xs is
                      (remove empty? (map rest parts))))
      :else (cons
              (+ (* (first xs) (first is))
                (reduce + (map first parts)))
              (lazy-seq
                (convolve (rest xs) is
                  (cons 
                    (map (partial * (first xs)) (rest is))
                    (remove empty? (map rest parts)))))))))
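
A quick sanity check on small inputs, matching the result of the imperative versions:

(convolve [1 2 3] [4 5 6])
;=> (4 13 28 27 18)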

On sizes 100k and 2, it clocks in at ~600ms (varying 450-750ms) vs ~6,000ms for imp-convolve-1 and ~35ms for imp-convolve-2.

So it's functional, lazy and has tolerable performance. Still, it's twice as much code as the imperative version and took me 1-2 additional hours to find, so I'm not sure that I see the point.

I'm all for pure functions when they make the code shorter or simpler, or have some other benefit over an imperative version. When they don't, I have no objection to switching to imperative mode.

That's one of the reasons I think Clojure is great: you can use either approach as you see fit.

Update 2:

I'll amend my "what's the point of doing this functionally" by saying that I like this functional implementation (the second one, further down the page) by David Cabana.

It's brief, readable and times to ~140ms with the same input sizes as above (100,000 and 2), making it by far the best-performing functional implementation of those I tried.

Considering that it is functional (but not lazy), uses no type hints and works for all numeric types (not just doubles), that's quite impressive.

j-g-faustus
I appreciate that; I have virtually no experience doing imperative Clojure! Perhaps I should learn... Thanks! (Not accepting, as I'm still looking for a functional version, but you have my +1.)
Isaac Hodes
Strangely enough, when I try it with 1 million points, I get an "error in process filter: Wrong number of arguments: nil, 0", though no Java errors that I can see... It works fine for 100k. My Python version works for 1 million; I haven't tried it with more.
Isaac Hodes
It should work for up to Integer/MAX_VALUE (about 2 billion) points; that's the max size of a Java array. I'm not seeing anything different for 1 million points. Sure it doesn't happen elsewhere? "Error in process filter" doesn't sound like it comes from the convolve function.
j-g-faustus
I'll look into it some more: I also hit a Java error, OutOfMemory or something like that. Two billion would be sufficient, I suppose, but it'd be nice to be able to use a long and have virtually no limit: is there a way to do that?
Isaac Hodes
With two billion doubles at 8 bytes each, you need 16GB RAM just to hold the doubles. At that scale you would probably want to look into chunking (process N numbers from xs at a time, gluing the results together at the boundaries afterwards; see the sketch below) or laziness. I don't know of a simple, high-performance and functional way; I guess I'll have to leave that to the experts. For myself, I'm inclined to think that high-performance number crunching is one of those things that works best with mutability, if nothing else because you don't need to reinvent every algorithm you find in the literature.
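
Roughly, the chunking idea would look something like this (only a sketch, reusing imp-convolve-2 from above on each chunk and summing the partial results at the right offsets):

(defn chunked-convolve
  "Convolve xs with is by processing xs in chunks of chunk-size."
  [chunk-size xs is]
  (let [out (double-array (dec (+ (count xs) (count is))))]
    (doseq [[k chunk] (map-indexed vector (partition-all chunk-size xs))]
      (let [^doubles part (imp-convolve-2 chunk is)
            offset        (* k chunk-size)]
        ;; add this chunk's partial convolution into the output at its offset
        (dotimes [j (alength part)]
          (aset out (+ offset j)
                (+ (aget out (+ offset j)) (aget part j))))))
    (vec out)))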
j-g-faustus
I forgot my rule of thumb: think before speaking. Two billion doubles is of course sufficient. Even if it weren't, I would be using the FFT to convolve the signals in the actual application. I am still concerned that there isn't a decent functional solution, though.
Isaac Hodes
+2  A: 
(defn ^{:static true} convolve ^doubles [^doubles xs ^doubles is]
  (let [xlen (count xs)
        ilen (count is)
        ys   (double-array (dec (+ xlen ilen)))]
    (dotimes [p xlen]
      (dotimes [q ilen]
        (let [n (+ p q), x (aget xs p), i (aget is q), y (aget ys n)]
          (aset ys n (+ (* x i) y)))))
    ys))

Riffing on j-g-faustus's version, as I would do it in the Clojure equiv branch. Works for me: ~400ms for 1,000,000 points, ~25ms for 100,000 on an i7 MacBook Pro.
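
For example (the arguments need to be primitive double arrays for the hints to kick in):

(vec (convolve (double-array [1 2 3]) (double-array [4 5 6])))
;=> [4.0 13.0 28.0 27.0 18.0]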

dnolen
+2  A: 
(defn convolve [xs is]
  (if (> (count xs) (count is))
    (convolve is xs)
    (let [os (dec (+ (count xs) (count is)))
          lxs (count xs)
          lis (count is)]
      (for [i (range os)]
        (let [[start-x end-x] [(- lxs (min lxs (- os i))) (min i (dec lxs))]
              [start-i end-i] [(- lis (min lis (- os i))) (min i (dec lis))]]
          (reduce + (map *
                         (rseq (subvec xs start-x (inc end-x)))
                         (subvec is start-i (inc end-i)))))))))

It is possible to express a lazy, functional solution in concise terms. Alas, the performance for inputs larger than ~2k is impractical. I'm interested to see if there are ways to speed it up without sacrificing readability.
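
For example (both arguments need to be vectors, since it relies on subvec and rseq):

(convolve [1 2 3] [4 5 6])
;=> (4 13 28 27 18)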

Edit: After reading drcabana's informative post on the topic (http://erl.nfshost.com/2010/07/17/discrete-convolution-of-finite-vectors/), I've updated my code to accept different-sized vectors. His implementation performs better: for xs of size 3 and is of size 1,000,000, ~2s vs ~3s.

Edit 2: Taking drcabana's idea of simply reversing xs and padding is, I arrived at:

(defn convolve [xs is]
  (if (> (count xs) (count is))
    (convolve is xs)
    (let [is (concat (repeat (dec (count xs)) 0) is)]
      (for [s (take-while not-empty (iterate rest is))]
         (reduce + (map * (rseq xs) s))))))

This is probably as concise as it's going to get, but it is still slower overall, likely due to take-while. Kudos to the blog author for a well-considered approach. The only advantage here is that the above is truly lazy: if I ask for (nth res 10000), it will only need the first 10k calculations to arrive at a result.
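
For example, asking for a single element only realizes the prefix it needs:

(nth (convolve [1 2 3] (vec (range 100000))) 10000)
;=> 59992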

Joseph Gay