views:

291

answers:

4

As a neophyte clojurian, it was recommended to me that I go through the Project Euler problems as a way to learn the language. Its definitely a great way to improve your skills and gain confidence. I just finished up my answer to problem #14. It works fine, but to get it running efficiently I had to implement some memoization. I couldn't use the prepackaged memoize function because of the way my code was structured, and I think it was a good experience to roll my own anyways. My question is if there is a good way to encapsulate my cache within the function itself, or if I have to define an external cache like I have done. Also, any tips to make my code more idiomatic would be appreciated.

(use 'clojure.test)

(def mem (atom {}))

(with-test
  (defn chain-length      
    ([x] (chain-length x x 0))     
    ([start-val x c]
      (if-let [e (last(find @mem x))]
        (let [ret (+ c e)]
          (swap! mem assoc start-val ret)
          ret)   
        (if (<= x 1)               
          (let [ret (+ c 1)]
            (swap! mem assoc start-val ret)
            ret)                  
          (if (even? x)            
            (recur start-val (/ x 2) (+ c 1))
            (recur start-val (+ 1 (* x 3)) (+ c 1)))))))
  (is (= 10 (chain-length 13))))

(with-test
  (defn longest-chain
    ([] (longest-chain 2 0 0))
    ([c max start-num]
      (if (>= c 1000000)
        start-num
        (let [l (chain-length c)]
          (if (> l max)
            (recur (+ 1 c) l c)
            (recur (+ 1 c) max start-num))))))
  (is (= 837799 (longest-chain))))
+3  A: 

Since you want the cache to be shared between all invocations of chain-length, you would write chain-length as (let [mem (atom {})] (defn chain-length ...)) so that it would only be visible to chain-length.

In this case, since the longest chain is sufficiently small, you could define chain-length using the naive recursive method and use Clojure's builtin memoize function on that.

Brian
Thanks! Now that you've explained it, it seems obvious. The idea of using naive recursion and memoize crossed my mind but I thought this would be a better learning experience and more "scalable" to boot.
dbyrne
+1  A: 

You can capture the surrounding environment in a clojure :

(defn my-memoize [f] 
  (let [cache (atom {})] 
    (fn [x] 
      (let [cy (get @cache x)] 
        (if (nil? cy) 
          (let [fx (f x)] 
          (reset! cache (assoc @cache x fx)) fx) cy)))))


(defn mul2 [x] (do (print "Hello") (* 2 x)))
(def mmul2 (my-memoize mul2))  
user=> (mmul2 2)
Hello4
user=>  (mmul2 2) 
4

You see the mul2 funciton is only called once.

So the 'cache' is captured by the clojure and can be used to store the values.

Peter Tillemans
This is essentially the way the built-in memoize function works correct? I don't think this would work with the way I currently have my functions written because the recursive call will always have different parameters, and a cached value will never be returned.
dbyrne
This is not really an issue since only one of the parameters need to be used. If x is already known you can add the cached count for that to the state parameter.
Peter Tillemans
+1  A: 

Here's an idiomatic(?) version using plain old memoize.

(def chain-length
     (memoize
      (fn [n]
        (cond
         (== n 1)  1
         (even? n) (inc (chain-length (/ n 2)))
         :else     (inc (chain-length (inc (* 3 n))))))))

(defn longest-chain [start end]
  (reduce (fn [x y]
            (if (> (second x) (second y)) x y))
          (for [n (range start (inc end))]
            [n (chain-length n)])))

If you have an urge to use recur, consider map or reduce first. They often do what you want, and sometimes do it better/faster, since they take advantage of chunked seqs.

(inc x) is like (+ 1 x), but inc is about twice as fast.

Brian Carper
I tried your version, but (longest-chain 2 1000000) causes a StackOverflowError. The one million is from the Project Euler exercise #14.
Michiel Borkent
Try `(longest-chain 1 1000000)` and it should work. It has to cache the value of 1 first.
Brian Carper
The same error. I would be surprised if that solved it, because with (chain-length 2), it calls (chain-length 1) in the cond, and that one would be cached also. It works on your machine I presume? I tried both Clojure versions 1.1 and 1.2.
Michiel Borkent
Yeah, you're right, duh. Hmm. It does work on my machine though, with Clojure 1.2.0 bleeding edge. Then I tried `(longest-chain 10000000)` and got an OutOfMemoryError error ("GC overhead limit exceeded") before I got a stack overflow. That's the bad thing about recursion + memoize. You have to pray values are cached fast enough that by the time you reach the millionth iteration, it has enough memoized partial results not to kill the stack. You're always flirting with disaster up to that point. Could be that my JVM has more memory to play with than yours, or who knows.
Brian Carper
My version of the code required an enlarged JVM size to run for 1 million
dbyrne
A: 

I think that the most straigtforward way is to assign a new var with our memoized function, like this:

(defn step [n]
  (if (odd? n)
    (inc (* 3 n))
    (/ n 2)))

(def memo-step (memoize step))

(defn chain [num]
  (loop [n num
         times 1]
    (if-not (<= n 1)
      (recur (memo-step n) (inc times))
      [num times])))


(defn get-answer [start end]
  (reduce (fn [a b]
            (if (> (second b) (second a))
              b
              a)) (map chain (range start end))))

However for this particular problem it appears that memoization slows down the computation a lot.

krdx