Say I have a vector:

(def data ["Hello" "World" "Test" "This"])

And I want to populate a table somewhere that has an api:

(defn setCell
  [row col value]
  (some code here))

Then what is the best way to get the following calls to happen:

(setCell 0 0 "Hello")
(setCell 0 1 "World")
(setCell 0 2 "Test")
(setCell 0 3 "This")

I found that the following will work:

(let [idv (map vector (iterate inc 0) data)]
  (doseq [[index value] idv] (setCell 0 index value)))

But is there a faster way that does not require the new temporary data structure idv?

+2  A: 

The way you're doing it is idiomatic (and identical to clojure.contrib.seq-utils/indexed in fact). If you really want to avoid the extra data structure, you can do this:

(loop [data data, index 0]
  (when (seq data)
    (setCell 0 index (first data))
    (recur (rest data) (inc index))))

I'd use your version unless there was a good reason not to though.

Brian Carper
+4  A: 

You can get the same effect in a very clojure-idiomatic way by just mapping the indexes along with the data.

(map #(setCell 0 %1 %2) (iterate inc 0) data)

Because map is lazy, you may want to wrap this in (doall ...) or consume it with (doseq ...) to make the calls happen now. It's fine to map an infinite seq along with a finite one, because map stops when the shortest seq runs out.
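A minimal sketch of the laziness point, assuming a stand-in setCell that only records its arguments in an atom (the atom calls and the var pending are hypothetical names for illustration; the real API would write to the table). It uses dorun, the variant of doall that walks the seq for its side effects and throws the results away:

```clojure
(def data ["Hello" "World" "Test" "This"])

;; Hypothetical stand-in for the real table API: records each call.
(def calls (atom []))

(defn setCell
  [row col value]
  (swap! calls conj [row col value]))

;; map only builds a lazy seq here; setCell has not been called yet.
(def pending (map #(setCell 0 %1 %2) (iterate inc 0) data))

;; dorun realizes the seq, so the calls happen now. The infinite
;; index seq is cut off when data runs out.
(dorun pending)

@calls
;; => [[0 0 "Hello"] [0 1 "World"] [0 2 "Test"] [0 3 "This"]]
```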

Arthur Ulfeldt
Nice, I didn't know this behavior of map when applied to multiple collections.
pmf
+1  A: 

I did a short comparison of the performance of the options so far:

; just some function that sums stuff 
(defn testThis
  [i value]
 (def total (+ total i value)))

; our test dataset. Make it non-lazy with doall    
(def testD (doall (range 100000)))

; time using Arthur's suggestion
(def total 0.0)
(time (doall (map #(testThis %1 %2) (iterate inc 0) testD)))
(println "Total: " total)

; time using Brian's recursive version
(def total 0.0)
(time (loop [d testD i 0]
  (when (seq d)
    (testThis i (first d))
    (recur (rest d) (inc i)))))
(println "Total: " total)

; with the idiomatic indexed version
(def total 0.0)
(time (let [idv (map vector (iterate inc 0) testD)]
  (doseq [[i value] idv] (testThis i value))))
(println "Total: " total)

Results on my single-core laptop:

   "Elapsed time: 598.224635 msecs"
   Total:  9.9999E9
   "Elapsed time: 241.573161 msecs"
   Total:  9.9999E9
   "Elapsed time: 959.050662 msecs"
   Total:  9.9999E9

Preliminary Conclusion:

Use the loop/recur solution.

James Dean
Rich suggested that when microbenchmarking you run each test a few dozen times and take the last run, so that the HotSpot optimizer is warmed up on the function first. testThis is much lighter weight than the map function, so the tightest loop possible will be better.
Arthur Ulfeldt
I guess I need to test with HotSpot warmed up. I just ran the exact same test in Python, and there it ran in 80 msec.
James Dean
Calling `def` like that may be swamping your benchmarks.
Brian Carper
Write code that means what it does and does what it means. Optimize later, at least after requirements are gathered *and* a complete analysis is performed.
pst
I added (dotimes [n 100] ...) around my tests and got the same results, so it seems HotSpot cannot do much about these differences. The recur version takes about 230 msecs, the index vector version about 690 msecs.
James Dean
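The two suggestions from the comments above (warm up first, and keep def out of the timed code) could be combined roughly as follows. This is only a sketch: sum-loop and sum-map are hypothetical names, the running total lives in a local or in reduce instead of a re-bound var, and no particular timings are claimed.

```clojure
;; Same data as the test above.
(def testD (doall (range 100000)))

;; loop/recur version: the total is a loop local, not a global var.
(defn sum-loop
  [coll]
  (loop [d (seq coll), i 0, total 0.0]
    (if d
      (recur (next d) (inc i) (+ total i (first d)))
      total)))

;; map version: pair each value with its index, then reduce.
(defn sum-map
  [coll]
  (reduce + 0.0 (map + (iterate inc 0) coll)))

;; Warm up both functions before timing a single run of each.
(dotimes [_ 20]
  (sum-loop testD)
  (sum-map testD))

(time (sum-loop testD))
(time (sum-map testD))
```

Both functions return the same 9.9999E9 total as the original test, so the comparison is still between the same amounts of arithmetic.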
+1  A: 

The nicest way would be to use clojure.contrib.seq-utils/indexed, which will look like this (using destructuring):

(doseq [[idx val] (indexed ["Hello" "World" "Test" "This"])]
  (setCell 0 idx val))
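
If clojure.contrib is not on the classpath, the same shape can be had from clojure.core alone. A sketch, assuming Clojure 1.2 or later (where map-indexed exists) and a stand-in setCell that just records its arguments in a hypothetical atom named calls:

```clojure
;; Hypothetical stand-in for the real table API: records each call.
(def calls (atom []))

(defn setCell
  [row col value]
  (swap! calls conj [row col value]))

;; (map-indexed vector coll) yields [index item] pairs, like indexed.
(doseq [[idx v] (map-indexed vector ["Hello" "World" "Test" "This"])]
  (setCell 0 idx v))

@calls
;; => [[0 0 "Hello"] [0 1 "World"] [0 2 "Test"] [0 3 "This"]]
```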
pmf