views:

234

answers:

1

Is there a good algorithm to calculate the cartesian product of three seqs concurrently in Clojure?

I'm working on a small hobby project in Clojure, mainly as a means to learn the language, and its concurrency features. In my project, I need to calculate the cartesian product of three seqs (and do something with the results).

I found the cartesian-product function in clojure.contrib.combinatorics, which works pretty well. However, the calculation of the cartesian product turns out to be the bottleneck of the program. Therefore, I'd like to perform the calculation concurrently.

Now, for the map function, there's a convenient pmap alternative that magically makes the thing concurrent. Which is cool :). Unfortunately, such a thing doesn't exist for cartesian-product. I've looked at the source code, but I can't find an easy way to make it concurrent myself.

Also, I've tried to implement an algorithm myself using map, but I guess my algorithmic skills aren't what they used to be. I managed to come up with something ugly for two seqs, but three was definitely a bridge too far.

So, does anyone know of an algorithm that's already concurrent, or one that I can parallelize myself?


EDIT

Put another way, what I'm really trying to achieve, is to achieve something similar to this Java code:

for (ClassA a : someExpensiveComputation()) {
    for (ClassB b : someOtherExpensiveComputation()) {
        for (ClassC c : andAnotherOne()) {
            // Do something interesting with a, b and c
        }
    }
}
+2  A: 

If the logic you're using to process the Cartesian product isn't somehow inherently sequential, then maybe you could just split your inputs into halves (perhaps splitting each input seq in two), calculate 8 separate Cartesian products (first-half x first-half x first-half, first-half x first-half x second-half, ...), process them and then combine the results. I'd expect this to give you quite a boost already. As for tweaking the performance of the Cartesian product building itself, I'm no expert, but I do have some ideas & observations (one needs to calculate a cross product for Project Euler sometimes), so I've tried to summarise them below.

First of all, I find the c.c.combinatorics function a bit strange in the performance department. The comments say it's taken from Knuth, I believe, so perhaps one of the following obtains: (1) it would be very performant with vectors, but the cost of vectorising the input sequences kills its performance for other sequence types; (2) this style of programming doesn't necessarily perform well in Clojure in general; (3) the cumulative overhead incurred due to some design choice (like having that local function) is large; (4) I'm missing something really important. So, while I wouldn't like to dismiss the possibility that it might be a great function to use for some use cases (determined by the total number of seqs involved, the number of elements in each seq etc.), in all my (unscientific) measurements a simple for seems to fare better.

Then there are two functions of mine, one of which is comparable to for (somewhat slower in the more interesting tests, I think, though it seems to be actually somewhat faster in others... can't say I feel prepared to make a fully educated comparison), the other apparently faster with a long initial input sequence, as it's a restricted functionality parallel version of the first one. (Details follow below.) So, timings first (do throw in the occasional (System/gc) if you care to repeat them):

;; a couple warm-up runs ellided
user> (time (last (doall (pcross (range 100) (range 100) (range 100)))))
"Elapsed time: 1130.751258 msecs"
(99 99 99)
user> (time (last (doall (cross (range 100) (range 100) (range 100)))))
"Elapsed time: 2428.642741 msecs"
(99 99 99)
user> (require '[clojure.contrib.combinatorics :as comb])
nil
user> (time (last (doall (comb/cartesian-product (range 100) (range 100) (range 100)))))
"Elapsed time: 7423.131008 msecs"
(99 99 99)
;; a second time, as no warm-up was performed earlier...
user> (time (last (doall (comb/cartesian-product (range 100) (range 100) (range 100)))))
"Elapsed time: 6596.631127 msecs"
(99 99 99)
;; umm... is syntax-quote that expensive?
user> (time (last (doall (for [x (range 100)
                               y (range 100)
                               z (range 100)]
                           `(~x ~x ~x)))))
"Elapsed time: 11029.038047 msecs"
(99 99 99)
user> (time (last (doall (for [x (range 100)
                               y (range 100)
                               z (range 100)]
                           (list x y z)))))
"Elapsed time: 2597.533138 msecs"
(99 99 99)
;; one more time...
user> (time (last (doall (for [x (range 100)
                               y (range 100)
                               z (range 100)]
                           (list x y z)))))
"Elapsed time: 2179.69127 msecs"
(99 99 99)

And now the function definitions:

(defn cross [& seqs]
  (when seqs
    (if-let [s (first seqs)]
      (if-let [ss (next seqs)]
        (for [x  s
              ys (apply cross ss)]
          (cons x ys))
        (map list s)))))

(defn pcross [s1 s2 s3]
  (when (and (first s1)
             (first s2)
             (first s3))
    (let [l1 (count s1)
          [half1 half2] (split-at (quot l1 2) s1)
          s2xs3 (cross s2 s3)
          f1 (future (for [x half1 yz s2xs3] (cons x yz)))
          f2 (future (for [x half2 yz s2xs3] (cons x yz)))]
      (concat @f1 @f2))))

I believe that all versions produce the same results. pcross could be extended to handle more sequences or be more sophisticated in the way it splits its workload, but that's what I came up with as a first approximation... If you do test this out with your programme (perhaps adapting it to your needs, of course), I'd be very curious to know the results.

Michał Marczyk
Thanks for your excellent answer! Had to make a small adjustment in pcross because I'm interfacing with Java objects and I couldn't call `count` on them. In general, execution times are like this in my project: `pcross < cartesian-product < cross`. The difference with pcross isn't as big as I hoped, but that seems mostly due to my the way my app is written. I may have misjudged the performance characteristics a little bit. I still have a lot to learn about Clojure, for sure :).
jqno