views:

162

answers:

2

For clojure's sorted-map, how do I find the entry having the key closest to a given value?

e.g. Suppose I have

(def my-map (sorted-map 
                      1 A 
                      2 B
                      5 C))

I would like a function like

(find-closest my-map 4)

which would return (5,C), since that's the entry with the closest key. I could do a linear search, but since the map is sorted, there should be a way of finding this value in something like O(log n).

I can't find anything in the API which makes this possible. If, for instance, I could ask for the i'th entry in the map, I could cobble together a function like the one I want, but I can't find any such function.

Edit:

So apparently sorted-map is based on a PersistentTreeMap class implemented in java, which is a red and black tree. So this really seems like it should be doable, at least in principle.

A: 

The first thing that comes to my mind is to pull the map's keys out into a vector and then to do a binary search in that. If there is no exact match to your key, the two pointers involved in a binary search will end up pointing to the two elements on either side of it, and you can then choose the closer one in a single (possibly tie breaking) operation.

Carl Smotricz
Since the map is already sorted, I (hopefully) shouldn't have to pull all of the keys out of the map.
Rob Lachlan
Agreed; but I don't see any other way to obtain random access to the keys. If you do a sequential search, on average you need to compare 50% of the keys, whereas my solution requires copying 100% - that's horrible - and THEN a log2(n) search. My solution is only good if you'll be doing several such searches on the same data. Maybe someone smarter will show up and post a solution that will amaze us all.
Carl Smotricz
+3  A: 

subseq and rsubseq are very fast because they exploit the tree structure:

(def m (sorted-map 1 :a, 2 :b, 5 :c)) 

(defn abs [x] (if (neg? x) (- x) x))
(defn find-closest [sm k]
  (if-let [a (key (first (rsubseq sm <= k)))]
    (if (= a k)
      a
      (if-let [b (key (first (subseq sm >= k)))]
        (if (< (abs (- k b)) (abs (- k a)))
          b
          a)))
    (key (first (subseq sm >= k)))))

user=> (find-closest m 4)
5
user=> (find-closest m 3)
2

This does slightly more work than ideal, in the ideal scenario we would just do a <= search then look at the node to the right to check if there is anything closer in that direction. You can access the tree (.tree m) but the .left and .right methods aren't public so custom traversal is not currently possible.

Timothy Pratley
+1. Thanks, that's very helpful.
Rob Lachlan