views:

456

answers:

4

Given the following...

(def inTree
 '((1 2)
   (1 2 3)
   (1 2 4 5 9)
   (1 2 4 10 15)
   (1 2 4 20 25)))

How would you transform it to this trie?

(def outTrie
 '(1
    (2 ()
       (3 ())
       (4 (5
            (9 ()))
          (10
            (15 ()))
          (20
            (25 ()))))))
+1  A: 

As a general approach, here's what I would do:

  • Write a few functions to create a trie and to insert new elements into a trie.
  • Create a new trie.
  • Iterate through the input list and insert each element into the trie.

This problem lends itself very well to a recursive implementation. I would aim for that if possible.

Greg Hewgill
+1  A: 

I'm sure there is a prettier way (there was! see Brian's answer it is better):

(defn find-in-trie
  "Finds a sub trie that matches an item, eg:
  user=> (find-in-trie '(1 (2) (3 (2))) 3)
  (3 (2))"
  [tr item]
  (first (for [ll (rest tr) :when (= (first ll) item)] ll)))


(defn add-to-trie
  "Returns a new trie, the result of adding se to tr, eg:
  user=> (add-to-trie nil '(1 2))
  (1 (2))"
  [tr se]
  (cond
    (empty? se) tr
    (empty? tr) (add-to-trie (list (first se)) (rest se))
    :else (if-let [st (find-in-trie tr (first se))]
            (cons (first tr)
                  (cons (add-to-trie st (rest se))
                        (filter (partial not= st) (rest tr))))
            (cons (first tr)
                  (cons (add-to-trie (list (first se)) (rest se))
                        (rest tr))))))

(def in '((1 2)
          (1 2 3)
          (1 2 4 5 9)
          (1 2 4 10 15)
          (1 2 4 20 25)))

(reduce add-to-trie '(nil) in)

-> (nil (1 (2 (4 (20 (25)) (10 (15)) (5 (9))) (3))))

Note that I've chosen to use nil as the root node and have not bothered keeping empty lists to signify no children. Actually doing it this way is not correct as it does not preserve substring identity.

Timothy Pratley
Thanks. It's helps to see code for common problems to discover the idioms of a language.
Johnny
No worries, see Brian's answer it is more idiomatic and correct.
Timothy Pratley
+6  A: 

Lists are very clumsy here, not to mention inefficient. In Clojure it's more idiomatic to use vectors and hash-maps and sets when appropriate. Using hash-maps:

(def in-tree
 '((1 2)
   (1 2 3)
   (1 2 4 5 9)
   (1 2 4 10 15)
   (1 2 4 20 25)))

(defn add-to-trie [trie x]
  (assoc-in trie x {:terminal true}))

(defn in-trie? [trie x]
  (get-in trie `(~@x :terminal)))

If you wanted it to print sorted you could use sorted-maps instead, but you'd have to write your own version of assoc-in that used sorted maps the whole way down. In any case:

user> (def trie (reduce add-to-trie {} in-tree))
#'user/trie
user> trie
{1 {2 {4 {20 {25 {:terminal true}}, 10 {15 {:terminal true}}, 5 {9 {:terminal true}}}, 3 {:terminal true}, :terminal true}}}
user> (in-trie? trie '(1 2))
true
user> (in-trie? trie '(1 2 4))
nil
user> (in-trie? trie '(1 2 4 20 25))
true
Brian Carper
Great answer and highlight that mine was actually incorrectly ignoring the substring issue. I would suggest a slightly different in-tri?:(defn in-trie? [trie x] (:terminal (get-in trie x) false))user=> (in-trie? trie '(1 2 4))falseMakes it a true predicate and avoids splicing syntax.
Timothy Pratley
Very nice indeed.
Johnny
+2  A: 

Here's a cleaned up solution. This fixes a bug Brian's add-to-trie method since it's currently dependent upon you inserting the seqs in increasing-length order. It also allows querying the trie by prefix, which is a common use case.

Note the memory usage here is higher since it stores the values in the leaf nodes of the trie so you can perform searches.

(defn add-to-trie [trie x]
  (assoc-in trie x (merge (get-in trie x) {:val x :terminal true})))

(defn in-trie? [trie x]
  "Returns true if the value x exists in the specified trie."
  (:terminal (get-in trie x) false))

(defn prefix-matches [trie prefix]
  "Returns a list of matches with the prefix specified in the trie specified."
  (map-filter :val (tree-seq map? vals (get-in trie prefix))))

(defn build-trie [coll]
  "Builds a trie over the values in the specified seq coll."
  (reduce add-to-trie {} coll))
Greg Fodor
So Brian's version would be fine I guess if you always used the same number of keys?
Johnny