This is my input data:

[[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]]

I would like to map this into the following:

{:a [[1 2] [3 4] [5 6]] :b [[\a \b] [\c \d] [\e \f]]}

This is what I have so far:

(defn- build-annotation-map [annotation & m]
 (let [gff (first annotation)
       remaining (rest annotation)
       seqname (first gff)
       current {seqname [(nth gff 3) (nth gff 4)]}]
   (if (not (seq remaining))
     m
     (let [new-m (merge-maps current m)]
       (apply build-annotation-map remaining new-m)))))

(defn- merge-maps [m & ms]
  (apply merge-with conj
         (when (first ms)
           (reduce conj                     ; this is to avoid [1 2 [3 4 ... etc.
                   (map (fn [k] {k []}) (keys m))))
         m ms))

The above produces:

{:a [[1 2] [[3 4] [5 6]]] :b [[\a \b] [[\c \d] [\e \f]]]}

It seems clear to me that the problem is in merge-maps, specifically with the function passed to merge-with (conj), but after banging my head for a while now, I'm about ready for someone to help me out.

I'm new to lisp in general, and clojure in particular, so I also appreciate comments not specifically addressing the problem, but also style, brain-dead constructs on my part, etc. Thanks!

Solution (close enough, anyway):

(group-by first [[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]])
=> {:a [[:a 1 2] [:a 3 4] [:a 5 6]], :b [[:b \a \b] [:b \c \d] [:b \e \f]]}
+4  A: 

Works at least on the given data set.

(defn build-annotations [coll]
  (reduce
    (fn [result vec]
      (let [key (first vec)
            val (subvec vec 1)
            old-val (get result key [])
            conjoined-val (conj old-val val)]
        (assoc
          result
          key
          conjoined-val)))
    {}
    coll))

(build-annotations [[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]])
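The get/conj/assoc steps in the reduce above can also be folded into a single call. The following is only a sketch, assuming Clojure 1.7 or later (for update); the name build-annotations-fnil is made up for illustration and is not part of the answer above.

```clojure
(defn build-annotations-fnil
  "Hypothetical variant of build-annotations; the name is illustrative."
  [coll]
  (reduce (fn [result [k & vs]]
            ;; (fnil conj []) behaves like conj, but treats a missing (nil)
            ;; value as an empty vector, so no explicit default lookup is needed.
            (update result k (fnil conj []) (vec vs)))
          {}
          coll))

(build-annotations-fnil [[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]])
;; => {:a [[1 2] [3 4] [5 6]], :b [[\a \b] [\c \d] [\e \f]]}
```

Destructuring with [k & vs] also avoids shadowing the core functions key, val, and vec with local names.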

I am sorry for not offering improvements on your code. I am just learning Clojure and it is easier to solve problems piece by piece instead of understanding a bigger piece of code and finding the problems in it.

ponzao
+4  A: 

Although I have no comments on your code yet, I tried the problem on my own and came up with this solution:

(defn build-annotations [coll]
  (let [anmap (group-by first coll)]
    (zipmap (keys anmap) (map #(vec (map (comp vec rest) %)) (vals anmap)))))
Thomas
+7  A: 
(defn build-annotations [coll]
  (reduce (fn [m [k & vs]]
            (assoc m k (conj (m k []) (vec vs))))
          {} coll))

Concerning your code, the most significant problem is naming. Without already understanding the code, I would have no idea what annotation, gff, and seqname mean, and current is pretty ambiguous too. In Clojure, remaining would generally be called more, unless the context calls for a more specific name.

Within your let form, for gff (first annotation) and remaining (rest annotation), I'd probably take advantage of destructuring, like this:

(let [[first & more] annotation] ...)
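To see what that destructuring binds, here is a small sketch. The helper name describe is hypothetical, and the first element is bound to head rather than first so that the local does not shadow clojure.core/first.

```clojure
(defn describe
  "Hypothetical helper: splits a collection into its first item and the rest."
  [annotation]
  (let [[head & more] annotation]
    {:head head :more more}))

(describe [[:a 1 2] [:a 3 4] [:a 5 6]])
;; => {:head [:a 1 2], :more ([:a 3 4] [:a 5 6])}

(describe [[:a 1 2]])
;; => {:head [:a 1 2], :more nil}
```

Note that more is nil, not an empty list, once nothing is left, so it can be tested directly with if or if-not.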

If you would rather use (rest annotation) then I'd suggest using next instead, as it will return nil if it's empty, and allow you to write (if-not remaining ...) rather than (if-not (seq remaining) ...).

user> (next [])
nil
user> (rest [])
()

In Clojure, unlike Common Lisp, the empty list is truthy; only nil and false are falsey.
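A quick way to check this at the REPL is sketched below; truthiness is a hypothetical helper written just for this illustration.

```clojure
(defn truthiness
  "Hypothetical helper: reports how a value behaves as an if condition."
  [x]
  (if x :truthy :falsey))

(truthiness '())        ;; => :truthy  (the empty list is not nil)
(truthiness (rest []))  ;; => :truthy  (rest returns an empty seq)
(truthiness (next []))  ;; => :falsey  (next returns nil)
(truthiness (seq []))   ;; => :falsey  (seq returns nil on empty collections)
```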

This article shows the standard for idiomatic naming.

MayDaniel
Thanks for the comments, they help a lot.
Pedro Silva
+2  A: 

Here's my entry leveraging group-by, although several steps in here are really concerned with returning vectors rather than lists. If you drop that requirement, it gets a bit simpler:

(defn f [s]
  (let [g (group-by first s)
        k (keys g)
        v (vals g)
        cleaned-v (for [group v]
                    (into [] (map (comp #(into [] %) rest) group)))]
    (zipmap k cleaned-v)))

Depending on what you actually want, you might even be able to get by with just doing group-by.
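As a rough sketch of the simpler version mentioned above, assuming seqs are acceptable in place of vectors (f-seqs is an illustrative name, not part of the original answer):

```clojure
(defn f-seqs
  "Hypothetical simplification of f that keeps seqs instead of vectors."
  [s]
  (into {}
        (for [[k group] (group-by first s)]
          ;; drop the leading key from every item in the group
          [k (map rest group)])))

(f-seqs [[:a 1 2] [:a 3 4] [:b \a \b]])
;; => {:a ((1 2) (3 4)), :b ((\a \b))}
```

The result still compares equal to the vector form with =, since Clojure compares sequential collections element by element.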

Alex Miller
`group-by` does exactly what I need, actually. Thanks!
Pedro Silva
+2  A: 
(defn build-annotations [coll]
  (apply merge-with concat
         (map (fn [[k & vals]] {k [vals]})
              coll)))

So,

(map (fn [[k & vals]] {k [vals]})
     coll)

takes a collection of [key & values] vectors and returns a list of {key [values]} maps.

(apply merge-with concat ...list of maps...)

takes a list of maps, merges them together, and concats the values if a key already exists.
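The two steps can be traced on a small input, as sketched below. The var names (coll, singles, merged, merged-vecs) are made up for this illustration, and the last form is a variant using vec and into for the case where vectors are required, rather than the seqs that the & binding and concat produce.

```clojure
(def coll [[:a 1 2] [:a 3 4] [:b \a \b]])

;; Step 1: one single-entry map per input vector.
(def singles (map (fn [[k & vs]] {k [vs]}) coll))
;; singles => ({:a [(1 2)]} {:a [(3 4)]} {:b [(\a \b)]})

;; Step 2: merge them, concatenating the values that share a key.
(def merged (apply merge-with concat singles))
;; merged => {:a ((1 2) (3 4)), :b [(\a \b)]}

;; Variant (an assumption, not the answer's code): keep everything as vectors.
(def merged-vecs
  (apply merge-with into (map (fn [[k & vs]] {k [(vec vs)]}) coll)))
;; merged-vecs => {:a [[1 2] [3 4]], :b [[\a \b]]}
```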

wilkes
Thanks, that's actually my preferred solution so far. I like its conciseness, and your explanations are very clear.
Pedro Silva