I want to combine the results of three zip-filter queries on an xml tree. The XML I am parsing looks like this:

<someroot>
  <publication>
    <contributors>
      <person_name>
        <surname>Surname A</surname>
      </person_name>
      <person_name>
        <given_name>Given B</given_name>
        <surname>Surname B</surname>
        <suffix>Suffix B</suffix>
      </person_name>
    </contributors>
  </publication>
</someroot>

From this example you can see that <given_name> and <suffix> are optional; only <surname> is required. Herein lies my problem: if I run three separate queries, the results will be out of step with each other:

(xml-> xml :publication :contributors :person_name :given_name text)
(xml-> xml :publication :contributors :person_name :surname text)
(xml-> xml :publication :contributors :person_name :suffix text)

After running these three queries I am left with three sequences whose lengths do not match: the given_name and suffix sequences have length 1, while the surname sequence has length 2. Because nothing records which <person_name> each value came from, I cannot reliably recombine the components of each name. I need a single query that performs this name concatenation during sequence construction.
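To make the mismatch concrete, here is a minimal sketch that reproduces it with only clojure.xml and xml-seq rather than zip-filter. The names sample-xml, doc and texts-of are hypothetical, introduced only for this illustration, and the sample document is inlined so the snippet stands alone.

```clojure
(require '[clojure.xml :as xml])

;; The sample document from the question, inlined (assumption: same structure).
(def sample-xml
  (str "<someroot><publication><contributors>"
       "<person_name><surname>Surname A</surname></person_name>"
       "<person_name><given_name>Given B</given_name>"
       "<surname>Surname B</surname><suffix>Suffix B</suffix></person_name>"
       "</contributors></publication></someroot>"))

(def doc
  (xml/parse (java.io.ByteArrayInputStream. (.getBytes sample-xml "UTF-8"))))

;; Hypothetical helper: the text of every element with the given tag,
;; in document order, with no record of the enclosing <person_name>.
(defn texts-of [tag]
  (for [el (xml-seq doc) :when (= tag (:tag el))]
    (first (:content el))))

(def given-names (texts-of :given_name)) ; one entry
(def surnames    (texts-of :surname))    ; two entries
(def suffixes    (texts-of :suffix))     ; one entry

;; Zipping the three sequences pairs values from different people and
;; silently drops the second surname.
(map vector surnames given-names suffixes)
;; => (["Surname A" "Given B" "Suffix B"])
```

The zipped result attaches Given B and Suffix B to Surname A, which is exactly the misalignment described above.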

I'm looking at the very sparse documentation for clojure.contrib.zip-filter.xml and can't figure out how I could do this (or if it is even possible). Unfortunately I am a Clojure (and Lisp) newbie! Can anyone point out how I can write a query that will concatenate three other embedded queries?

+2  A: 

I suppose an alternative solution is to

(xml-> xml :publication :contributors :person_name)

and then process each <person_name> later on.
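As a rough sketch of what "process each <person_name> later" could look like, the snippet below turns every person element into a map, so that missing parts are simply absent keys. It uses only clojure.xml and xml-seq instead of zip-filter; sample-xml, doc, person->map and people are hypothetical names for this illustration.

```clojure
(require '[clojure.xml :as xml])

;; The sample document from the question, inlined (assumption: same structure).
(def sample-xml
  (str "<someroot><publication><contributors>"
       "<person_name><surname>Surname A</surname></person_name>"
       "<person_name><given_name>Given B</given_name>"
       "<surname>Surname B</surname><suffix>Suffix B</suffix></person_name>"
       "</contributors></publication></someroot>"))

(def doc
  (xml/parse (java.io.ByteArrayInputStream. (.getBytes sample-xml "UTF-8"))))

;; Hypothetical helper: map each child tag of one <person_name> to its text.
(defn person->map [person]
  (into {} (for [child (:content person)]
             [(:tag child) (first (:content child))])))

;; One map per <person_name>, in document order.
(def people
  (for [el (xml-seq doc) :when (= :person_name (:tag el))]
    (person->map el)))
;; => ({:surname "Surname A"}
;;     {:given_name "Given B", :surname "Surname B", :suffix "Suffix B"})
```

Because each map is built from a single element, the name parts can never drift apart, and an optional part shows up as nil when looked up.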

Karl Jonathan Ward
+3  A: 

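You can get all the person subtrees in one step with (xml-> xmlzip :publication :contributors :person_name), then pull out each name part, if it exists, with a query such as (xml1-> personzip :surname text). xml1-> returns nil when nothing matches, so a missing part stays aligned with its person. Combine the parts into the result you want like so: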

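;; In newer setups this namespace lives in the data.zip library as clojure.data.zip.xml.
(use 'clojure.contrib.zip-filter.xml)
(require 'clojure.zip 'clojure.xml)

;; Build "Surname, Given, Suffix", skipping any part that is nil.
(defn format-name
  [surname given suffix]
  (str surname
       (when given (str ", " given))
       (when suffix (str ", " suffix))))

;; Return one [surname given suffix] vector per <person_name>.
;; xml1-> yields nil for a missing child, so the positions stay aligned.
(defn get-names
  [xz]
  (map (juxt #(xml1-> % :surname text)
             #(xml1-> % :given_name text)
             #(xml1-> % :suffix text))
       (xml-> xz :publication :contributors :person_name)))

(let [x (clojure.zip/xml-zip (clojure.xml/parse "foo.xml"))]
  (map (partial apply format-name) (get-names x)))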

For the sample document this results in ("Surname A" "Surname B, Given B, Suffix B").

Timothy Pratley