The roundtrip below produces invalid XML because the result is not escaped correctly, i.e. the attribute values contain ' instead of &apos;. Am I doing something wrong, or is this a bug?

(ns xml-test
  (:require [clojure.xml :as xml]
            [clojure.zip :as zip]))

(def test-xml "<?xml version=\"1.0\" encoding=\"UTF-8\"?> <main> <item attr='&apos;test&apos;'> </item> </main>")

(def s (java.io.ByteArrayInputStream. (.getBytes test-xml "UTF-8")))

(xml/emit (zip/root (zip/xml-zip (clojure.xml/parse s))))

output:

<?xml version='1.0' encoding='UTF-8'?>
<main>
<item attr=''test''/>
</main>
nil
+7  A: 

I had a quick look at the source: clojure.xml/emit-element (which is called by clojure.xml/emit) makes no effort whatsoever to encode any characters as XML entities; it passes attribute values straight through, unescaped. I guess this means clojure.xml is quite limited in its usability; you should use clojure.contrib.lazy-xml instead. My apologies for not mentioning this in my answer to your first question on XML emitting; I didn't realise stuff like this would happen.
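
To see the pass-through in isolation, something like the following should do. It is only a minimal sketch: the node map is hand-written in the {:tag :attrs :content} shape that clojure.xml/parse produces, the name emitted is made up for the example, and str/includes? assumes Clojure 1.8 or later.

```clojure
(require '[clojure.string :as str]
         '[clojure.xml :as xml])

;; Capture what emit-element prints for a node whose attribute
;; value contains apostrophes.
(def emitted
  (with-out-str
    (xml/emit-element {:tag :item, :attrs {:attr "'test'"}, :content nil})))

;; The raw apostrophes land inside the single-quoted attribute,
;; which is what makes the emitted XML invalid.
(str/includes? emitted "attr=''test''")
;; => true
```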

With clojure.contrib.lazy-xml, you can do the following:

;; assumes (require '[clojure.contrib.lazy-xml :as lazy-xml])
user> (lazy-xml/emit
       (lazy-xml/parse-trim
        (java.io.StringReader. "<foo bar=\"&apos;&quot;&quot;&apos;\"/>")))
<?xml version="1.0" encoding="UTF-8"?><foo bar="'&quot;&quot;'"/>

If you really want to stick with clojure.xml, you'll have to skip clojure.xml/emit and use an XML producer of your choice instead. In fact, you can use clojure.xml/parse, transform the result, then pass it to clojure.contrib.lazy-xml/emit; both libraries use the same Clojure representation of the XML, but only the latter escapes its output properly.
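
As a rough illustration of the "XML producer of your choice" route, here is a sketch of a tiny hand-rolled emitter that consumes the structure returned by clojure.xml/parse. The names escape-xml, emit-str and parsed are hypothetical and not part of either library; the emitter returns a string rather than printing, omits the XML declaration, and ignores things like namespaces and pretty-printing.

```clojure
(require '[clojure.string :as str]
         '[clojure.xml :as xml])

;; Hypothetical helper: maps the five characters XML reserves
;; to their predefined entities.
(defn escape-xml [s]
  (str/escape (str s) {\& "&amp;", \< "&lt;", \> "&gt;", \" "&quot;", \' "&apos;"}))

;; Hypothetical emitter: builds a string from a node in the
;; {:tag ... :attrs ... :content ...} shape, escaping both
;; attribute values and text content.
(defn emit-str [node]
  (if (string? node)
    (escape-xml node)
    (let [tag   (name (:tag node))
          attrs (apply str
                       (for [[k v] (:attrs node)]
                         (str " " (name k) "=\"" (escape-xml v) "\"")))]
      (if (seq (:content node))
        (str "<" tag attrs ">"
             (apply str (map emit-str (:content node)))
             "</" tag ">")
        (str "<" tag attrs "/>")))))

;; Parse with clojure.xml, emit with the sketch above.
(def parsed
  (xml/parse (java.io.ByteArrayInputStream.
              (.getBytes "<main><item attr='&apos;test&apos;'/></main>" "UTF-8"))))

(emit-str parsed)
;; => "<main><item attr=\"&apos;test&apos;\"/></main>"
```

For anything beyond a quick fix, a maintained emitter is probably the safer choice; the point here is only that the parsed structure is plain maps and strings, so any producer that escapes properly can consume it.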

Michał Marczyk