Is it possible to disable decoding of XML text content when parsing an XML file in Java? For example, so that &quot; is returned as is instead of being converted to a quote character.

Effectively, I want the text content treated as if it were wrapped in a CDATA block.

A: 

kXML 2 has the options expand-entity-ref and xml-roundtrip, which would allow you to do this.

x4u
A: 

Actually, that would be a highly questionable thing to do. After all, these two XML snippets are exactly the same thing from the XML perspective, if &quot; has been defined as meaning ":

<a>&quot;<b></b></a>

and

<a>"<b/></a>

And if &quot; has not been defined, the first input is not valid anyway. So, from the viewpoint of meaning, you are actually asking to get a conversion, not to avoid one.

Any parser that sees a difference between these inputs is not behaving as an XML parser. (And a program that relies on seeing a difference is not really dealing with XML; it is dealing with text files that have some imposed structure.)

I'm not sure what the output should be in any case. Would you want your Java code to see a text node which has a value of "&quot;"? But the XML input for that would have been &amp;quot;, and that's also what the XML output of such a text node would be.
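
To make the equivalence concrete, here is a minimal sketch using the JDK's built-in DOM parser. The class name EntityEquivalenceDemo and the helper textOf are made up for illustration; the point is only that both snippets yield the same text, and that a text value of "&quot;" comes from the input &amp;quot;.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EntityEquivalenceDemo {

    // Hypothetical helper: parses the XML string and returns the concatenated
    // text content of the root element.
    static String textOf(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String withEntity = textOf("<a>&quot;<b></b></a>");
        String withLiteral = textOf("<a>\"<b/></a>");
        String doubleEscaped = textOf("<a>&amp;quot;<b/></a>");

        // Both inputs give the same one-character text: a double quote.
        System.out.println(withEntity.equals(withLiteral));
        // Only the doubly escaped input gives the six characters &quot;
        System.out.println(doubleEscaped);
    }
}
```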

Christopher Creutzig
A: 

What would be the harm in letting the parser replace the entity and then re-escaping it later?
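
A rough sketch of what that re-escaping could look like, assuming you only care about the five predefined XML entities. The class XmlReEscape and the method escapeXml are hypothetical names, not part of any library. Note that this escapes every quote in the parsed text, so it cannot tell you whether the original file contained &quot; or a literal quote.

```java
public class XmlReEscape {

    // Hypothetical helper: replaces the five characters that have predefined
    // XML entities with their entity references.
    static String escapeXml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Text as a parser would hand it over, with entities already decoded.
        String parsed = "say \"hi\" & <go>";
        System.out.println(escapeXml(parsed));
    }
}
```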

dsalo