ansaurus

Question

How to prevent System.Xml.XmlDocument from escaping attribute values

Answer 1

+3 A:

Why do you need it not to apply that escaping?

Any normal parser should then apply the appropriate "unescaping" when it parses it. It sounds like you're trying to test the resulting XML document as a plain-text document, which is rarely a good idea. XML documents should almost always be fed to XML parsers in the next step, at which point this isn't an issue.
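To illustrate the point about unescaping, here is a minimal sketch. The document strings, the attribute name attr, and the helper ReadAttr are made up for the example and do not come from the question. It loads the same attribute once with &gt; and once with a literal >, then compares what the parser hands back.

```csharp
using System;
using System.Xml;

static class EscapingDemo
{
    // Hypothetical helper: parses the XML and returns the value of the
    // root element's "attr" attribute as the parser sees it.
    public static string ReadAttr(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.GetAttribute("attr");
    }

    static void Main()
    {
        string escaped = ReadAttr("<root attr=\"a &gt; b\" />");
        string literal = ReadAttr("<root attr=\"a > b\" />");

        Console.WriteLine(escaped);            // a > b
        Console.WriteLine(escaped == literal); // True
    }
}
```

Both spellings give the same attribute value once parsed, so the escaping only matters if something reads the file as raw text.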

I don't know of any way of preventing the .NET XML libraries from doing this, and I'd be somewhat surprised if they had such a facility.

Jon Skeet 2009-11-12 11:10:45

I'm indeed reading the XML file in a text editor (it's supposed to be human-readable, isn't it?). Well, it's possible I'm seeing a problem where there is none. Thanks for your answer.

Vinzz 2009-11-12 11:16:00

@Vinzz: Yes, XML is supposed to be human-comprehensible. But it's still *not* supposed to be treated as plain text. Don't let the fact that you can open it in a text editor distract you.

Tomalak 2009-11-12 12:15:21

Answer 2

+3 A:

"Which is the very thing I'd want to prevent."

Really? It isn't generally important at all whether that escaping is applied; the XML infoset for either is the same.

"I am frankly a bit surprised that the document loads at all."

> is a perfectly valid character to include in an attribute value. The only place > may need to be &-escaped in XML is in a ]]> sequence in text content, due to an obscure and silly rule in the spec.

To avoid having to think about the problem, many XML serialisers habitually escape > anywhere in text content or attribute values.

The Canonical XML specification defines one particular way of serialising an XML document so that the output can be compared as a simple string; for example, it states exactly how attributes should be ordered. Canonical XML requires > to be escaped in text content, but leaves it unescaped in attribute values. So if you used a Canonical XML serialiser to output your document you'd get the result you expected for that particular value. (I can't guarantee it'd look how you want for other examples though.)

You can get a canonicaliser in .NET using XmlDsigC14NTransform (or maybe XmlDsigC14NWithCommentsTransform), something like:

// requires: using System.IO; using System.Security.Cryptography.Xml;
XmlDsigC14NTransform transform = new XmlDsigC14NTransform(false);
transform.LoadInput(doc);
Stream stream = (Stream) transform.GetOutput(typeof(Stream));
// write stream to file
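
A slightly fuller sketch of the same idea, in case a complete program is easier to try. The class name CanonicalDemo, the helper Canonicalize, and the sample document are assumptions made for illustration. On .NET Framework this needs a reference to the System.Security assembly; on newer .NET versions the types live in the System.Security.Cryptography.Xml package.

```csharp
using System;
using System.IO;
using System.Security.Cryptography.Xml;
using System.Text;
using System.Xml;

static class CanonicalDemo
{
    // Hypothetical helper: returns the Canonical XML (C14N) form of the input.
    public static string Canonicalize(string xml)
    {
        var doc = new XmlDocument();
        doc.PreserveWhitespace = true; // keep whitespace-only text nodes as written
        doc.LoadXml(xml);

        // false = leave comments out of the canonical output
        var transform = new XmlDsigC14NTransform(false);
        transform.LoadInput(doc);

        // The transform hands back its output as a stream of UTF-8 bytes.
        using (var stream = (Stream)transform.GetOutput(typeof(Stream)))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Sample input with > escaped in both an attribute and text content.
        Console.WriteLine(Canonicalize("<root attr=\"a &gt; b\">x &gt; y</root>"));
    }
}
```

Going by the Canonical XML rules described above, the attribute should come out with a literal > while the text content keeps &gt;. To write a file instead, pass the returned string to File.WriteAllText, or copy the stream straight into a FileStream.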

bobince 2009-11-12 11:37:49
