I have a simple XML file like so:

<?xml version="1.0" encoding="UTF-8"?>
<foo attr="blah &#176; blah"/>

When I load it into the .NET XmlDocument and issue a Save, i.e.:

Dim xmlDoc As New XmlDocument()
xmlDoc.Load("c:\temp\bar.xml")
xmlDoc.Save("c:\temp\bad.xml")

the new XML file contains the resolved &#176; reference, i.e. a literal degree sign. This then breaks the final black box application I'm trying to load the XML into.

I've tried playing with the encoding, to little effect. Is it possible for the parser to just echo what came in, without resolving the entities? Interestingly, it doesn't resolve &amp;#176;

+1  A: 

XmlDocument.Load unescapes character references. I've been playing around with it too and can't find any easy way to stop that behavior.

A small hack would be to do something like this:

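// Re-encode every attribute value and every text-only element.
// HtmlEncode turns the degree sign back into &#176;; Save() then escapes
// the "&", so the saved file ends up containing &amp;#176;.
foreach (XmlNode xn in xdoc.SelectNodes("descendant-or-self::*"))
{
  foreach (XmlAttribute attr in xn.Attributes)
  {
    attr.Value = System.Web.HttpUtility.HtmlEncode(attr.Value);
  }
  // Skip empty elements (so <foo/> is not rewritten as <foo></foo>)
  // and elements that contain child markup.
  if (xn.HasChildNodes && !xn.InnerXml.Contains("<"))
  {
    xn.InnerText = System.Web.HttpUtility.HtmlEncode(xn.InnerText);
  }
}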

Run this before you call .Save(). That's the best I could come up with without spending all week on this.
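Another thing that might be worth trying, offered only as a sketch and not something tested against your black box: save through an XmlWriter whose output encoding is ASCII. The writer cannot emit the degree sign directly in that encoding, so it should fall back to a numeric character reference. The in-memory document and the MemoryStream below are stand-ins for your files. As far as I know the reference comes out in hexadecimal form (&#xB0;) and the declaration names the ASCII encoding, so this only helps if the consuming application accepts both, which is an assumption.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

class Program
{
    static void Main()
    {
        // Stand-in for xmlDoc.Load("c:\temp\bar.xml").
        var doc = new XmlDocument();
        doc.LoadXml("<foo attr=\"blah &#176; blah\"/>");

        // Ask the writer for ASCII output; characters outside ASCII
        // must then be written as numeric character references.
        var settings = new XmlWriterSettings { Encoding = Encoding.ASCII };

        // Write to a Stream (or a file path), not a TextWriter: with a
        // TextWriter the writer uses that TextWriter's encoding instead.
        using (var stream = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(stream, settings))
            {
                doc.Save(writer);
            }
            Console.WriteLine(Encoding.ASCII.GetString(stream.ToArray()));
        }
    }
}
```

Unlike the hack above, this leaves the document's values untouched and only changes how they are serialized, so the result still parses back to a degree sign.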

That's very interesting. In the end, I went with &amp;#176; since this was preserved AND loaded correctly by my black box application.
ankh