I have code that is passed a string containing XML. This XML may contain one or more instances of &#x20; (a character reference for the space character). I have a requirement that these references should not be resolved (i.e. they should not be replaced with an actual space character).

Is there any way for me to achieve this?

Basically, given a string containing the XML:

<pattern value="[A-Z0-9&#x20;]" />

I do not want it to be converted to:

<pattern value="[A-Z0-9 ]" />

(What I am actually trying to achieve is to simply take an XML string and write it to a "pretty-printed" file. This is having the side-effect of resolving occurrences of &#x20; in the string to a single space character, which need to be preserved. The reason for this requirement is that the written XML document must conform to an externally-defined specification.)

I have tried creating a sub-class of XmlTextReader to read from the XML string and overriding the ResolveEntity() method, but this isn't called. I have also tried assigning a custom XmlResolver.

I have also tried, as suggested, to "double encode". Unfortunately, this has not had the desired effect, as the &amp; is not decoded by the parser. Here is the code I used:

string schemaText = @"...<pattern value=""[A-Z0-9&#x26;#x20;]"" />...";
XmlWriterSettings writerSettings = new XmlWriterSettings();
writerSettings.Indent = true;
writerSettings.NewLineChars = Environment.NewLine;
writerSettings.Encoding = Encoding.Unicode;
writerSettings.CloseOutput = true;
writerSettings.OmitXmlDeclaration = false;
writerSettings.IndentChars = "\t";

StringBuilder writtenSchema = new StringBuilder();
using ( StringReader sr = new StringReader( schemaText ) )
using ( XmlReader reader = XmlReader.Create( sr ) )
using ( TextWriter tr = new StringWriter( writtenSchema ) )
using ( XmlWriter writer = XmlWriter.Create( tr, writerSettings ) )
{
   XPathDocument doc = new XPathDocument( reader );
   XPathNavigator nav = doc.CreateNavigator();

   nav.WriteSubtree( writer );
}

The written XML ends up with:

<pattern value="[A-Z0-9&amp;#x20;]" />
A: 

If you want it to be preserved, you need to double-encode it: &amp;#x20;. The XML reader will resolve character and entity references; that's more or less how XML works.
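
To make the reader's behaviour concrete, here is a minimal sketch that parses both spellings and returns the attribute value as the application sees it. The class and method names (CharRefDemo, ReadValueAttribute) are made up for illustration and do not come from the question.

```csharp
using System;
using System.IO;
using System.Xml;

static class CharRefDemo
{
    // Hypothetical helper: parses the XML and returns the resolved
    // "value" attribute of the root element.
    public static string ReadValueAttribute(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();              // position on the root element
            return reader.GetAttribute("value"); // references are already resolved here
        }
    }
}
```

With `<pattern value="[A-Z0-9&#x20;]" />` this returns `[A-Z0-9 ]`: the reference has become a plain space. With the double-encoded `<pattern value="[A-Z0-9&#x26;#x20;]" />` it returns the literal text `[A-Z0-9&#x20;]`, because the resolved ampersand is not parsed a second time. Any writer that later serialises that value has to escape the ampersand again, which would explain an output containing `&amp;#x20;`.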

Williham Totland
A: 
<pattern value="[A-Z0-9&#x26;#x20;]" />

What I did above was replace "&" with "&#x26;", thereby escaping the ampersand.
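
Since the question reports that the escaped ampersand comes back out as &amp;#x20;, a different approach may be worth trying: hide the reference from the parser and restore it afterwards. The sketch below is not something proposed in this thread. It assumes the private-use character U+E000 never occurs in the input, and it only handles the exact spelling &#x20; (not &#32; or &#x0020;). The names CharRefPreserver, PrettyPrint and Placeholder are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.XPath;

static class CharRefPreserver
{
    // Assumption: this private-use character does not appear in the input XML.
    const string Placeholder = "\uE000";

    public static string PrettyPrint(string xml)
    {
        // 1. Mask the reference so the parser sees an ordinary character.
        string masked = xml.Replace("&#x20;", Placeholder);

        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;
        settings.IndentChars = "\t";
        settings.OmitXmlDeclaration = true; // kept short for the sketch

        // 2. Round-trip through the same reader/writer pipeline as the question.
        StringBuilder output = new StringBuilder();
        using (XmlReader reader = XmlReader.Create(new StringReader(masked)))
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            XPathNavigator nav = new XPathDocument(reader).CreateNavigator();
            nav.WriteSubtree(writer);
        }

        // 3. Put the reference back into the serialised text.
        return output.ToString().Replace(Placeholder, "&#x20;");
    }
}
```

Calling `CharRefPreserver.PrettyPrint("<pattern value=\"[A-Z0-9&#x20;]\" />")` should give back a pattern element whose attribute still contains `&#x20;`. The trade-off is that this is plain string substitution around the parser, so it relies entirely on the placeholder being absent from the real data.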

Delan Azabani
Thanks for your responses, guys. Unfortunately, I have not been able to get it to work. I have updated the question; maybe you could have a look and point out where I am going wrong? Thanks
Zoodor