Currently I'm writing XHTML in an XmlDocument. This works perfectly, but I'm stuck on one problem. Some XmlText nodes can contain things like &nbsp;. When I write such nodes to a stream, the writer uses the InnerXml value instead of the InnerText value. As a result the output is wrong: it contains &amp;nbsp; instead of &nbsp;. How can I use XmlWriter and XmlDocument without this escaping when writing to a stream? I just want unescaped output.
If you use XmlWriter.WriteRaw, it won't perform any escaping - it assumes you've got raw XML.
For example:
using System;
using System.Xml;

class Test
{
    static void Main()
    {
        using (XmlWriter writer = XmlWriter.Create(Console.Out))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("root");
            // WriteRaw emits the string as-is, so the entity reference is not escaped
            writer.WriteRaw("<element>&nbsp;</element>");
            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }
}
Output:
<?xml version="1.0" encoding="IBM437"?><root><element>&nbsp;</element></root>
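To make the difference concrete, here is a minimal sketch that writes the same string once with WriteString and once with WriteRaw. The class name RawVersusString and the helper Write are made up for this illustration; only the XmlWriter calls come from the framework.

```csharp
using System;
using System.IO;
using System.Xml;

class RawVersusString
{
    // Hypothetical helper: wraps whatever 'body' writes in a <root> element
    // and returns the resulting markup as a string.
    public static string Write(Action<XmlWriter> body)
    {
        var sw = new StringWriter();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (XmlWriter writer = XmlWriter.Create(sw, settings))
        {
            writer.WriteStartElement("root");
            body(writer);
            writer.WriteEndElement();
        }
        return sw.ToString();
    }

    static void Main()
    {
        // WriteString treats its argument as text, so the ampersand is escaped.
        Console.WriteLine(Write(w => w.WriteString("&nbsp;")));
        // WriteRaw treats its argument as markup and copies it through unchanged.
        Console.WriteLine(Write(w => w.WriteRaw("&nbsp;")));
    }
}
```

The first line printed is what the question describes (&amp;nbsp;), the second is the unescaped form. Note that WriteRaw does no validation at all, so it is up to the caller to make sure the result is still well-formed.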
You're almost certainly trying to solve the wrong problem here. If you want text with non-breaking spaces, then you should use the non-breaking space character. In a C# string literal you can write it as the escape sequence \u00A0, for example:
var xmldoc = new XmlDocument();
XmlElement test = xmldoc.CreateElement("test");
xmldoc.AppendChild(test);
XmlText nbsp = xmldoc.CreateTextNode("\u00A0");
test.AppendChild(nbsp);
HTML entities like &nbsp; are just a way to encode such characters in a non-Unicode text file. You shouldn't be using them when constructing an XML DOM. By the way, if you force .NET to write the above DOM to an ASCII encoded file (via the proper XmlWriterSettings) then it will probably write the non-breaking space character as &#xA0;. In a UTF-8 encoded file (the default) it will just appear as a space.
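If you want to check the encoding behaviour yourself, something along these lines should do. It is only a sketch: EncodingDemo and Serialize are names invented for the example, and the exact form of the character reference in the ASCII case is whatever your .NET version chooses to emit.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

class EncodingDemo
{
    // Hypothetical helper: saves the document into memory using the given
    // encoding and returns the written bytes decoded back to a string.
    public static string Serialize(XmlDocument doc, Encoding encoding)
    {
        var settings = new XmlWriterSettings
        {
            Encoding = encoding,
            OmitXmlDeclaration = true
        };
        using (var stream = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                doc.Save(writer);
            }
            return encoding.GetString(stream.ToArray());
        }
    }

    static void Main()
    {
        var doc = new XmlDocument();
        XmlElement test = doc.CreateElement("test");
        doc.AppendChild(test);
        test.AppendChild(doc.CreateTextNode("\u00A0"));

        // ASCII cannot represent U+00A0, so the writer has to fall back
        // to a numeric character reference.
        Console.WriteLine(Serialize(doc, Encoding.ASCII));
        // UTF-8 (without a byte order mark here) stores the character directly.
        Console.WriteLine(Serialize(doc, new UTF8Encoding(false)));
    }
}
```

Either output loads back into an XmlDocument with the same single U+00A0 character in the text node, which is the point: the DOM content is identical and only the serialized form differs.") throw new Exception("unexpected UTF-8 output");
</test>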
If you force certain literal character sequences to appear in the XML output, then you risk creating invalid XML that cannot be loaded by conforming XML processors. For example, try to load <test>&nbsp;</test> into an empty XmlDocument. This will throw an exception. To be fair, you can declare such entities, and the XHTML DTD does so. But I hope you see my point.
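A quick way to see the failure, as a sketch (UndeclaredEntityDemo and TryLoad are hypothetical names, used only for this example):

```csharp
using System;
using System.Xml;

class UndeclaredEntityDemo
{
    // Hypothetical helper: returns true if the markup loads into a fresh
    // XmlDocument, false if the parser rejects it.
    public static bool TryLoad(string xml)
    {
        var doc = new XmlDocument();
        try
        {
            doc.LoadXml(xml);
            return true;
        }
        catch (XmlException ex)
        {
            Console.WriteLine("Rejected: " + ex.Message);
            return false;
        }
    }

    static void Main()
    {
        // nbsp is not one of the predefined XML entities and no DTD
        // declares it here, so the parser refuses the document.
        Console.WriteLine(TryLoad("<test>&nbsp;</test>"));
        // A numeric character reference needs no declaration.
        Console.WriteLine(TryLoad("<test>&#xA0;</test>"));
    }
}
```")) throw new Exception("undeclared entity should be rejected");
if (!UndeclaredEntityDemo.TryLoad("<test>&#xA0;</test>")) throw new Exception("character reference should load");
</test>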
edit: XmlDocument is doing its job correctly. If it didn't escape characters such as & < > then you could create invalid XML that's impossible to load again. To force an XML entity in the output you should use XmlDocument.CreateEntityReference. The bug is in whatever code is using entities in XmlText nodes instead of generating XmlEntityReference nodes.
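As a rough illustration of the CreateEntityReference approach (the class name EntityReferenceDemo, the method Build, and the element name p are arbitrary choices for the example):

```csharp
using System;
using System.Xml;

class EntityReferenceDemo
{
    // Hypothetical helper: builds <p> containing text, an entity reference
    // node, and more text, then returns the serialized markup.
    public static string Build()
    {
        var doc = new XmlDocument();
        XmlElement p = doc.CreateElement("p");
        doc.AppendChild(p);

        p.AppendChild(doc.CreateTextNode("a"));
        // An XmlEntityReference node is written out as &name; and is not
        // escaped the way an ampersand inside a text node would be.
        p.AppendChild(doc.CreateEntityReference("nbsp"));
        p.AppendChild(doc.CreateTextNode("b"));

        return doc.OuterXml;
    }

    static void Main()
    {
        Console.WriteLine(Build());
    }
}
```

Keep in mind that the result is only loadable by a parser that knows the nbsp entity, for instance through the XHTML DOCTYPE, so this is for cases where the entity reference really has to appear in the output.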