I have an ASPX page that builds an XmlDocument object from SQL data and then transforms it into another XML document (an RSS feed) using an XSLT file with XPathNavigator and XslCompiledTransform. Occasionally the data contains smart quotes (\u2019), which causes an error: "Unable to translate Unicode character \u2019 at index 947 to specified code page". I don't fully understand how the encoding settings interact. Is there a way to prevent this without checking for these characters in all the data as I create the XML attributes?

My XSLT file looks like this...

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" version="1.0" encoding="iso-8859-1"/>

I've tried changing the xsl:output encoding to utf-8 and utf-16 but still get the same problem. Any ideas?

Here's my code if that helps...

XmlDocument xdoc = new XmlDocument();
XmlNode xnode = requests.XMLNode(xdoc, imageType, Request, promotionPageId, eventPageId);
xdoc.AppendChild(xnode);

Response.Clear();
Response.ContentType = "text/xml";
Response.AddHeader("Content-Type", "text/xml");

if (xsltFile != string.Empty)
{
    XPathNavigator xnav = xdoc.CreateNavigator();
    XslCompiledTransform xslTransform = new XslCompiledTransform();
    xslTransform.Load(Server.MapPath(string.Format("~/xslt/{0}.xslt", xsltFile)));
    // xslTransform.OutputSettings.Encoding holds the encoding declared in xsl:output
    xslTransform.Transform(xnav, null, Response.OutputStream);
}
else
{
    xdoc.Save(Response.OutputStream);
}

Response.End();
A: 

What's the document encoding of the input XML your XSL is working on? You should be able to set that, then the XSL will know what to expect.

AmbroseChapel
+1  A: 

Your transform is working fine. The problem is that the transform is emitting a character that isn't supported by the content encoding of the output stream. Set the ContentEncoding on the HttpResponse to Encoding.Unicode (the .NET name for UTF-16) and this problem should go away.

Robert Rossney
... or `Encoding.UTF8`, which is usually preferred because it produces smaller output for Latin-script text (including English). (I’m amazed this isn’t already the default. It should be!)
Timwi
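
To illustrate the two points above (ISO-8859-1 cannot represent \u2019, and UTF-8 is usually more compact than UTF-16 for English text), here is a minimal standalone sketch. It is not taken from the original page: the class name `EncodingDemo`, the helper `CanEncode`, and the sample string are hypothetical and exist only for illustration.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical helper: returns true if every character of 'text'
    // can be represented in the named encoding.
    public static bool CanEncode(string text, string encodingName)
    {
        // Use exception fallbacks so an unmappable character raises an
        // error instead of being silently replaced with '?'.
        Encoding strict = Encoding.GetEncoding(
            encodingName,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Sample text containing U+2019 (right single quotation mark).
        string text = "It\u2019s";

        Console.WriteLine(CanEncode(text, "iso-8859-1")); // False
        Console.WriteLine(CanEncode(text, "utf-8"));      // True

        // Byte counts for the same four characters.
        Console.WriteLine(Encoding.UTF8.GetByteCount(text));    // 6
        Console.WriteLine(Encoding.Unicode.GetByteCount(text)); // 8
    }
}
```

In the page from the question, this would presumably correspond to setting `Response.ContentEncoding = Encoding.UTF8;` before calling `Transform`, and declaring `encoding="utf-8"` in `xsl:output` so the two agree. Treat that as an assumption to verify against your own setup rather than a confirmed fix.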