I am having trouble transforming particular characters from an XML feed into XHTML.
I am using the following example to demonstrate the problem.
Here is my XML file:
<?xml version="1.0" encoding="UTF-8"?>
<paragraph>some text including the –, ã and ’ characters</paragraph>
Here is the XSLT I am applying:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"
encoding="UTF-8"
indent="yes"
doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" />
<xsl:template match="paragraph">
<html xmlns="http://www.w3.org/1999/xhtml">
<head></head>
<body>
<p><xsl:apply-templates/></p>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
Here is the resultant XHTML:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head></head>
<body>
<p>some text including the –, ã and ’ characters</p>
</body>
</html>
The –, ã and ’ characters from the original XML do not come through correctly; when I view the result they have been replaced with different characters.
First, I want to check whether something is wrong with my encoding settings that could be causing this issue.
Do I need to use entities to get the special characters to display correctly in XHTML? If so, how do I use them within an XSLT stylesheet, and do I need to know in advance every possible character that could appear in my XML feed?
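To help narrow this down, here is a minimal sketch of what I suspect might be happening. It assumes the transformed output really is valid UTF-8 and that whatever displays it is decoding the bytes as Windows-1252 instead. I have not confirmed that assumption. The variable names are only illustrative, and the sample string is the text from my `paragraph` element written with Unicode escapes.

```python
# Assumption (not confirmed): the output file is valid UTF-8, but the viewer
# decodes it as Windows-1252 (cp1252). This reproduces that mismatch.

# Sample text from the <paragraph> element: en dash, a-tilde, right single quote.
text = "some text including the \u2013, \u00e3 and \u2019 characters"

# What the XSLT processor would write with encoding="UTF-8".
utf8_bytes = text.encode("utf-8")

# What a viewer would show if it wrongly decoded those bytes as cp1252.
misread = utf8_bytes.decode("cp1252")

# ascii() escapes non-ASCII code points, so the printed result does not
# depend on the encoding of the terminal running this script.
print(ascii(text))
print(ascii(misread))

# Undoing the wrong decode recovers the original text, which would suggest
# the data itself is intact and only the declared/assumed charset is wrong.
recovered = misread.encode("cp1252").decode("utf-8")
print(recovered == text)
```

If the replaced characters I see match `misread`, that would point to the charset used when viewing the output rather than to the stylesheet, and entities would not be needed.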