Hi,

I really need an answer to this question. I am working on a project that stores its pages as XML and uses XSLT to transform them into web pages. Here is a code sample:

 public function transform ($xml) {
  $proc = new XSLTProcessor;
  $proc->importStyleSheet ($this->xsl);
  $output = $proc->transformToXML ($xml);
  return $output;
 }

The $xml variable contains the web page in XML format, for example:

<?xml version="1.0" encoding="utf-8"?>
<page>
 <meta>
  <language>en</language>
  <title>Main page</title>
 </meta>

 <content>
<![CDATA[
  lots of content here along with <p>html</p>.
]]>
 </content>

</page>

So the web page is in XML and is passed to the transform function, which loads an XSL file to transform the XML into an HTML web page. However, the code above fails to output the content tag properly: XSLT for some reason doesn't seem to understand CDATA, and it encodes the HTML as entities.

So, the question is, how can I output, using XSLT, the HTML as HTML without getting it encoded into entities?

A: 

See if this helps. I'd put more in the answer here, but I think the link does a better job of explaining how to achieve such a transform result from XSLT using the LexEv XMLReader, a wrapper for the standard XMLReader used by XSLT and Saxon.

Ryan Lynch
A: 

I'm sure you don't have something like this in your XSLT:

<xsl:template match="content">
  <xsl:value-of select="." disable-output-escaping="yes" />
</xsl:template>
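
For context, here is a minimal sketch of how that template might fit into a complete transform in PHP. The stylesheet, the sample document, and all variable names are assumptions made up for illustration, not the asker's actual files. It needs the PHP xsl extension. Note that disable-output-escaping is an optional feature in XSLT 1.0, so a processor is allowed to ignore it; libxslt, which PHP uses, honors it when the result is serialized with transformToXml.

```php
<?php
// Hypothetical stylesheet: builds an HTML page and emits <content> unescaped.
$xslSource = <<<'XSL'
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="utf-8" />
  <xsl:template match="/page">
    <html>
      <head><title><xsl:value-of select="meta/title" /></title></head>
      <body><xsl:apply-templates select="content" /></body>
    </html>
  </xsl:template>
  <xsl:template match="content">
    <xsl:value-of select="." disable-output-escaping="yes" />
  </xsl:template>
</xsl:stylesheet>
XSL;

// Hypothetical page, modeled on the sample in the question.
$xmlSource = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<page>
 <meta>
  <language>en</language>
  <title>Main page</title>
 </meta>
 <content><![CDATA[lots of content here along with <p>html</p>.]]></content>
</page>
XML;

$xsl = new DOMDocument();
$xsl->loadXML($xslSource);

$xml = new DOMDocument();
$xml->loadXML($xmlSource);

$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);

// Serialize straight to a string so the unescaped text is written as markup.
$output = $proc->transformToXml($xml);
echo $output;
```

With the attribute removed from the xsl:value-of, the same script should print the paragraph as &lt;p&gt;html&lt;/p&gt; instead.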

XSLT "understands" CDATA perfectly well. More precisely, it is not concerned with CDATA at all; handling it is the task of the underlying XML parser, which turns it into a plain text value.

From an XSLT point of view, there is no way of knowing whether the string

"<div>bla &amp; bla</div>"

came out of

<xml>&lt;div&gt;bla &amp;amp; bla&lt;/div&gt;</xml>

or

<xml><![CDATA[<div>bla &amp; bla</div>]]></xml>

CDATA is merely a serialization convenience. The resulting infoset/DOM is the same. And unless you disable output escaping, XSLT correctly produces the following output from the above string:

&lt;div&gt;bla &amp;amp; bla&lt;/div&gt;

This is why you see HTML code on the rendered page.
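
The equivalence can be checked from PHP with a short sketch like the one below. The variable names are illustrative, and htmlspecialchars is used here only as a stand-in for the escaping an XSLT serializer applies to text nodes; it is not what XSLTProcessor calls internally.

```php
<?php
// The same text, serialized once with entities and once inside CDATA.
$escaped = new DOMDocument();
$escaped->loadXML('<xml>&lt;div&gt;bla &amp;amp; bla&lt;/div&gt;</xml>');

$cdata = new DOMDocument();
$cdata->loadXML('<xml><![CDATA[<div>bla &amp; bla</div>]]></xml>');

// After parsing, both documents carry the identical string value.
$a = $escaped->documentElement->textContent;
$b = $cdata->documentElement->textContent;

var_dump($a === $b);             // bool(true)
echo $a, "\n";                   // <div>bla &amp; bla</div>

// Escaping that string on output gives the entity-encoded form.
echo htmlspecialchars($a), "\n"; // &lt;div&gt;bla &amp;amp; bla&lt;/div&gt;
```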

Tomalak