ansaurus

Question

In XSLT, how do you select/copy part of the document as text only?

Answer 1

A:

Try this, but note that it won't handle nested CDATA correctly:

<xsl:text disable-output-escaping="yes">&lt;</xsl:text>
<xsl:text>![CDATA[</xsl:text>

<xsl:copy-of select="..." />

<xsl:text>]]</xsl:text>
<xsl:text disable-output-escaping="yes">&gt;</xsl:text>

Pavel Minaev 2009-11-16 23:38:32

On reflection, that might work in a sick sort of way. But it is too late to remove my downvote.

ddaa 2009-11-16 23:42:41

It will work so long as the processor supports disabling output escaping (which is not mandatory). I also wasn't quite correct about the severity of the problem with nested CDATA: in XSLT 1.0, a processor can never _output_ CDATA (even if the input had CDATA). It has to escape using character entities instead, unless `cdata-section-elements` is used to tell it to do otherwise.

Pavel Minaev 2009-11-16 23:47:02
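For reference, the `cdata-section-elements` behaviour mentioned in the comment above can be sketched as follows. This is a minimal illustration, not part of the original answer; the element name `raw` and the sample text are hypothetical.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- "raw" is a hypothetical element name: text node children of any
       <raw> element in the result are written as CDATA sections -->
  <xsl:output method="xml" cdata-section-elements="raw"/>

  <xsl:template match="/">
    <raw>
      <!-- Sample text only; a real stylesheet would select from the input -->
      <xsl:text>1 &lt; 2 &amp; 3</xsl:text>
    </raw>
  </xsl:template>

</xsl:stylesheet>
```

With the `xml` output method, a conforming processor should serialize the text inside `raw` as `<![CDATA[1 < 2 & 3]]>` rather than with entity escapes. This only affects text nodes, so elements copied with `xsl:copy-of` would still come out as markup, not as text.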

Well, that's an interesting stunt. This sort of thing is liable to summon the Dark One, but the alternative would be insanely complex.

ddaa 2009-11-16 23:54:20

Answer 2

+1 A:

That's a strange requirement.

Since XSLT works on a parsed document model, you cannot do this reliably. In particular, the distinction between equivalent notations will necessarily be lost. Equivalent notations include things like <tag></tag> versus <tag/>, or é versus &#233;.

That said, a general approach that might work would be to use the mode attribute of xsl:template and xsl:apply-templates to switch to a mode that explicitly renders all elements as text. In effect you would be writing an XML serializer in XSLT.
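A minimal sketch of such a mode, assuming XSLT 1.0, might look like the following. The mode name `as-text`, the wrapper element `result`, and the path `/doc/snippet` are hypothetical and only there for illustration. The sketch deliberately ignores character escaping, namespace declarations, comments, and processing instructions.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml"/>

  <!-- Hypothetical entry point: render the children of /doc/snippet as text -->
  <xsl:template match="/">
    <result>
      <xsl:apply-templates select="/doc/snippet/node()" mode="as-text"/>
    </result>
  </xsl:template>

  <!-- Elements: write start and end tags as literal text,
       then recurse into attributes and children in the same mode -->
  <xsl:template match="*" mode="as-text">
    <xsl:text>&lt;</xsl:text>
    <xsl:value-of select="name()"/>
    <xsl:apply-templates select="@*" mode="as-text"/>
    <xsl:text>&gt;</xsl:text>
    <xsl:apply-templates select="node()" mode="as-text"/>
    <xsl:text>&lt;/</xsl:text>
    <xsl:value-of select="name()"/>
    <xsl:text>&gt;</xsl:text>
  </xsl:template>

  <!-- Attributes: name="value", preceded by a space (value is NOT escaped here) -->
  <xsl:template match="@*" mode="as-text">
    <xsl:text> </xsl:text>
    <xsl:value-of select="name()"/>
    <xsl:text>="</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>"</xsl:text>
  </xsl:template>

  <!-- Text nodes: copied through as-is (also NOT escaped here) -->
  <xsl:template match="text()" mode="as-text">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

Because the angle brackets are emitted as text nodes, the output serializer escapes them, so the `result` element ends up containing the markup as plain character data rather than as child elements. Empty elements are always written as a start and end tag pair, which is one instance of the lost notation distinction.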

One issue though is that you would have to double-escape special characters such as <>"' when present in attribute values and text nodes. And XSLT is quite inefficient at this sort of string munging.
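To give an idea of the string munging involved, here is a sketch of escaping in XSLT 1.0, which has no built-in replace function. The template names `replace` and `escape` and the mode name `as-text` are hypothetical. Only `&` and `<` are handled; `>` and the quote characters would need further chained calls of the same kind.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Recursive search and replace; $from must not be the empty string -->
  <xsl:template name="replace">
    <xsl:param name="text"/>
    <xsl:param name="from"/>
    <xsl:param name="to"/>
    <xsl:choose>
      <xsl:when test="contains($text, $from)">
        <xsl:value-of select="substring-before($text, $from)"/>
        <xsl:value-of select="$to"/>
        <xsl:call-template name="replace">
          <xsl:with-param name="text" select="substring-after($text, $from)"/>
          <xsl:with-param name="from" select="$from"/>
          <xsl:with-param name="to" select="$to"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Escape "&" first, so the "&" introduced for "<" is not escaped again -->
  <xsl:template name="escape">
    <xsl:param name="text"/>
    <xsl:variable name="amp-done">
      <xsl:call-template name="replace">
        <xsl:with-param name="text" select="$text"/>
        <xsl:with-param name="from" select="'&amp;'"/>
        <xsl:with-param name="to" select="'&amp;amp;'"/>
      </xsl:call-template>
    </xsl:variable>
    <xsl:call-template name="replace">
      <xsl:with-param name="text" select="$amp-done"/>
      <xsl:with-param name="from" select="'&lt;'"/>
      <xsl:with-param name="to" select="'&amp;lt;'"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Text nodes rendered in the hypothetical "as-text" mode get escaped -->
  <xsl:template match="text()" mode="as-text">
    <xsl:call-template name="escape">
      <xsl:with-param name="text" select="."/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
```

Each replaced character costs one recursive template call, which is why this gets slow on large text nodes; some processors may also hit recursion depth limits.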

Another issue would be rendering namespace prefixes reasonably. You can almost certainly do it, but that would be quite horrible.

ddaa 2009-11-16 23:49:23
