I've got wads of autogenerated HTML doing stupid things like this:
<p>Hey it's <em>italic</em><em>italic</em>!</p>
And I'd like to mash that down to:
<p>Hey it's <em>italicitalic</em>!</p>
My first attempt was along these lines...
<xsl:template match="em/preceding::em">
<xsl:value-of select="$OPEN_EM"/>
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="em/following::em">
<xsl:apply-templates/>
<xsl:value-of select="$CLOSE_EM"/>
</xsl:template>
But apparently the XSLT spec in its grandmotherly kindness forbids the use of the standard XPath preceding or following axes in template match patterns. (And that would need some tweaking to handle three ems in a row anyway.)
Any solutions better than forgetting about doing this in XSLT and just running a replace('</em><em>', '') in $LANGUAGE_OF_CHOICE on the end result?
Rough requirements: should not combine two <em> if they are separated by anything (whitespace, text, tags), and while it doesn't have to merge them, it should at least produce valid XML if there are three or more <em> in a row. Handling tags nested within the ems (including other ems) is not required.
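For reference, the post-processing fallback would look roughly like this in Python. This is only a minimal sketch: merge_adjacent_em is a made-up name, and it assumes the generated <em> tags carry no attributes and have no whitespace inside the tags themselves.

```python
def merge_adjacent_em(html: str) -> str:
    """Collapse directly adjacent <em> elements by deleting the '</em><em>' seam."""
    # Plain string replacement: only an exact '</em><em>' with nothing in
    # between matches, so ems separated by whitespace, text, or other tags
    # are left untouched. Runs of three or more ems collapse into one.
    return html.replace("</em><em>", "")


if __name__ == "__main__":
    print(merge_adjacent_em("<p>Hey it's <em>italic</em><em>italic</em>!</p>"))
```

It meets the rough requirements above, but it works on the serialized output rather than on the tree, which is exactly why an XSLT 1.0 answer would be nicer.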
(And oh, I've seen http://stackoverflow.com/questions/1542775/how-to-merge-element-using-xslt, which is similar but not quite the same. XSLT 2 is regrettably not an option and the proposed solutions look hideously complex.)