Your problem most likely arises from a default (XHTML) namespace in the source XHTML file (which you have not shown us, so this is a guess at best).
Can someone explain the correct way to remove the div with class "foo" and all its children from the identity transform?
Here is how to do this in case a default namespace is present:
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity rule: copy every node and attribute unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override: an empty template drops the matched div and its subtree -->
  <xsl:template match="xhtml:div[@class='foo']"/>
</xsl:stylesheet>
When this transformation is applied to the following XHTML document:
<html xmlns="http://www.w3.org/1999/xhtml">
  <div class="class1">
    <p>Text1</p>
  </div>
  <div class="foo">
    <p>Text foo</p>
  </div>
  <div class="class2">
    <p>Text2</p>
  </div>
</html>
the wanted, correct result is produced:
<html xmlns="http://www.w3.org/1999/xhtml">
  <div class="class1">
    <p>Text1</p>
  </div>
  <div class="class2">
    <p>Text2</p>
  </div>
</html>
A namespace prefix is necessary in the template's match pattern because XPath treats any unprefixed name as being in "no namespace". A match pattern with unprefixed names therefore matches nothing here: it specifies elements in "no namespace", whereas all the elements of the source document are in the XHTML namespace.
In case there is no default namespace in the source document, the transformation can be simplified:
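A minimal sketch of the simplified stylesheet, assuming it is the same as the one above with the xhtml namespace declaration removed and the match pattern written without a prefix:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity rule: copy every node and attribute unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Unprefixed name: matches only a div that is in no namespace -->
  <xsl:template match="div[@class='foo']"/>
</xsl:stylesheet>
```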
When this transformation is applied to the following XML document (note that it does not define a default namespace):
<html>
  <div class="class1">
    <p>Text1</p>
  </div>
  <div class="foo">
    <p>Text foo</p>
  </div>
  <div class="class2">
    <p>Text2</p>
  </div>
</html>
the wanted, correct result is produced:
<html>
  <div class="class1">
    <p>Text1</p>
  </div>
  <div class="class2">
    <p>Text2</p>
  </div>
</html>
Both transformations use the identity rule to copy every node of the document, plus a second template that overrides the identity rule for the nodes to be removed: xhtml:div[@class='foo'] when the document is in the XHTML namespace, div[@class='foo'] when it is in no namespace. This second template is empty (has no body), which means that the matched node and the subtree rooted in it are not processed at all (ignored) and thus do not appear in the output.
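If you cannot tell in advance whether the source document declares the default namespace, one possible variation (a sketch, not part of the two transformations above) is to test the element's local name, which ignores whatever namespace the element is in:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity rule: copy every node and attribute unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Matches a div with class "foo" in any namespace, or in none -->
  <xsl:template match="*[local-name()='div' and @class='foo']"/>
</xsl:stylesheet>
```

The trade-off is that this also removes a matching div from any unrelated namespace, so the prefixed pattern remains the more precise choice when the namespace is known.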