ansaurus

Question

Can someone explain the correct way to remove an element and all its children from the identity transform?

Answer 1

Your problem most likely arises from a default (XHTML) namespace in the source XHTML file (which you have not shown us, so this is a guess at best).

Here is how to do this in case a default namespace is present:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xhtml="http://www.w3.org/1999/xhtml">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="xhtml:div[@class='foo']"/>
</xsl:stylesheet>

When this transformation is applied on the following XHTML document:

<html xmlns="http://www.w3.org/1999/xhtml">
    <div class="class1">
        <p>Text1</p>
    </div>
    <div class="foo">
        <p>Text foo</p>
    </div>
    <div class="class2">
        <p>Text2</p>
    </div>
</html>

the wanted, correct result is produced:

<html xmlns="http://www.w3.org/1999/xhtml">
   <div class="class1">
      <p>Text1</p>
   </div>
   <div class="class2">
      <p>Text2</p>
   </div>
</html>

A namespace prefix is necessary in the match pattern of the template because XPath treats any unprefixed name as belonging to "no namespace". A match pattern with unprefixed names therefore specifies nodes in "no namespace" and matches nothing here, since all the elements of the source document are in the XHTML namespace.
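To illustrate the point, here is a minimal sketch that is not part of the original answer. The pattern shown in the comment would match nothing in the XHTML document above, whereas a namespace-agnostic pattern built on the standard XPath function local-name() would match. The prefixed form shown earlier is usually preferable, because it states explicitly which namespace is meant.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <!-- Identity rule: copies every node and attribute as-is -->
 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <!-- An unprefixed pattern such as  div[@class='foo']  names a div in
      "no namespace", so it would never match the XHTML div elements. -->

 <!-- Namespace-agnostic alternative: compares only the local name.
      The class attribute is unprefixed and in no namespace, so @class works. -->
 <xsl:template match="*[local-name()='div' and @class='foo']"/>
</xsl:stylesheet>
```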

In case there is no default namespace in the source document, the transformation can be simplified:
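A sketch of the simplified transformation, reconstructed from the description at the end of this answer. The match pattern div[@class='foo'] is quoted there; the rest is assumed to be the stylesheet above with the xhtml namespace declaration and prefix removed.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Identity rule: copies every node and attribute as-is -->
 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <!-- Empty template: drops the matched div and its whole subtree -->
 <xsl:template match="div[@class='foo']"/>
</xsl:stylesheet>
```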

When this transformation is applied on the following XML document (note that it doesn't define a default namespace):

<html>
    <div class="class1">
        <p>Text1</p>
    </div>
    <div class="foo">
        <p>Text foo</p>
    </div>
    <div class="class2">
        <p>Text2</p>
    </div>
</html>

the wanted, correct result is produced:

<html>
   <div class="class1">
      <p>Text1</p>
   </div>
   <div class="class2">
      <p>Text2</p>
   </div>
</html>

Both transformations use the identity rule to copy every node of the document, plus a second template that overrides the identity rule for nodes matching "div[@class='foo']". This second template is empty (has no body), which means that the matched node and the subtree rooted in it are not processed at all (ignored) and thus will not appear in the output.

Dimitre Novatchev 2010-08-01 01:05:52

Your code works, but why doesn't this work: <xsl:output omit-xml-declaration="no" indent="yes" method="xml" doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN" />

jbtx 2010-08-02 00:29:53

I ended up using two stylesheets: one to strip the elements and a second to add the correct DTD statements.

jbtx 2010-08-02 00:57:43

@jbtx: Why, it works for me. I get at the start of the output: `<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">`. Maybe something is wrong with your XSLT processor? My result is produced by Saxon 6.5.4, MSXML 3-6, AltovaXML (XML-SPY), Saxon 9.1.07 and XML-SPY-XSLT2.0. If you are using one of the two available .NET XSLT processors, you have to fine-tune the settings of the `XmlWriter` that is passed as an argument to the `Transform()` method. Read your documentation.

Dimitre Novatchev 2010-08-02 02:21:21
