Given the following XML, I need to place the contents of the suffix tag (minus the suffix tag itself) directly at the end of the last non-empty text node of a quote-block (as noted with comments in the XML).

My current code mostly works (stylesheet below), but I can't seem to correctly limit the appended suffix content to just the last node in the quote-block.

When the appending code was limited to just handling text() it worked fine using the following if statement:

<xsl:if test="generate-id() = generate-id(key('kQbText', generate-id($qb))[last()])">

But it doesn't work with child nodes. Any ideas? This has me stumped.

XML:

<?xml version="1.0" encoding="utf-8"?>    
<paragraph>
    <para>
        <quote-block>
            <list prefix-rules="ordered">
                <item>
                    <para>
                        Lorem ipsum dolor sit amet, consectetur 
                        adipiscing elit. Fusce vel lorem purus, 
                        et scelerisque nibh:
                        <quote-block>
                            <quote-para>
                                "Suspendisse egestas fringilla purus. 
                                Aenean vitae augue vitae nibh convallis.
                            </quote-para>
                            <!-- last node of quote-block excluding suffix -->
                            <quote-para>
                                <emphasis strength="strong">Mperdiet vel ut 
                                orci. Ut sed neque id libero</emphasis>. 
                                cursus mattis. Phasellus eros leo, luctus 
                                in viverra dignissim."
                            </quote-para>
                            <suffix>(Emphasis added.)</suffix>
                        </quote-block>
                    </para>
                </item>             
                <item>
                    <!-- last node of quote-block excluding suffix -->
                    <para>convallis sit amet elit, mauris nisl arcu.</para>
                </item>
            </list>
            <suffix>(emphasis in original)</suffix>
        </quote-block>
    </para>
    <para>
        Curabitur suscipit, massa eu congue suscipit, tortor:
        <quote-block>
            <!-- last node of quote-block excluding suffix -->
            <quote-para>
                "estibulum quis suscipit purus. Proin ultricies 
                scelerisque egestas."
            </quote-para>
            <suffix>
                <note>
                    <note-para>Footnote text</note-para>
                </note>
            </suffix>
        </quote-block> 
        Nunc ac odio in turpis suscipit tristique.
    </para>
</paragraph>

Stylesheet:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"   
exclude-result-prefixes="xs"
version="2.0">

    <xsl:output method="xml" encoding="utf-8" indent="no"/>

    <!-- key to identify all non-empty, non-suffix text node descendants of
      a quote-block. We'll use that to pull out the "last one" later-on -->
    <xsl:key name="kQbText"
    match="*:quote-block//text()[not(normalize-space() = '' or parent::*:suffix)]"
    use="generate-id(ancestor::*:quote-block[1])"/>

    <!-- identity template to copy everything that is not otherwise handled -->
    <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
    </xsl:template>

    <!-- special handling for text nodes that are descendants of quote-blocks -->
    <xsl:template match="*:quote-block//text()[not(normalize-space() = '' or parent::*:suffix)]">
    <xsl:variable name="qb" select="ancestor::*:quote-block[1]"/>

    <!-- the text node gets copied regardless -->
    <xsl:copy-of select="."/>

    <!-- if it is the last non-empty text node, append all suffixes -->
    <!--<xsl:if test="generate-id() = generate-id(key('kQbText', generate-id($qb))[last()])">-->
      <xsl:for-each select="$qb/*:suffix">
        <xsl:text> </xsl:text>
        <!-- copy contents of suffix node without suffix tags, inc child nodes -->
        <xsl:copy-of select="node()"/>
      </xsl:for-each>
    <!--</xsl:if>-->
    </xsl:template>

    <!-- empty text nodes will be removed (all others are copied) -->
    <xsl:template match="text()[normalize-space() = '']"/>

    <!-- suffix nodes will be deleted-->
    <xsl:template match="*:suffix"/>
</xsl:stylesheet>
A: 

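Assuming XSLT 2.0, I think the following stylesheet does what you want. The key now excludes any text node with a suffix ancestor (ancestor::suffix) rather than only those whose parent is a suffix, so text nested inside suffix children, such as the note-para footnote, can no longer be picked as the "last" text node. The last-node test also moves into the match pattern, using the is operator: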

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0">

  <xsl:output method="xml" encoding="utf-8" indent="no"/>

  <xsl:key name="kQbText"
    match="quote-block//text()[not(normalize-space() = '' or ancestor::suffix)]"
    use="generate-id(ancestor::quote-block[1])"/>

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@*, node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="quote-block//text()[not(normalize-space() = '' or ancestor::suffix)][. is key('kQbText', generate-id(ancestor::quote-block[1]))[last()]]">
    <xsl:copy/>
    <xsl:copy-of select="ancestor::quote-block[1]/suffix/node()"/>
  </xsl:template>

  <xsl:template match="quote-block/suffix"/>

</xsl:stylesheet>
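
If you are restricted to XSLT 1.0, the same idea can probably be expressed without the is operator and the sequence comma, by falling back to the generate-id() comparison from your original xsl:if test. This is an untested sketch of that variant, not something I have run against your input:

```xml
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:output method="xml" encoding="utf-8" indent="no"/>

  <!-- non-empty text nodes of a quote-block, excluding anything inside a suffix -->
  <xsl:key name="kQbText"
    match="quote-block//text()[not(normalize-space() = '' or ancestor::suffix)]"
    use="generate-id(ancestor::quote-block[1])"/>

  <!-- identity template; XSLT 1.0 needs the union operator instead of the comma -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- node identity is compared via generate-id() instead of "is" -->
  <xsl:template match="quote-block//text()[not(normalize-space() = '' or ancestor::suffix)][generate-id() = generate-id(key('kQbText', generate-id(ancestor::quote-block[1]))[last()])]">
    <xsl:copy/>
    <xsl:copy-of select="ancestor::quote-block[1]/suffix/node()"/>
  </xsl:template>

  <xsl:template match="quote-block/suffix"/>

</xsl:stylesheet>
```

The namespace wildcard (*:quote-block) from your stylesheet is XSLT 2.0 only, so this variant assumes the elements are in no namespace, as in the sample input.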

With Saxon 9.2, when applied to the input you posted, the output is as follows:

<?xml version="1.0" encoding="utf-8"?><paragraph>
    <para>
        <quote-block>
            <list prefix-rules="ordered">
                <item>
                    <para>
                        Lorem ipsum dolor sit amet, consectetur 
                        adipiscing elit. Fusce vel lorem purus, 
                        et scelerisque nibh:
                        <quote-block>
                            <quote-para>
                                "Suspendisse egestas fringilla purus. 
                                Aenean vitae augue vitae nibh convallis.
                            </quote-para>
                            <!-- last node of quote-block excluding suffix -->
                            <quote-para>
                                <emphasis strength="strong">Mperdiet vel ut 
                                orci. Ut sed neque id libero</emphasis>. 
                                cursus mattis. Phasellus eros leo, luctus 
                                in viverra dignissim."
                            (Emphasis added.)</quote-para>

                        </quote-block>
                    </para>
                </item>             
                <item>
                    <!-- last node of quote-block excluding suffix -->
                    <para>convallis sit amet elit, mauris nisl arcu.(emphasis in original)</para>
                </item>
            </list>

        </quote-block>
    </para>
    <para>
        Curabitur suscipit, massa eu congue suscipit, tortor:
        <quote-block>
            <!-- last node of quote-block excluding suffix -->
            <quote-para>
                "estibulum quis suscipit purus. Proin ultricies 
                scelerisque egestas."

                <note>
                    <note-para>Footnote text</note-para>
                </note>
            </quote-para>

        </quote-block> 
        Nunc ac odio in turpis suscipit tristique.
    </para>
</paragraph>
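
Note that the output above has no separator before the appended suffix (for example "arcu.(emphasis in original)"), whereas your stylesheet emitted a space first. If you want that space back, a minimal sketch is to replace the last-text-node template in my stylesheet with the following, which reuses the xsl:for-each and xsl:text from your original code:

```xml
<xsl:template match="quote-block//text()[not(normalize-space() = '' or ancestor::suffix)][. is key('kQbText', generate-id(ancestor::quote-block[1]))[last()]]">
  <!-- copy the last non-empty text node itself -->
  <xsl:copy/>
  <!-- then append each suffix's contents, preceded by a single space -->
  <xsl:for-each select="ancestor::quote-block[1]/suffix">
    <xsl:text> </xsl:text>
    <xsl:copy-of select="node()"/>
  </xsl:for-each>
</xsl:template>
```

Because the text node is copied unchanged, any trailing whitespace it already contains (as in the indented quote-para elements) still precedes the added space.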
Martin Honnen
Your change is exactly what I needed, many thanks.
Mike