I have created a stylesheet that is supposed to selectively copy the contents of an XML document so that I can strip out data that we do not need. I have provided 2 examples below and the stylesheet that we are currently using to do this. The stylesheet works, but I think there is probably a better way of doing it because in the current version I check for the same thing in two different locations (author='John Doe').

The rules for including an XML element in the output are as follows:

  • If there is a notepad element within notepads that has the author text equal to 'John Doe' then include the notepads element in the output
  • If the notepad element has an author element with text equal to 'John Doe' then include all elements within the notepad element in the xml output.

Input Example #1

<transaction>
  <policy>
    <insco>CC</insco>
    <notepads>
      <notepad>
        <author>Andy</author>
      </notepad>
      <notepad>
        <author>John Doe</author>
      </notepad>
      <notepad>
        <author>Barney</author>
      </notepad>
    </notepads>
  </policy>
</transaction>

Expected result for Input #1

<transaction>
  <policy>
    <insco>CC</insco>
    <notepads>
      <notepad>
        <author>John Doe</author>
      </notepad>
    </notepads>
  </policy>
</transaction>

Input Example #2

<transaction>
  <policy>
    <insco>CC</insco>
    <notepads>
      <notepad>
        <author>Andy</author>
      </notepad>
    </notepads>
  </policy>
</transaction>

Expected Result for Input #2

<transaction>
  <policy>
    <insco>CC</insco>
  </policy>
</transaction>

Current Version of Stylesheet

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0" xmlns:fn="http://www.w3.org/2005/xpath-functions" exclude-result-prefixes="fn">
  <xsl:template match="*">
      <xsl:choose>
        <xsl:when test="name()='notepads'">
          <xsl:if test="/transaction/policy/insco='CC' and (notepad/author='John Doe')">
            <xsl:copy>
              <xsl:apply-templates />
            </xsl:copy>              
          </xsl:if>
        </xsl:when>
        <xsl:when test="name()='notepad'">
          <xsl:if test="author='John Doe'">
            <xsl:copy>
              <xsl:apply-templates />
            </xsl:copy>              
          </xsl:if>                
        </xsl:when>
        <xsl:otherwise>
          <xsl:copy>
            <xsl:apply-templates />
          </xsl:copy>
        </xsl:otherwise>
      </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
+1  A: 

Use separate templates; they're usually more efficient. Avoid name() checks: they're slow and unreliable (they don't work well with prefixes and namespaces):

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="*">
        <xsl:copy>
            <xsl:apply-templates />
        </xsl:copy>
    </xsl:template>

    <xsl:template match="notepads">
        <xsl:if test="(ancestor::policy/insco='CC') and (notepad/author='John Doe')">
            <xsl:copy>
                <xsl:apply-templates />
            </xsl:copy>
        </xsl:if>
    </xsl:template>

    <xsl:template match="notepad">
        <xsl:if test="author='John Doe'">
            <xsl:copy>
                <xsl:apply-templates />
            </xsl:copy>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>
Lucero
Thanks for the suggestion. That does look a lot cleaner than the current version.
jwmajors81
+1  A: 

I can think of two ways to do this.

1) identity template, hard coded author name:

<xsl:stylesheet 
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
  </xsl:template>

  <!-- notepad elements generally get removed... -->
  <xsl:template match="notepad" />

  <!-- ...unless their author is 'John Doe' -->
  <xsl:template match="notepad[author='John Doe']">
    <xsl:copy-of select="." />
  </xsl:template>

</xsl:stylesheet>
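
One thing to be aware of with approach 1 as written: <notepads> itself still goes through the identity template, so for Input #2 a <notepads> element without any <notepad> children would remain in the output. A minimal sketch of a variant that also drops the container, while testing the author in only one place, could look like this (the variable name keep is just for illustration):

```xml
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <!-- identity template: copies everything not handled below -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
  </xsl:template>

  <!-- the author is tested once; the result decides both whether
       <notepads> is kept and which <notepad> children are copied -->
  <xsl:template match="notepads">
    <xsl:variable name="keep" select="notepad[author='John Doe']" />
    <xsl:if test="$keep">
      <xsl:copy>
        <xsl:apply-templates select="@*" />
        <xsl:copy-of select="$keep" />
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

The whitespace text nodes inside <notepads> are not copied here, so adding <xsl:output indent="yes" /> may be useful if the layout of the result matters.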

2) modified identity template, XSL key, parameterized author name:

<xsl:stylesheet 
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <xsl:param name="theAuthor" select="'John Doe'" />

  <xsl:key 
    name="kNotepad" match="notepad[author]" 
    use="concat(generate-id(..), '|', author)" 
  />

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="
        node()[not(self::notepad)]
        |key('kNotepad', concat(generate-id(), '|', $theAuthor))
        |@*" 
      />
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

The second approach requires a little explanation:

  • the <xsl:key> indexes all <notepad> nodes that have an <author> by their parent's unique ID and the author name
  • let's say the unique ID of <notepads> is 'id0815', then the key to the <notepad>s you are interested in would be 'id0815|John Doe'
  • the identity template copies every node that is passed through it. It is modified in a way that it does not pass every node it finds through itself, but rather only:
    • any node that is not a <notepad>: node()[not(self::notepad)]
    • any attribute: @*
    • any node that is returned by the key.
  • the call to key() only ever returns anything on <notepads> elements, because the lookup value contains the current node's unique ID and only parents of <notepad> nodes are indexed
  • so when the template is currently processing a <notepads> element ('id0815' in our case), key() will return its 'John Doe' children only; in all other cases it will turn up empty
  • in contrast to solution 1) this one can be fed with a parameter, changing its behavior without changing its code
  • it is worth noting that everything stays in input document order
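
The point about the parameter can be illustrated without leaving XSLT. How a value is passed in from outside depends on the XSLT processor, but a wrapper stylesheet can also override the default. A sketch, assuming the stylesheet from approach 2 is saved under the hypothetical name notepad-filter.xsl:

```xml
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <!-- hypothetical file name for the stylesheet from approach 2 -->
  <xsl:import href="notepad-filter.xsl" />

  <!-- this declaration has higher import precedence than the imported
       default, so notepads by 'Andy' are kept instead of 'John Doe' -->
  <xsl:param name="theAuthor" select="'Andy'" />
</xsl:stylesheet>
```

The imported templates and the key are used unchanged; only the value of $theAuthor differs.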
Tomalak
Thank you for your answer. I don't know if the added complexity is worth it (the name shouldn't change or if it does it will change every couple years), but I can definitely see the usefulness of the technique you have displayed here.
jwmajors81
The complexity of both approaches is roughly comparable - of course depending on your familiarity with XSLT. In any case, you decide. ;)
Tomalak
@Tomalak: It seems to me that you "lost" the `/transaction/policy/insco='CC'` check...
Lucero
@Lucero: Yeah, this was not part of the written requirement and I kinda overlooked it in the supplied code sample. Not sure why this is in the example anyway. A template match="transaction" that checks for this and calls apply-templates conditionally would fix it.
Tomalak
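
The fix described in the last comment could be sketched as follows, applied to approach 1. This is only one reading of the comment: it assumes that nothing below <transaction> should be copied unless insco is 'CC', whereas the stylesheet in the question used that check only to decide whether <notepads> is kept.

```xml
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <!-- identity template -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
  </xsl:template>

  <!-- assumption: children of <transaction> are processed
       only when the policy's insco is 'CC' -->
  <xsl:template match="transaction">
    <xsl:copy>
      <xsl:apply-templates select="@*" />
      <xsl:if test="policy/insco='CC'">
        <xsl:apply-templates select="node()" />
      </xsl:if>
    </xsl:copy>
  </xsl:template>

  <!-- notepad elements generally get removed... -->
  <xsl:template match="notepad" />

  <!-- ...unless their author is 'John Doe' -->
  <xsl:template match="notepad[author='John Doe']">
    <xsl:copy-of select="." />
  </xsl:template>
</xsl:stylesheet>
```

With an insco other than 'CC' this produces an empty <transaction> element. If only <notepads> should depend on the check, an empty template with match="notepads[not(../insco='CC')]" would be closer to the original behavior.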