First of all, if someone has a different, perhaps shorter (or better), solution to the problem, it's welcome as well.


I'm trying to "simply" remove (almost) duplicate elements in XSLT. There are some (metadata) nodes I don't want to include when comparing, and I couldn't figure out how to do that in XSLT, so I thought I'd extend it with a function that removes these nodes. Like so:

<xsl:for-each select="abx:removeNodes(d/df600|d/df610|d/df611|d/df630|d/df650|d/df651|d/df655, '*[@key=&quot;i1&quot; or @key=&quot;i2&quot; or @key=&quot;db&quot;]')">
   <xsl:if test="not(node()=preceding-sibling::*)">
      blah
   </xsl:if>
</xsl:for-each>

And the extension, which doesn't work so well... (C#)

public XPathNodeIterator removeNodes(XPathNodeIterator p_NodeIterator, String removeXPath)
{
   Logger Logger = new Logger("xslt");
   Logger.Log("removeNodes(removeXPath={0}):", removeXPath);

   foreach (XPathNavigator CurrentNode in p_NodeIterator)
   {
      Logger.Log("removeNodes(): CurrentNode.OuterXml={0}.", CurrentNode.OuterXml);

      foreach (XPathNavigator CurrentSubNode in CurrentNode.Select(removeXPath))
      {
         Logger.Log("removeNodes(): CurrentSubNode.OuterXml={0}.", CurrentSubNode.OuterXml);
         // How do i delete this node!?
         //CurrentSubNode.DeleteSelf();
      }
   }

   return p_NodeIterator;
}

My initial approach using 'CurrentSubNode.DeleteSelf();' doesn't work: deleting a node while iterating makes the iteration lose its position in the XPathNavigator, so only the first node found by "removeXPath" gets deleted. Something like a DeleteAndMoveNext() would be nice, but there seems to be no such method...


Example data:

<df650>
  <df650 key="i1"> </df650>
  <df650 key="i2">0</df650>
  <df650 key="a">foo</df650>
  <df650 key="x">bar</df650>
  <df650 key="db">someDB</df650>
  <df650 key="id">b2</df650>
  <df650 key="dsname">someDS</df650>
</df650>

...and then another node that is identical if you ignore the meta fields (db, id, dsname):

<df650>
  <df650 key="i1"> </df650>
  <df650 key="i2">0</df650>
  <df650 key="a">foo</df650>
  <df650 key="x">bar</df650>
  <df650 key="db">someOtherDB</df650>
  <df650 key="id">b2</df650>
  <df650 key="dsname">someOtherDS</df650>
</df650>

The result should be...

<df650>
  <df650 key="i1"> </df650>
  <df650 key="i2">0</df650>
  <df650 key="a">foo</df650>
  <df650 key="x">bar</df650>
</df650>
A: 

You can easily do that in XSLT alone; an extension function is really not necessary. Consider this:

<!-- make a template that matches all nodes that could be removed -->
<xsl:template match="d/df600|d/df610|d/df611|d/df630|d/df650|d/df651|d/df655">
  <!-- check your condition for node removal, whatever it may be -->
  <xsl:if test="not(@key='i1' or @key='i2' or @key='db')">
    <!-- ...if it is *not* met, copy the node -->
    <xsl:copy-of select="." />
  </xsl:if>
  <!-- ...in all other cases, nothing happens, i.e. the node is removed -->
</xsl:template>
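
If the `key` attributes sit one level deeper, as in the example data in the question, the same idea can be combined with an identity template. The following is only a sketch under assumptions: it treats `db`, `id` and `dsname` as the metadata keys (taken from the question's example), and the `debug` parameter is a hypothetical switch for keeping those fields. It strips the metadata fields but does not perform the duplicate comparison itself.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- hypothetical switch: pass debug = true() from outside to keep the metadata fields -->
  <xsl:param name="debug" select="false()" />

  <!-- identity template: copy everything by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:template>

  <!-- metadata fields (assumed keys): copied only in debug mode, dropped otherwise -->
  <xsl:template match="*[@key='db' or @key='id' or @key='dsname']">
    <xsl:if test="$debug">
      <xsl:copy-of select="." />
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The more specific template wins over the identity template because a pattern with a predicate has a higher default priority than `node()`.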
Tomalak
Ah, thanks! My data looks a bit different though, which makes it a bit trickier. The nodes look like d/df600/df600/@i1 etc. Also, I would still like to output those nodes under the 'normal record view' in a 'debug record view' when in debug mode; perhaps I can use the @mode attribute for that?
RymdPung
@RymdPung: You can declare a `<xsl:param name="debug" select="false()" />`, so you can modify behavior from the outside. Just drop in an `<xsl:if test="$debug">` where appropriate. If you have trouble adapting my code to your needs, show your input XML and specify what it should look like under which circumstances.
Tomalak
The data is in a modified (don't ask me why) version of the MarcXML format (http://www.loc.gov/standards/marcxml/). I cannot post long messages in the comments, so I've edited my original message with data examples. The output of the two views is very different, so I think using XSLT's @mode to split the views is a bit cleaner.
RymdPung
A: 

The problem can be solved like this (however, it doesn't solve MY actual problem...).

  • Create a List of type XPathNavigator that will contain nodes you want to delete.
  • Add the nodes to this list instead of using DeleteSelf().
  • When done finding all nodes you want to delete, iterate through your List and delete the nodes. Since these nodes are Navigators, there is no issue with lost position.
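
The steps above could be sketched roughly as follows. This is a minimal illustration, not the code from the question: the class and method names (`RemoveNodesSketch`, `RemoveMatching`) are made up for the example, and it assumes an editable navigator, here one created from an `XmlDocument`.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public static class RemoveNodesSketch
{
    // Deletes every node matched by removeXPath (evaluated relative to root)
    // and returns how many nodes were deleted.
    public static int RemoveMatching(XPathNavigator root, string removeXPath)
    {
        // Pass 1: collect. Clone each navigator so it keeps pointing at its own node.
        var toDelete = new List<XPathNavigator>();
        foreach (XPathNavigator match in root.Select(removeXPath))
        {
            toDelete.Add(match.Clone());
        }

        // Pass 2: delete. The Select() iteration is finished, so nothing loses its position.
        foreach (XPathNavigator node in toDelete)
        {
            node.DeleteSelf();
        }

        return toDelete.Count;
    }

    public static void Main()
    {
        // Shortened version of the example data from the question.
        var doc = new XmlDocument();
        doc.LoadXml(
            "<df650>" +
            "<df650 key=\"a\">foo</df650>" +
            "<df650 key=\"db\">someDB</df650>" +
            "<df650 key=\"id\">b2</df650>" +
            "<df650 key=\"dsname\">someDS</df650>" +
            "</df650>");

        int removed = RemoveMatching(
            doc.CreateNavigator(),
            "/df650/*[@key='db' or @key='id' or @key='dsname']");

        Console.WriteLine(removed);
        Console.WriteLine(doc.OuterXml);
    }
}
```

Note that `DeleteSelf()` only works when the navigator's `CanEdit` property is true. Navigators handed to an XSLT extension function may be read-only, in which case this approach would not apply directly.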

I gave up on trying to paste the code in after 10 minutes...

RymdPung
A: 

Thanks for the hint RymdPung, I was able to remove blank rows in a repeating section using your List suggestion.

I added a reference to the System.Collections.Generic namespace in my code.

Here is the method I created to scan through a set of nodes, identify the ones I want to remove, and then delete these in a separate loop.

private void deleteEmptyRows(string path)
{
    XPathNodeIterator nodesToCheck = MainDataSource.CreateNavigator().Select(path, NamespaceManager);
    List<XPathNavigator> nodesToDelete = new List<XPathNavigator>();

    // First pass: collect the empty nodes without modifying the document.
    foreach (XPathNavigator currentItem in nodesToCheck)
        if (currentItem.Value.Trim().Length == 0)
            nodesToDelete.Add(currentItem);

    // Second pass: delete them once the iteration is finished.
    foreach (XPathNavigator deleteMe in nodesToDelete)
        deleteMe.DeleteSelf();
}
Ivan Wilson