Hi, I have a large number of XML files that often contain the same node multiple times (each time with different data). Example:

 <?xml version="1.0" encoding="UTF-8"?>  
    <SomeName>  
      <Node>
        DataA
     </Node>  
     <Node>
        DataB
     </Node>  
      <Node>
        DataC
     </Node>  
      <AnotherNode>
        DataD
     </AnotherNode>
      <AnotherNode>
        DataE
     </AnotherNode>
      <AnotherNode>
        DataF
     </AnotherNode>
     <SingleNode>
        DataG
     </SingleNode>
   </SomeName>  

The desired Output would be:

  <?xml version="1.0" encoding="UTF-8"?>  
    <SomeName>  
      <Node1>
        DataA
     </Node1>  
     <Node2>
        DataB
     </Node2>  
      <Node3>
        DataC
     </Node3>  
      <AnotherNode1>
        DataD
     </AnotherNode1>
      <AnotherNode2>
        DataE
     </AnotherNode2>
      <AnotherNode3>
        DataF
     </AnotherNode3>
     <SingleNode>
        DataG
     </SingleNode>
   </SomeName>  

The problem is that I don't have a list of all the duplicate node names, so I need the XSLT to run through all nodes and number only those that occur more than once. Is that possible?

Does anyone have a good idea on how to accomplish that?

Thanks!

+2  A: 

You can use count(preceding-sibling::*[name(.) = name(current())]) to get the number of preceding sibling elements with the same name as the context element (add 1 to turn it into a 1-based index), and <xsl:element name="{concat(name(.),'n')}" /> to create an element of the same name as the context element, with the letter 'n' appended to it. Note the curly braces: the name attribute is an attribute value template, so the expression is only evaluated when it is wrapped in {}. Combining these facts should allow you to achieve the effect you desire.
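One way the two pieces might be combined is sketched below. This is only an illustration, not the answerer's own code: the variable names vBefore and vAfter are made up for the example, and it assumes the elements to rename are the direct children of the root element, as in the question. An element is numbered only when at least one sibling shares its name, so unique elements such as SingleNode are copied unchanged.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="xml" indent="yes"/>

 <!-- Copy the root element itself and process its child elements. -->
 <xsl:template match="/*">
   <xsl:copy>
     <xsl:apply-templates select="*"/>
   </xsl:copy>
 </xsl:template>

 <xsl:template match="/*/*">
   <!-- Same-named siblings before and after the current element
        (vBefore and vAfter are hypothetical names for this sketch). -->
   <xsl:variable name="vBefore"
     select="count(preceding-sibling::*[name() = name(current())])"/>
   <xsl:variable name="vAfter"
     select="count(following-sibling::*[name() = name(current())])"/>

   <xsl:choose>
     <!-- The name occurs more than once: append a 1-based index. -->
     <xsl:when test="$vBefore + $vAfter > 0">
       <xsl:element name="{concat(name(), $vBefore + 1)}">
         <xsl:copy-of select="node()"/>
       </xsl:element>
     </xsl:when>
     <!-- The name is unique among the siblings: copy as is. -->
     <xsl:otherwise>
       <xsl:copy-of select="."/>
     </xsl:otherwise>
   </xsl:choose>
 </xsl:template>
</xsl:stylesheet>
```

This keeps the elements in their original document order, but every element scans its siblings, so the running time grows quadratically with the number of children.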

Paul Butcher
Using `count(preceding-sibling::*[something])` is quite inefficient (O(N^2)). See my answer for a more efficient solution.
Dimitre Novatchev
+1  A: 

Here is a complete solution. It is recommended to use the Muenchian method for grouping and not grouping based on count(preceding::*[someCondition]), which is grossly inefficient -- O(N^2).

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:key name="kElsByName"
  match="/*/*" use="name()"/>

 <xsl:template match="/*">
   <SomeName>
     <xsl:for-each select=
      "*[generate-id()
        =
         generate-id(key('kElsByName', name())[1])
        ]
      ">

        <xsl:variable name="vsameNamedNodes" select=
         "key('kElsByName', name())"/>

        <xsl:variable name="vNumSameNamedNodes" select=
         "count($vsameNamedNodes)"/>

        <xsl:for-each select="$vsameNamedNodes">

         <xsl:element name="{concat(name(),
                             substring(position(),
                                       1 div ($vNumSameNamedNodes > 1)
                                       )
                                    )
                             }">
           <xsl:copy-of select="node()"/>
         </xsl:element>
       </xsl:for-each>
     </xsl:for-each>
   </SomeName>
 </xsl:template>
</xsl:stylesheet>

when applied on the provided XML document:

    <SomeName>
      <Node>
        DataA
     </Node>
     <Node>
        DataB
     </Node>
      <Node>
        DataC
     </Node>
      <AnotherNode>
        DataD
     </AnotherNode>
      <AnotherNode>
        DataE
     </AnotherNode>
      <AnotherNode>
        DataF
     </AnotherNode>
     <SingleNode>
        DataG
     </SingleNode>
   </SomeName>

produces the wanted result:

<SomeName>
    <Node1>
        DataA
    </Node1>
    <Node2>
        DataB
    </Node2>
    <Node3>
        DataC
    </Node3>
    <AnotherNode1>
        DataD
    </AnotherNode1>
    <AnotherNode2>
        DataE
    </AnotherNode2>
    <AnotherNode3>
        DataF
    </AnotherNode3>
    <SingleNode>
        DataG
    </SingleNode>
</SomeName>
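The expression substring(position(), 1 div ($vNumSameNamedNodes > 1)) is compact but easy to misread. When the comparison is true it converts to 1, so the substring starts at character 1 and yields the whole position number; when it is false, 1 div 0 evaluates to Infinity in XPath 1.0 and the substring is empty. A more verbose sketch that should behave the same way is shown below. The names vGroup, vIsDuplicated and vSuffix are made up for this illustration, and xsl:copy is used for the root element instead of the literal SomeName so the root name does not have to be known in advance.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="xml" indent="yes"/>

 <!-- Index the children of the root element by element name. -->
 <xsl:key name="kElsByName" match="/*/*" use="name()"/>

 <xsl:template match="/*">
   <xsl:copy>
     <!-- Muenchian grouping: visit the first element of each name group. -->
     <xsl:for-each select=
      "*[generate-id() = generate-id(key('kElsByName', name())[1])]">

       <xsl:variable name="vGroup" select="key('kElsByName', name())"/>
       <xsl:variable name="vIsDuplicated" select="count($vGroup) > 1"/>

       <xsl:for-each select="$vGroup">
         <!-- Empty for a unique name, otherwise the position in the group. -->
         <xsl:variable name="vSuffix">
           <xsl:if test="$vIsDuplicated">
             <xsl:value-of select="position()"/>
           </xsl:if>
         </xsl:variable>

         <xsl:element name="{concat(name(), $vSuffix)}">
           <xsl:copy-of select="node()"/>
         </xsl:element>
       </xsl:for-each>
     </xsl:for-each>
   </xsl:copy>
 </xsl:template>
</xsl:stylesheet>
```

As in the transformation above, the output is grouped by element name, so elements whose names are interleaved in the input come out reordered, and attributes on the renamed elements are not copied.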
Dimitre Novatchev
Wow. That worked beautifully and quickly too! Thanks a lot for that XSLT! All I changed was adding method="xml" to the header to get XML output, but otherwise your solution is perfect. Thanks again!
Grinner
Actually, is it possible to change that so it does not rename unique nodes? Basically, have the XSLT leave the SingleNode alone? Thanks!
Grinner
@Grinner: yes, but it is inconvenient to write the new solution into a comment. I may edit my answer to include this last requirement.
Dimitre Novatchev
@Grinner: I edited my answer. The solution now produces exactly the output you want.
Dimitre Novatchev
Fantastic. Thank you very much. That did the trick perfectly.
Grinner