Hi,

I'm parsing a huge Word file of test descriptions and have a problem with the scope of nodes. Word basically produces a flat list of paragraphs, and I want to group them under parent nodes: for each node 'A', all the following nodes up to the next node 'A' should become children of that 'A'.

How can this be done with XSL?

Example: I've gotten to:

<A/>
<ab/>
<ac/>
<A/>
<ab/>
<ac/>

But need:

<A>
<ab/>
<ac/>
</A>
<A>
<ab/>
<ac/>
</A>

Thank you!

+3  A: 

If you mean to match all the nodes that follow an <A> but come before the next <A>, I think you can use something like this:

<xsl:template match="A">
  <xsl:copy>
    <!-- start of range -->
    <xsl:variable name="start" select="count(preceding-sibling::*) + 1" />
    <!-- end of range -->
    <xsl:variable name="stop">
      <xsl:choose>
        <!-- either just before the next A node -->
        <xsl:when test="following-sibling::A">
          <xsl:value-of select="count(following-sibling::A[1]/preceding-sibling::*) + 1" />
        </xsl:when>
        <!-- or all the rest -->
        <xsl:otherwise>
          <xsl:value-of select="count(../*) + 1" />
        </xsl:otherwise>
      </xsl:choose>
    </xsl:variable>

    <!-- this is for debugging only -->
    <xsl:attribute name="range">
      <xsl:value-of select="concat($start + 1, '-', $stop - 1)" />
    </xsl:attribute>

    <!-- copy all nodes in the calculated range -->
    <xsl:for-each select="../*[position() &gt; $start and position() &lt; $stop]">
      <xsl:copy-of select="." />
    </xsl:for-each>
  </xsl:copy>
</xsl:template>

For your input:

<root>
  <A />
  <ab />
  <ac />
  <A />
  <ab />
  <ac />
</root>

I get (I left the "range" attribute in to make the calculations visible):

<A range="2-3">
  <ab />
  <ac />
</A>
<A range="5-6">
  <ab />
  <ac />
</A>
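
The template above is only a fragment, so here is a sketch of a complete stylesheet around it. The match="/*" driver template and the xsl:output settings are my assumptions, not part of the answer. The debugging attribute is dropped and the for-each is folded into a single copy-of; the range calculation is unchanged.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- assumed driver: process only the A children of the document element,
       so ab/ac reach the output solely through the copy-of below -->
  <xsl:template match="/*">
    <xsl:copy>
      <xsl:apply-templates select="A"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="A">
    <xsl:copy>
      <!-- position of this A among its element siblings -->
      <xsl:variable name="start" select="count(preceding-sibling::*) + 1"/>
      <!-- position of the next A, or one past the last sibling -->
      <xsl:variable name="stop">
        <xsl:choose>
          <xsl:when test="following-sibling::A">
            <xsl:value-of select="count(following-sibling::A[1]/preceding-sibling::*) + 1"/>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="count(../*) + 1"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:variable>
      <!-- copy every sibling strictly between the two positions -->
      <xsl:copy-of select="../*[position() &gt; $start and position() &lt; $stop]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Without a driver like this, the built-in templates would also visit the <ab/> and <ac/> siblings and the whitespace between them, which is harmless for empty elements but would leak text if they had content.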
Tomalak
There is probably a nicer way to do it. I am quite curious what solutions other people find.
Tomalak
Using keys is generally more efficient and provides more compact solutions. See my answer.
Dimitre Novatchev
I ended up using a two-phase variant of this answer. Using the preceding-sibling axis, each child node would get an attribute, "belongs-to", and then I'd merge them in a second step. Thanks!
Hugo
I wonder why you chose to accept my answer, though. It's clearly inferior to Dimitre Novatchev's proposal.
Tomalak
+4  A: 

There is a simple and very powerful solution using keys.

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:key name="kFollowing" match="*[not(self::A)]"
  use="generate-id(preceding-sibling::A[1])"/>

    <xsl:template match="/*">
     <t>
       <xsl:apply-templates select="A"/>
     </t>
    </xsl:template>

    <xsl:template match="A">
     <A>
       <xsl:copy-of select=
          "key('kFollowing',generate-id())"/>
     </A>
    </xsl:template>
</xsl:stylesheet>

when applied on the original XML document:

<t>
    <A/>
    <ab/>
    <ac/>
    <A/>
    <ab/>
    <ac/>
</t>

produces the wanted result:

<t>
   <A>
      <ab/>
      <ac/>
   </A>
   <A>
      <ab/>
      <ac/>
   </A>
</t>

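Do note how the definition of the <xsl:key>, combined with the use of the key() function, makes it easy and natural to collect all sibling elements between two neighboring <A/> elements: every non-A element is indexed under the generated id of its nearest preceding <A/> sibling, so key('kFollowing', generate-id()) called on an <A/> returns exactly the elements that belong to it.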
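
If the real <A/> elements carry attributes, or the document element is not literally named <t>, the same key can be reused with xsl:copy in place of the literal result elements. This is only a sketch under those assumptions; the question does not say whether either applies.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <!-- same key as above: each non-A element is indexed under the
      generated id of its nearest preceding A sibling -->
 <xsl:key name="kFollowing" match="*[not(self::A)]"
  use="generate-id(preceding-sibling::A[1])"/>

 <!-- copy the document element under whatever name it has -->
 <xsl:template match="/*">
  <xsl:copy>
   <xsl:apply-templates select="A"/>
  </xsl:copy>
 </xsl:template>

 <!-- copy the A element itself, its attributes, then its group -->
 <xsl:template match="A">
  <xsl:copy>
   <xsl:copy-of select="@*"/>
   <xsl:copy-of select="key('kFollowing', generate-id())"/>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>
```

Elements that appear before the first <A/> have no preceding A sibling, so no A template ever looks them up and they are dropped from the output, just as in the stylesheet above.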

Dimitre Novatchev
Thanks for the answer. Sorry for not replying earlier, I had to put it on hold. This was the most elegant solution, but I could only almost get it to work: each A would contain all the ab/ac elements from the current node onward. Thanks!
Hugo
@Hugo This is a solution to the problem described, and it produces the wanted result. In case you have a different problem, please, do post it so that it can be solved. You should not have any problem applying this solution to the current problem -- it just produces the wanted result.
Dimitre Novatchev
+2  A: 

XSLT 2.0 solution:

<xsl:for-each-group select="*" group-starting-with="A">
  <xsl:element name="{name(current-group()[1])}">
    <xsl:copy-of select="current-group()[position() gt 1]"/>  
  </xsl:element>
</xsl:for-each-group>
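
The fragment above has to run with the parent of the <A/> elements as the context node. A minimal complete stylesheet might look like the following; the match="/*" template and the xsl:output settings are assumptions on my part, and an XSLT 2.0 processor (Saxon, for example) is required.

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <!-- assumed driver: group the children of the document element -->
 <xsl:template match="/*">
  <xsl:copy>
   <!-- a new group starts at every A child -->
   <xsl:for-each-group select="*" group-starting-with="A">
    <!-- wrapper element named after the first item of the group -->
    <xsl:element name="{name(current-group()[1])}">
     <!-- everything in the group except its first item -->
     <xsl:copy-of select="current-group()[position() gt 1]"/>
    </xsl:element>
   </xsl:for-each-group>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>
```

One thing to watch: if any elements precede the first <A/>, they form a group of their own whose first item is not an A, so the wrapper would take that element's name instead.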
jelovirt
Very pretty! Can't use it because of Ant, though. But thanks, got an upvote.
Hugo