I'm working with PHP 5, and I need to transform XML of the following form:

<list>
    <item label="(1)">some text</item>
    <item label="(2)">
        <anotherNode>some text</anotherNode>
        <item label="a">some text</item>
        <item label="b">some text</item>          
    </item>
</list>

Into something like this:

<list>
    <item label="(1)">some text</item>
    <item label="(2)">
        <anotherNode>some text</anotherNode>
        <list> <!-- opening new wrapper node-->
            <item label="a">some text</item>
            <item label="b">some text</item>
        </list> <!-- closing new wrapper node-->
    </item>
</list> 

As you can see above, I need to add a wrapper node around any 'item' nodes that are not already wrapped by a 'list' node.

What are possible solutions for transforming the source XML into the target XML?

UPDATED:

Note 1: Any single <item> node or group of <item> nodes needs to be wrapped by a <list> node if it's not wrapped already.

Note 2: Order of the content needs to be maintained.

Note 3: If there are <item> nodes both before and after <anotherNode>, it should transform this:

<list>
    <item label="(1)">some text</item>
    <item label="(2)">
        <item label="a">some text</item>
        <item label="b">some text</item>          
        <anotherNode>some text</anotherNode>
        <item label="c">some text</item>
        <item label="d">some text</item>          
    </item>
</list>

into this:

<list>
    <item label="(1)">some text</item>
    <item label="(2)">
        <list> <!-- opening new wrapper node-->
            <item label="a">some text</item>
            <item label="b">some text</item>          
        </list> <!-- closing new wrapper node-->
        <anotherNode>some text</anotherNode>
        <list> <!-- opening new wrapper node-->
            <item label="c">some text</item>
            <item label="d">some text</item>
        </list> <!-- closing new wrapper node-->
    </item>
</list>

Thanks,

+3  A: 

This stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="@*|node()" name="identity">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()[1]" />
        </xsl:copy>
        <xsl:apply-templates select="following-sibling::node()[1]" />
    </xsl:template>
    <xsl:template match="*[not(self::list)]
                          /item[not(preceding-sibling::*[1][self::item])]">
        <list>
            <xsl:call-template name="identity"/>
        </list>
        <xsl:apply-templates select="following-sibling::node()
                                      [not(self::item)][1]" />
    </xsl:template>
    <xsl:template match="*[not(self::list)]
                          /item[not(following-sibling::*[1][self::item])]">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()[1]" />
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

Output:

<list>
    <item label="(1)">some text</item>
    <item label="(2)">
        <anotherNode>some text</anotherNode>
        <list>
            <item label="a">some text</item>
            <item label="b">some text</item>
        </list>
    </item>
</list>

Also, this stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="kItemByFirstSibling"
             match="item[preceding-sibling::*[1][self::item]]"
             use="generate-id(preceding-sibling::item
                               [not(preceding-sibling::*[1][self::item])][1])"/>
    <xsl:template match="@*|node()" name="identity">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" />
        </xsl:copy>
    </xsl:template>
    <xsl:template match="*[not(self::list)]/item"/>
    <xsl:template match="*[not(self::list)]
                          /item[not(preceding-sibling::*[1][self::item])]"
                  priority="1">
        <list>
            <xsl:for-each select=".|key('kItemByFirstSibling',generate-id())">
                <xsl:call-template name="identity"/>
            </xsl:for-each>
        </list>
    </xsl:template>
</xsl:stylesheet>

Note: The first stylesheet uses the most fine-grained traversal (it will wrap any node after the first item). The second stylesheet uses a fully recursive identity transform.

Edit: Addressing the new requirement, with the new input, both stylesheets output:

<list>
    <item label="(1)">some text</item>
    <item label="(2)">
        <list>
            <item label="a">some text</item>
            <item label="b">some text</item>
        </list>
        <anotherNode>some text</anotherNode>
        <list>
            <item label="c">some text</item>
            <item label="d">some text</item>
        </list>
    </item>
</list>
Alejandro
+1 for the fine-grained solution.
Dimitre Novatchev
@Alejandro: see my comment to Dimitre... this works fine assuming all item elements that need to be wrapped, that are children of the same non-list element, are contiguous.
LarsH
@LarsH: As I wrote in the notes, the first stylesheet wraps **any node** after the first `item`. The second stylesheet wraps **any `item`** without a `list` parent, and the relative order (between siblings) will be the relative order of the first `item`. I think this is a fair question, but it only matters if the OP wants to output sibling `list` parents.
Alejandro
+5  A: 

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*" name="identity">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="item/item[1]">
  <list>
   <xsl:apply-templates mode="copy"
    select=".| following-sibling::item"/>
  </list>
 </xsl:template>

 <xsl:template match="item" mode="copy">
  <xsl:call-template name="identity"/>
 </xsl:template>

 <xsl:template match="item/item[not(position()=1)]"/>
</xsl:stylesheet>

when applied on the provided XML document:

<list>
    <item label="(1)">some text</item>
    <item label="(2)">
        <anotherNode>some text</anotherNode>
        <item label="a">some text</item>
        <item label="b">some text</item>
    </item>
</list>

produces the wanted, correct result:

<list>
   <item label="(1)">some text</item>
   <item label="(2)">
      <anotherNode>some text</anotherNode>
      <list>
         <item label="a">some text</item>
         <item label="b">some text</item>
      </list>
   </item>
</list>

Do note:

  1. The use and overriding of the Identity rule.

  2. The suppression of certain elements.

  3. The processing of certain elements using a different mode.

Update:

The OP has added additional requirements:

"In case there are item elements before anothernode and after it, then each such group of item elements must be enclosed in a separate list"

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kfollnonitem" match="item"
  use="generate-id(preceding-sibling::*[not(self::item)][1])"/>

 <xsl:key name="kprecnonitem" match="item"
  use="generate-id(following-sibling::*[not(self::item)][1])"/>

 <xsl:template match="node()|@*" name="identity">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="*[not(self::list)]/item[1]">
  <list>
   <xsl:apply-templates mode="copy"
    select="key('kprecnonitem',
                 generate-id(following-sibling::*[not(self::item)][1])
                 )"/>
  </list>
 </xsl:template>

 <xsl:template match=
  "*[not(self::list) and item]/*[not(self::item)]">
  <xsl:call-template name="identity"/>

  <list>
    <xsl:apply-templates mode="copy"
     select="key('kfollnonitem', generate-id())"/>
  </list>
 </xsl:template>

 <xsl:template match="item" mode="copy">
  <xsl:call-template name="identity"/>
 </xsl:template>

 <xsl:template match="item/item[not(position()=1)]"/>
</xsl:stylesheet>

when this transformation is performed against the following XML document:

<list>
    <item label="(1)">some text</item>
    <item label="(2)">
        <item label="a">some text</item>
        <item label="b">some text</item>
        <anotherNode>some text</anotherNode>
        <item label="c">some text</item>
        <item label="d">some text</item>
    </item>
</list>

the wanted, correct result is produced:

<list>
   <item label="(1)">some text</item>
   <item label="(2)">
      <list>
         <item label="a">some text</item>
         <item label="b">some text</item>
      </list>
      <anotherNode>some text</anotherNode>
      <list>
         <item label="c">some text</item>
         <item label="d">some text</item>
      </list>
   </item>
</list>
Dimitre Novatchev
@Dimitre: +1 for mode solution.
Alejandro
@Dimitre: what happens if there are `<item>` elements both before _and after_ the `<anotherNode>`? Your solution will put them all before `<anotherNode>`. @Benjamin, what **should** happen in such cases?
LarsH
@LarsH: Good observation. The desired order of siblings with other names and the list of items isn't specified. My solution preserves the relative (document) ordering between the first `item` and any non-item sibling. I think this is natural. In case another ordering is desired, it is easy to modify the solution to meet the new requirement.
Dimitre Novatchev
If there are <item> elements before and after <anotherNode> each of these groups of <item> should have their own <list> wrapper. The order of the content should be preserved.
Benjamin Ortuzar
@Benjamin-Ortuzar and @LarsH: I have updated my answer so that the solution fulfills the latest requirements. :)
Dimitre Novatchev
Sorry, much work today! I've updated mine, too.
Alejandro
A: 

You didn't address this in the original question, so it may not be required. But if the input has multiple sequences of <item> elements that need to be wrapped, that are separated from each other by other sibling elements, e.g.:

<list>
    <item label="(1)">some text</item>
    <item label="(2)">
        <item label="a">some text</item>
        <item label="b">some text</item>          
        <anotherNode>some text</anotherNode>
        <item label="c">some text</item>
        <item label="d">some text</item>          
    </item>
</list>

the earlier answers will, I believe, lump the <item> elements together, changing their order:

<list>
    <item label="(1)">some text</item>
    <item label="(2)">
        <list> <!-- opening new wrapper node-->
            <item label="a">some text</item>
            <item label="b">some text</item>          
            <item label="c">some text</item>
            <item label="d">some text</item>
        </list> <!-- closing new wrapper node-->
        <anotherNode>some text</anotherNode>
    </item>
</list> 

Do you want that, or do you want to wrap them separately, like this?

<list>
    <item label="(1)">some text</item>
    <item label="(2)">
        <list> <!-- opening new wrapper node-->
            <item label="a">some text</item>
            <item label="b">some text</item>          
        </list> <!-- closing new wrapper node-->
        <anotherNode>some text</anotherNode>
        <list> <!-- opening new wrapper node-->
            <item label="c">some text</item>
            <item label="d">some text</item>
        </list> <!-- closing new wrapper node-->
    </item>
</list> 

If the latter, it will probably be easiest to use an XSLT 2.0 <xsl:for-each-group group-adjacent="name()" /> construction. I don't know whether PHP 5 has XSLT 2.0 available, but if you can use such a thing, see this good article.
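
To make the "group adjacent siblings" idea concrete, here is a minimal sketch of the same logic outside XSLT, using only Python's standard library. It is an illustration of the grouping step, not an XSLT 2.0 stylesheet: the function name `wrap_item_runs` is hypothetical, and the sketch assumes compact input (no indentation whitespace between elements), since it does not try to tidy whitespace around the new wrappers.

```python
import xml.etree.ElementTree as ET
from itertools import groupby


def wrap_item_runs(elem):
    """Wrap every run of adjacent <item> children in a new <list> element.

    Children of an existing <list> are left untouched; every other element
    has its children regrouped in document order. Works bottom-up.
    """
    # Process descendants first so nested runs are wrapped as well.
    for child in elem:
        wrap_item_runs(child)

    # Items directly under a <list> are already wrapped.
    if elem.tag == "list":
        return elem

    # Detach the children, then re-attach them run by run.
    children = list(elem)
    for child in children:
        elem.remove(child)

    # groupby only merges *adjacent* entries with the same key, which is
    # the behaviour group-adjacent provides in XSLT 2.0.
    for is_item, run in groupby(children, key=lambda c: c.tag == "item"):
        if is_item:
            wrapper = ET.SubElement(elem, "list")
            wrapper.extend(list(run))
        else:
            elem.extend(list(run))
    return elem


if __name__ == "__main__":
    src = (
        '<list><item label="(1)">some text</item><item label="(2)">'
        '<item label="a">some text</item><item label="b">some text</item>'
        '<anotherNode>some text</anotherNode>'
        '<item label="c">some text</item><item label="d">some text</item>'
        '</item></list>'
    )
    print(ET.tostring(wrap_item_runs(ET.fromstring(src)), encoding="unicode"))
```

Because the grouping key only changes when a non-item sibling interrupts a run, each run gets its own wrapper and the original sibling order is preserved.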

LarsH
@LarsH I need to wrap them separately, like in your second example.
Benjamin Ortuzar