+3  Q: 

Concatenating XML

I have three files of xml

<step>
<Products>
    <Product UserTypeID="Country">
        <Name>Cyprus</Name>
        <Product UserTypeID="Resort">
            <Name>Argaka</Name>
            <Product UserTypeID="Property">
                <Name>Villa Tester</Name>
            </Product>
        </Product>
        <Product UserTypeID="Resort">
            <Name>Coral Bay</Name>
            <Product UserTypeID="Property">
                <Name>1</Name>
            </Product>
            <Product UserTypeID="Property">
                <Name>2</Name>
            </Product>
        </Product>
    </Product>
    <Product UserTypeID="Country">
        <Name>Greece</Name>
        <Product UserTypeID="Region">
            <Name>Corfu</Name>
            <Product UserTypeID="Resort">
                <Name>Aghios Stefanos</Name>
                <Product UserTypeID="Property">
                    <Name>Villa Joanna</Name>
                </Product>
                <Product UserTypeID="Property">
                    <Name>Villa Eleonas</Name>
                </Product>
            </Product>
            <Product UserTypeID="Resort">
                <Name>Kassiopi</Name>
                <Product UserTypeID="Property">
                    <Name>Villa 2</Name>
                </Product>
            </Product>
        </Product>
    </Product>
</Products>
</step>

<step>
<Products>
    <Product UserTypeID="Country">
        <Name>Cyprus</Name>
        <Product UserTypeID="Resort">
            <Name>Argaka</Name>
            <Product UserTypeID="Property">
                <Name>Villa Jaime</Name>
            </Product>
        </Product>
    </Product>
    <Product UserTypeID="Country">
        <Name>Greece</Name>
        <Product UserTypeID="Region">
            <Name>Corfu</Name>
            <Product UserTypeID="Resort">
                <Name>Acharavi</Name>
                <Product UserTypeID="Property">
                    <Name>Villa 1</Name>
                </Product>
                <Product UserTypeID="Property">
                    <Name>Villa 2</Name>
                </Product>
            </Product>
            <Product UserTypeID="Resort">
                <Name>Gouvia</Name>
                <Product UserTypeID="Property">
                    <Name>Villa De Bono</Name>
                </Product>
            </Product>
            <Product UserTypeID="Resort">
                <Name>Kassiopi</Name>
                <Product UserTypeID="Property">
                    <Name>Villa 1</Name>
                </Product>
            </Product>
        </Product>
    </Product>
</Products>
</step>

<step>
<Products>
    <Product UserTypeID="Country">
        <Name>Cyprus</Name>
        <Product UserTypeID="Resort">
            <Name>Aghia Marina</Name>
            <Product UserTypeID="Property">
                <Name>Villa Aghia Marina</Name>
            </Product>
        </Product>
        <Product UserTypeID="Resort">
            <Name>Coral Bay</Name>
            <Product UserTypeID="Property">
                <Name>Ascos Coral Villas</Name>
            </Product>
            <Product UserTypeID="Property">
                <Name>Coral Villa</Name>
            </Product>
            <Product UserTypeID="Property">
                <Name>Lella Villas</Name>
            </Product>
        </Product>
    </Product>
    <Product UserTypeID="Country">
        <Name>Greece</Name>
        <Product UserTypeID="Region">
            <Name>Corfu</Name>
            <Product UserTypeID="Resort">
                <Name>Acharavi</Name>
                <Product UserTypeID="Property">
                    <Name>Villa Angelos</Name>
                </Product>
                <Product UserTypeID="Property">
                    <Name>Villa Eleonas</Name>
                </Product>
            </Product>
            <Product UserTypeID="Resort">
                <Name>Aghios Stefanos</Name>
                <Product UserTypeID="Property">
                    <Name>Villa Joanna</Name>
                </Product>
                <Product UserTypeID="Property">
                    <Name>Villa Eleonas</Name>
                </Product>
            </Product>
            <Product UserTypeID="Resort">
                <Name>Kassiopi</Name>
                <Product UserTypeID="Property">
                    <Name>Villa Imerolia</Name>
                </Product>
                <Product UserTypeID="Property">
                    <Name>Test Property</Name>
                </Product>
            </Product>
        </Product>
    </Product>
</Products>
</step>

Each file contains the same top-level products (matched by ./Name) but with different sub-products (also matched by ./Name). I need to merge them into one tree with a single Product per distinct Name at each level, containing all of its sub-products merged by the same rule, so that I can output one structure.

I found an XSLT approach that builds a node-set from all the files, as below:

<xsl:variable name="step-output">
    <xsl:for-each select="/index/file">
        <xsl:copy-of select="document(.)" />
    </xsl:for-each>
</xsl:variable>
<xsl:variable name="step-products" select="exsl:node-set($step-output)//Products" />

But when I apply other templates to this node-set, each product name produces three products, i.e. Cyprus turns up three times.

Does anyone know how to do what I'm after? My result needs to be as follows:

<step>
<Products>
    <Product UserTypeID="Country">
        <Name>Cyprus</Name>
        <Product UserTypeID="Resort">
            <Name>Aghia Marina</Name>
            <Product UserTypeID="Property">
                <Name>Villa Aghia Marina</Name>
            </Product>
        </Product>
        <Product UserTypeID="Resort">
            <Name>Argaka</Name>
            <Product UserTypeID="Property">
                <Name>Villa Jaime</Name>
            </Product>
            <Product UserTypeID="Property">
                <Name>Villa Tester</Name>
            </Product>
        </Product>
        <Product UserTypeID="Resort">
            <Name>Coral Bay</Name>
            <Product UserTypeID="Property">
                <Name>Ascos Coral Villas</Name>
            </Product>
            <Product UserTypeID="Property">
                <Name>Coral Villa</Name>
            </Product>
            <Product UserTypeID="Property">
                <Name>Lella Villas</Name>
            </Product>
            <Product UserTypeID="Property">
                <Name>1</Name>
            </Product>
            <Product UserTypeID="Property">
                <Name>2</Name>
            </Product>
        </Product>
    </Product>
    <Product UserTypeID="Country">
        <Name>Greece</Name>
        <Product UserTypeID="Region">
            <Name>Corfu</Name>
            <Product UserTypeID="Resort">
                <Name>Acharavi</Name>
                <Product UserTypeID="Property">
                    <Name>Villa Angelos</Name>
                </Product>
                <Product UserTypeID="Property">
                    <Name>Villa Eleonas</Name>
                </Product>
                <Product UserTypeID="Property">
                    <Name>Villa 1</Name>
                </Product>
                <Product UserTypeID="Property">
                    <Name>Villa 2</Name>
                </Product>
            </Product>
            <Product UserTypeID="Resort">
                <Name>Aghios Stefanos</Name>
                <Product UserTypeID="Property">
                    <Name>Villa Joanna</Name>
                </Product>
                <Product UserTypeID="Property">
                    <Name>Villa Eleonas</Name>
                </Product>
            </Product>
            <Product UserTypeID="Resort">
                <Name>Gouvia</Name>
                <Product UserTypeID="Property">
                    <Name>Villa De Bono</Name>
                </Product>
            </Product>
            <Product UserTypeID="Resort">
                <Name>Kassiopi</Name>
                <Product UserTypeID="Property">
                    <Name>Villa Imerolia</Name>
                </Product>
                <Product UserTypeID="Property">
                    <Name>Test Property</Name>
                </Product>
                <Product UserTypeID="Property">
                    <Name>Villa 1</Name>
                </Product>
                <Product UserTypeID="Property">
                    <Name>Villa 2</Name>
                </Product>
            </Product>
        </Product>
    </Product>
</Products>
</step>

A: 

Editing the text to create your files will work, but may be hard to maintain.

The easiest way would be to parse the XML of all 3 files into object form. Programmatically add the objects under a single parent node then regenerate a new XML file.

Does your environment make this an acceptable solution?

Jon Winstanley
@Jon, do you mean .NET or something similar? According to the question and its tags, he wants to do it using XSLT.
infant programmer
I presumed that there was some server-side code involved (.NET, Python, PHP, etc.). However, if this is not the case then my solution would not work.
Jon Winstanley
+4  A: 

Here is an XSLT 2.0 stylesheet that should do the job:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0">

  <xsl:output indent="yes"/>

  <xsl:template match="/">
    <step>
      <Products>
        <xsl:for-each-group select="document(index/file)/step/Products/Product" group-by="Name">
          <Product UserTypeID="{@UserTypeID}">
            <Name><xsl:value-of select="current-grouping-key()"/></Name>
            <xsl:for-each-group select="current-group()/Product" group-by="Name">
              <xsl:sort select="current-grouping-key()"/>
              <Product UserTypeID="{@UserTypeID}">
                <Name><xsl:value-of select="current-grouping-key()"/></Name>
                <xsl:for-each select="current-group()/Product">
                  <xsl:sort select="Name"/>
                  <xsl:copy-of select="."/>
                </xsl:for-each>
              </Product>
            </xsl:for-each-group>
          </Product>
        </xsl:for-each-group>
      </Products>
    </step>
  </xsl:template>

</xsl:stylesheet>

You need to run it against an index XML document with the structure

<index>
  <file>test2010020803.xml</file>
  <file>test2010020804.xml</file>
  <file>test2010020805.xml</file>
</index>

that lists the other files you want to process.
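
Before running the merge it can help to confirm that every listed file actually resolves. The following is only a sketch of such a check, not part of the solution itself: it assumes the index document shown above and input files rooted at step/Products. It relies on the rule in the XSLT 1.0 specification that document() called with a node-set loads one document per node and returns the union of the results.

```xml
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- One line per listed file: document(.) loads just that file -->
    <xsl:for-each select="index/file">
      <xsl:value-of select="."/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="count(document(.)/step/Products/Product)"/>
      <xsl:text> top-level products&#10;</xsl:text>
    </xsl:for-each>
    <!-- document(index/file) loads all listed files and returns the union -->
    <xsl:text>all files: </xsl:text>
    <xsl:value-of select="count(document(index/file)/step/Products/Product)"/>
    <xsl:text> top-level products&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

If a file shows a count of 0, the path in the index is probably wrong or the file has a different root element. Relative file names are resolved against the location of the index document.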

XSLT 2.0 stylesheets can be executed with Saxon 9, which comes in a .NET and a Java version, so it runs wherever at least Java 1.5 or .NET 2.0 is available or can be installed. Other options are the AltovaXML tools (Windows only) and Gestalt.
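
The XSLT 2.0 stylesheet above groups a fixed number of levels and copies everything below them unchanged. If the nesting depth varies, the grouping could instead be written as a recursive function. The following is a sketch only, not tested against a processor: the function name my:merge and the namespace urn:example:merge are made up for illustration, and it assumes that every Product carries only a UserTypeID attribute, a Name child and nested Product children.

```xml
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:my="urn:example:merge"
  exclude-result-prefixes="my"
  version="2.0">

  <xsl:output indent="yes"/>

  <!-- Hypothetical helper: merges sibling Product elements that share a Name,
       then applies itself to the children of each merged group -->
  <xsl:function name="my:merge" as="element(Product)*">
    <xsl:param name="products" as="element(Product)*"/>
    <xsl:for-each-group select="$products" group-by="Name">
      <xsl:sort select="current-grouping-key()"/>
      <!-- The context item here is the first Product of the current group -->
      <Product UserTypeID="{@UserTypeID}">
        <Name><xsl:value-of select="current-grouping-key()"/></Name>
        <!-- Recurse; an empty sequence of children ends the recursion -->
        <xsl:sequence select="my:merge(current-group()/Product)"/>
      </Product>
    </xsl:for-each-group>
  </xsl:function>

  <xsl:template match="/">
    <step>
      <Products>
        <xsl:sequence select="my:merge(document(index/file)/step/Products/Product)"/>
      </Products>
    </step>
  </xsl:template>

</xsl:stylesheet>
```

Note that this sorts every level by Name, takes UserTypeID from the first Product in each group, and drops any other attributes or child elements a Product might have. It is run against the same index document as the stylesheet above.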

If you are tied to XSLT 1.0 then you can do it as follows, as long as you have exsl:node-set or similar support:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:exsl="http://exslt.org/common"
  exclude-result-prefixes="exsl"
  version="1.0">

  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:key name="k1" match="step/Products/Product" use="Name"/>
  <xsl:key name="k2" match="step/Products/Product/Product" use="concat(../Name, '|', Name)"/>

  <xsl:template match="/">
    <xsl:variable name="rtf">
      <xsl:copy-of select="document(index/file)/*"/>
    </xsl:variable>
    <step>
      <Products>
        <xsl:for-each select="exsl:node-set($rtf)/step/Products/Product[generate-id() = generate-id(key('k1', Name)[1])]">
          <Product UserTypeID="{@UserTypeID}">
            <xsl:copy-of select="Name"/>
            <xsl:for-each select="key('k1', Name)/Product[generate-id() = generate-id(key('k2', concat(../Name, '|', Name))[1])]">
              <xsl:sort select="Name"/>
              <Product UserTypeID="{@UserTypeID}">
                <xsl:copy-of select="Name"/>
                <xsl:for-each select="key('k2', concat(../Name, '|', Name))/Product">
                  <xsl:sort select="Name"/>
                  <xsl:copy-of select="."/>
                </xsl:for-each>
              </Product>
            </xsl:for-each>
          </Product>
        </xsl:for-each>
      </Products>
    </step>
  </xsl:template>

</xsl:stylesheet>

For grouping up to six levels deep, the keys would look like this:

  <xsl:key name="k1" match="step/Products/Product" use="Name"/>

  <xsl:key name="k2" match="step/Products/Product/Product" use="concat(../Name, '|', Name)"/>

  <xsl:key name="k3" match="step/Products/Product/Product/Product"
                     use="concat(../../Name, '|', ../Name, '|', Name)"/>

  <xsl:key name="k4" 
           match="step/Products/Product/Product/Product/Product"
           use="concat(../../../Name, '|', ../../Name, '|', ../Name, '|', Name)"/>

  <xsl:key name="k5" 
           match="step/Products/Product/Product/Product/Product/Product"
           use="concat(../../../../Name, '|', ../../../Name, '|', ../../Name, '|', ../Name, '|', Name)"/>

  <xsl:key name="k6" 
           match="step/Products/Product/Product/Product/Product/Product/Product"
           use="concat(../../../../../Name, '|', ../../../../Name, '|', ../../../Name, '|', ../../Name, '|', ../Name, '|', Name)"/>

That is all typed directly into the forum editor, so it could have bugs.
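
To show how the deeper keys plug into the grouping, here is a sketch of the XSLT 1.0 stylesheet above extended by one level using k3. It is equally untested and follows the same pattern: each level selects the child Product elements of the whole previous group and keeps only the first one per composite key. Further levels would repeat the pattern with k4, k5 and k6.

```xml
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:exsl="http://exslt.org/common"
  exclude-result-prefixes="exsl"
  version="1.0">

  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:key name="k1" match="step/Products/Product" use="Name"/>
  <xsl:key name="k2" match="step/Products/Product/Product" use="concat(../Name, '|', Name)"/>
  <xsl:key name="k3" match="step/Products/Product/Product/Product"
           use="concat(../../Name, '|', ../Name, '|', Name)"/>

  <xsl:template match="/">
    <!-- Copy all input documents into one tree so the keys can see every node -->
    <xsl:variable name="rtf">
      <xsl:copy-of select="document(index/file)/*"/>
    </xsl:variable>
    <step>
      <Products>
        <!-- Level 1: first Product per Name -->
        <xsl:for-each select="exsl:node-set($rtf)/step/Products/Product[generate-id() = generate-id(key('k1', Name)[1])]">
          <Product UserTypeID="{@UserTypeID}">
            <xsl:copy-of select="Name"/>
            <!-- Level 2: first child Product per parent Name + Name -->
            <xsl:for-each select="key('k1', Name)/Product[generate-id() = generate-id(key('k2', concat(../Name, '|', Name))[1])]">
              <xsl:sort select="Name"/>
              <Product UserTypeID="{@UserTypeID}">
                <xsl:copy-of select="Name"/>
                <!-- Level 3: same pattern, using k3 with three name parts -->
                <xsl:for-each select="key('k2', concat(../Name, '|', Name))/Product[generate-id() = generate-id(key('k3', concat(../../Name, '|', ../Name, '|', Name))[1])]">
                  <xsl:sort select="Name"/>
                  <Product UserTypeID="{@UserTypeID}">
                    <xsl:copy-of select="Name"/>
                    <!-- Level 4 and below are copied as they are, without merging -->
                    <xsl:for-each select="key('k3', concat(../../Name, '|', ../Name, '|', Name))/Product">
                      <xsl:sort select="Name"/>
                      <xsl:copy-of select="."/>
                    </xsl:for-each>
                  </Product>
                </xsl:for-each>
              </Product>
            </xsl:for-each>
          </Product>
        </xsl:for-each>
      </Products>
    </step>
  </xsl:template>

</xsl:stylesheet>
```

The deepest level handled is always copied unchanged, so duplicates at that level are not merged. To merge them as well, add one more nested xsl:for-each with the next key.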

Martin Honnen
+1. Regarding the use of the `document()` function: The spec states that if the argument is a node-set, it is converted to `string` beforehand, which means that the string value of all nodes after the first one is lost. What am I missing?
Tomalak
The XSLT 1.0 specification at http://www.w3.org/TR/xslt#document states: "When the document function has exactly one argument and the argument is a node-set, then the result is the union, for each node in the argument node-set, of the result of calling the document function with the first argument being the string-value of the node, and the second argument being a node-set with the node as its only member." So the result of document(index/file), where index/file is a node-set of three nodes, is the union of loading each of those three files.
Martin Honnen
Nice! I've overlooked that. Really good to know, thanks.
Tomalak
In your XSLT 1.0 code, exsl:node-set is useless. You can use document() directly in the for-each: <xsl:for-each select="document(index/file)/step/Products/Product[generate-id() = generate-id(key('k1', Name)[1])]">
Erlock
The XSLT 1.0 version worked a treat! Thanks for your help.
Designermonkey
Erlock, you are wrong. Muenchian grouping relies on keys, and keys work per input tree only. So the solution first needs to create a result tree fragment into which all input nodes from the different input trees are copied, then exsl:node-set is needed to convert that result tree fragment into a node-set, and only then can the key group all those nodes in one tree. If I don't build a result tree fragment first, I can't group the nodes with Muenchian grouping. And once I have a result tree fragment, I also need exsl:node-set to process those nodes with for-each.
Martin Honnen
Looking at the XSLT 1.0 template idea above, would it be possible to make this recursive, up to 6 levels deep, for the initial structure sample provided?
Designermonkey
With XSLT 2.0 you could rather easily write a recursive function that should work for any level of nesting (as long as the recursion does not cause a stack overflow). With XSLT 1.0, if you want to group six levels you need to define six keys, and I am not sure it is then possible to write a recursive template doing the grouping for those six levels, as each level needs a different expression for the key value. It might however be possible to write an XSLT 1.0 stylesheet that generates a second stylesheet with the code for grouping a certain number of levels.
Martin Honnen
Thanks Martin, I've written 6 named templates that would use the keys you describe, and from testing with two (using the provided keys) I can say it works for some levels and then breaks a little. Looking at the keys above and their use in the templates, would you know what key values and expressions to write for the 6 levels? If so, could you post them? I'm really new to the syntax and struggle a little to understand what to write.
Designermonkey
I will edit my answer as I am not sure how to properly format code samples in a comment.
Martin Honnen
Thanks very much for that, it works a treat. Thinking further: if the XML had attributes with unique IDs on each product node, i.e. id="" and parentid="", would that make the above a lot simpler?
Designermonkey