I have several xml files, the names of which are stored in another xml file.

I want to use xsl to produce a summary of the combination of the xml files. I remember there was a way to do this with the msxml extensions (I'm using msxml).

I know I can get the content of each file using select="document(filename)" but I'm not sure how to combine all these documents into one.

21-Oct-08: I should have mentioned that I want to do further processing on the combined xml, so it is not sufficient to just output it from the transform; I need to store it as a node set in a variable.

A: 

Have a look at the document() function documentation.

You can use document() to load further XML documents during the transformation. They are loaded as node sets. That means you would feed the XML file that lists the file names to the stylesheet as its input, and take it from there:

<xsl:copy-of select="document(@href)"/>
Tomalak
Thanks for that. I need to add some extra content at the beginning of each file's xml to identify which file it was from, so document() doesn't give me enough control. Thanks anyway, as I wasn't aware of those extensions to document().
Richard A
+2  A: 

Here is just a small example of what you could do:

file1.xml:

<foo>
<bar>Text from file1</bar>
</foo>

file2.xml:

<foo>
<bar>Text from file2</bar>
</foo>

index.xml:

<index>
<filename>file1.xml</filename>
<filename>file2.xml</filename>
</index>

summarize.xsl:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:exsl="http://exslt.org/common"
    extension-element-prefixes="exsl">

  <xsl:variable name="big-doc-rtf">
      <xsl:for-each select="/index/filename">
        <xsl:copy-of select="document(.)"/>
      </xsl:for-each>
  </xsl:variable>

  <xsl:variable name="big-doc" select="exsl:node-set($big-doc-rtf)"/>

  <xsl:template match="/">
    <xsl:element name="summary">
      <xsl:apply-templates select="$big-doc/foo"/>
    </xsl:element>  
  </xsl:template>

  <xsl:template match="foo">
    <xsl:element name="text">
      <xsl:value-of select="bar"/>
    </xsl:element>  
  </xsl:template>

</xsl:stylesheet>

Applying the stylesheet to index.xml gives you:

<?xml version="1.0" encoding="UTF-8"?><summary><text>Text from file1</text><text>Text from file2</text></summary>

The trick is to load the different documents with the document function (a standard XSLT 1.0 function), to output the contents as part of a variable body, which yields a result tree fragment, and then to convert the variable to a node-set with the exsl:node-set extension function (supported by many XSLT 1.0 processors) for further processing.
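
If each document's origin also needs to be recorded, as the asker mentions in an earlier comment, the variable body can wrap every copy in an element that carries the file name. This is only a sketch under assumptions: the wrapper element `source` and the `file` attribute are hypothetical names, and the input is the index.xml shown above.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <!-- Build the combined tree; "source" and its "file" attribute are
       hypothetical names chosen for this sketch. -->
  <xsl:variable name="big-doc-rtf">
    <xsl:for-each select="/index/filename">
      <source file="{.}">
        <xsl:copy-of select="document(.)"/>
      </source>
    </xsl:for-each>
  </xsl:variable>

  <!-- Convert the result tree fragment so it can be navigated with XPath. -->
  <xsl:variable name="big-doc" select="exsl:node-set($big-doc-rtf)"/>

  <xsl:template match="/">
    <summary>
      <xsl:apply-templates select="$big-doc/source/foo"/>
    </summary>
  </xsl:template>

  <!-- Each foo can look up to its wrapper to find the originating file. -->
  <xsl:template match="foo">
    <text file="{../@file}">
      <xsl:value-of select="bar"/>
    </text>
  </xsl:template>

</xsl:stylesheet>
```

With this variant each text element in the summary should carry a file attribute naming the document it came from.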

GerG
Thanks. I knew this was the approach, but couldn't remember how to do it. I'm now using the msxml node-set function instead of the exslt one you suggest (I know, I'm a heretic) and getting somewhere.
Richard A
A: 

Assume that you have the filenames listed in a file like this:

<files>
    <file>a.xml</file>
    <file>b.xml</file>
</files>

Then you could use a stylesheet like this on the above file:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

    <xsl:template match="/">
     <root>
      <xsl:apply-templates select="files/file"/>       
     </root>
    </xsl:template>

    <xsl:template match="file">
     <xsl:copy-of select="document(.)"/>
    </xsl:template>
</xsl:stylesheet>
Mingus Rude
This is what I initially tried, but as far as I can see, I can't use this technique to put the xml into a variable that I can further process. I get an error saying that the selection is not a node set.
Richard A
A: 

Thanks for all the answers. Here's the guts of the solution I'm using with msxml.

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ms="urn:schemas-microsoft-com:xslt">
  <xsl:output method="xml"/>
  <xsl:template match="/">
    <xsl:variable name="combined">
      <xsl:apply-templates select="files"/>
    </xsl:variable>
    <xsl:copy-of select="ms:node-set($combined)"/>
  </xsl:template>
  <xsl:template match="files">
    <multifile>
      <xsl:apply-templates select="file"/>
    </multifile>
  </xsl:template>
  <xsl:template match="file">
    <xsl:copy-of select="document(@name)"/>
  </xsl:template>
</xsl:stylesheet>

Now I'm trying to improve performance as each file is around 8 MB and the transformation is taking a very long time, but that's another question.

Richard A
I realize you only published the 'guts' of your solution but nevertheless a statement such as <xsl:copy-of select="ms:node-set($combined)"/> does not make a lot of sense since you just converted an RTF to a node-set and back to an RTF. <xsl:copy-of> works just as well with just an RTF as argument.
GerG
To improve performance, I'd actually suggest not using a stylesheet at all, depending on how much information you want to extract from the individual files. If you basically just need to concatenate them (with removal of the XML header), then a simple script in your preferred language should be fine too.
GerG
As for the node-set(), I want the output as a node set that I can do further processing on. When I try to use the variable in a select attribute without node-set(), the processor (rightly) complains that it's not a node set.
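
The distinction can be sketched as follows, assuming the msxml namespace and the files/file/@name input from the answer above. xsl:copy-of would accept $combined directly, but a path step into it needs the conversion first. The output element names `summary` and `root-element` are hypothetical.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ms="urn:schemas-microsoft-com:xslt"
    exclude-result-prefixes="ms">

  <xsl:template match="/">
    <!-- $combined holds a result tree fragment, not a node set. -->
    <xsl:variable name="combined">
      <multifile>
        <xsl:for-each select="files/file">
          <xsl:copy-of select="document(@name)"/>
        </xsl:for-each>
      </multifile>
    </xsl:variable>

    <summary>
      <!-- Navigating into the fragment needs the conversion first;
           select="$combined/multifile/*" is rejected in XSLT 1.0. -->
      <xsl:for-each select="ms:node-set($combined)/multifile/*">
        <root-element name="{name()}"/>
      </xsl:for-each>
    </summary>
  </xsl:template>

</xsl:stylesheet>
```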
Richard A
Thanks for the suggestion about just concatenating the files. I was thinking of doing this too. Otherwise, I might process the files through the XMLDOM using Delphi, as there is some other programmatic processing I have to do before and after the transformation.
Richard A