Hello, I have several hundred XML files of about 2 KB each. They are small, but I need to combine them all into one file so that I can cross-reference the info in them with a database that I have.

Each file contains a specific case number along with other unimportant data.

Is there any way I can combine all of those files into ONE XML file that excludes everything except the case number (e.g. Case Number: 123456) from every file?

A: 

If I read the question correctly, you want to combine all the XML files that have the case number "123456" into a single XML file, right?

If so, you can use the collection() function in XSLT or XQuery to point to a directory that contains the XML files.

Here are 3 test XML files that I put in my 'C:\test_xml' directory. Two of them have the "123456" case number and one of them doesn't:

File 1:

<?xml version="1.0" encoding="UTF-8"?>
<doc>
    <case>123456</case>
    <blah>test 1 file</blah>
</doc>

File 2:

<?xml version="1.0" encoding="UTF-8"?>
<doc>
    <case>abcdef</case>
    <blah>test 2 file</blah>
</doc>

File 3:

<?xml version="1.0" encoding="UTF-8"?>
<doc>
    <case>123456</case>
    <blah>test 3 file</blah>
</doc>

Using either the XSLT or XQuery below:

XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="/">
        <collection>
            <xsl:for-each select="collection('file:///C:/test_xml?*.xml')/doc[case='123456']">
                <xsl:copy>
                    <xsl:apply-templates select="node()|@*"/>
                </xsl:copy>
            </xsl:for-each>
        </collection>
    </xsl:template>

</xsl:stylesheet>

XQuery:

<collection>
{
for $file in collection('file:///C:/test_xml?*.xml')/doc[case='123456']
return
    $file
}
</collection>

produces the following output:

Output:

<?xml version="1.0" encoding="UTF-8"?>
<collection>
   <doc>
      <case>123456</case>
      <blah>test 1 file</blah>
   </doc>
   <doc>
      <case>123456</case>
      <blah>test 3 file</blah>
   </doc>
</collection>

I used Saxon-HE (the free, open-source Home Edition) to do the processing. In my test, the XQuery ran about 8ms faster than the XSLT.
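
If the goal is instead to keep only the case number from every file (the other way to read the question), and no XSLT 2.0 processor is at hand, the same merge can be sketched with Python's standard library. This is a rough sketch, not a tested replacement for the code above: it assumes every file looks like the test files above (a <doc> root with a <case> child), and the function name combine_case_numbers and both of its parameters are made up for illustration.

```python
import glob
import os
import xml.etree.ElementTree as ET


def combine_case_numbers(src_dir, out_path):
    """Copy the <case> element of every *.xml file in src_dir into one file.

    Hypothetical helper: assumes each input has the same shape as the test
    files above, i.e. <doc><case>...</case>...</doc>.
    """
    root = ET.Element("collection")
    # sorted() keeps the output order stable from run to run
    for path in sorted(glob.glob(os.path.join(src_dir, "*.xml"))):
        doc = ET.parse(path).getroot()
        # Keep only the case number; every other child of <doc> is dropped
        for case in doc.findall("case"):
            kept = ET.SubElement(root, "case")
            kept.text = case.text
    ET.ElementTree(root).write(out_path, encoding="UTF-8", xml_declaration=True)
    # Return the collected case numbers so the caller can inspect them
    return [c.text for c in root]
```

Calling combine_case_numbers with the 'C:\test_xml' directory and an output path outside that directory should write one <case> element per input file inside a single <collection> root. Writing the output into the source directory would cause it to be picked up as an input on the next run.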

DevNull