I have a bunch of XML files with a fixed, country-based naming schema: report_en.xml, report_de.xml, report_fr.xml, etc. Now I want to write an XSLT stylesheet that reads each of these files via the document() function, extracts some values and generates one XML file with a summary. My question is: How can I iterate over the source files without knowing in advance the exact names of the files I will process?

At the moment I'm planning to generate an auxiliary XML file that holds all the file names and to iterate over that auxiliary file in my stylesheet. The file list will be generated with a small PHP or bash script. Are there better alternatives?
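A minimal sketch of that auxiliary-file approach, to make the plan concrete. The file name files.xml, its files/file layout and the report/title path inside each report are assumptions made up for illustration, not something fixed by the reports themselves:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed input (hypothetical files.xml, produced by the PHP or bash script):
     <files><file>report_en.xml</file><file>report_de.xml</file></files> -->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/files">
    <summary>
      <xsl:for-each select="file">
        <!-- document() is given the file element itself, so a relative name
             is resolved against the location of files.xml -->
        <xsl:variable name="report" select="document(.)"/>
        <report source="{.}">
          <!-- report/title is a placeholder for the values to extract -->
          <xsl:value-of select="$report/report/title"/>
        </report>
      </xsl:for-each>
    </summary>
  </xsl:template>
</xsl:stylesheet>
```

The stylesheet is run with files.xml as its source document, so the list of names stays outside the stylesheet and has to be regenerated whenever a report is added.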

I am aware of XProc, but investing much time into it is not an option for me at the moment. Maybe someone can post an XProc solution. Preferably the solution includes workflow steps where the reports are downloaded as HTML and tidied up :)

I will be using Saxon as my XSLT processor, so if there are Saxon-specific extensions I can use, these would also be OK.

+1  A: 

You can use the standard XPath 2.0 collection() function, as implemented in Saxon 9.x.

The Saxon implementation allows a search pattern to be used in the URI string argument of the function. After the path of the directory you can therefore append a select pattern that matches the file names you want, for example any file name that starts with report_ and ends with .xml.

Example:

This XPath expression:

collection('file:///c:/?select=report_*.xml')

selects the document nodes of every XML document that resides in c:\ in a file whose name starts with report_, continues with zero or more characters, and ends with .xml.
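Used inside a stylesheet, the collection() expression above could drive the whole summary. This is only a sketch: the dir parameter, the named template main and the report/title path are hypothetical names chosen for illustration, and the ?select= query syntax is specific to Saxon.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical parameter: URI of the directory that holds the reports -->
  <xsl:param name="dir" select="'file:///c:/'"/>

  <!-- Named entry point, since no single source document is needed -->
  <xsl:template name="main">
    <summary>
      <!-- Saxon-specific: the select pattern filters the files in the directory -->
      <xsl:for-each select="collection(concat($dir, '?select=report_*.xml'))">
        <!-- The context item is the document node of one report -->
        <report source="{document-uri(.)}">
          <!-- report/title is a placeholder for the values to extract -->
          <xsl:value-of select="report/title"/>
        </report>
      </xsl:for-each>
    </summary>
  </xsl:template>
</xsl:stylesheet>
```

Run it with main as the initial template. No auxiliary file list is needed, because the processor enumerates the directory itself.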

Dimitre Novatchev