I have a large number of XML files. They all share the same document structure but contain different values.

Inside this file I have a datetime field.

What is the best, most efficient way to query these XML files, so that I can retrieve, for example, all files where the datetime field equals today's date?

I'm using C# and .NET 2.0. Should I be using XML objects to achieve this, or plain-text search routines over the files?

Some code examples would be great... or just the general theory, anything would help, thanks...

+1  A: 

You might look into running XSL queries. See also XSLT Tutorial, XML transformation using Xslt in C#, How to query XML with an XPath expression by using Visual C#.

This question also relates to another on Stack Overflow: Parse multiple XML files with ASP.NET (C#) and return those with particular element. The accepted answer there, though, suggests using LINQ.

Sarah Vessels
Thanks, that example seems to work with one XML document. I wonder if there is a way to work with multiple documents, to get a list of either the documents or at least their file names?
JL
+2  A: 

This depends on the size of those files, and how complex the data actually is. As far as I understand the question, for this kind of XML data, using an XPath query and going through all the files might be the best approach, possibly caching the files in order to lessen the parsing overhead.

Have a look at: XPathDocument, XmlDocument classes and XPath queries

http://support.microsoft.com/kb/317069

Something like this should do (not tested though):

// Requires: using System.IO; using System.Xml; using System.Xml.XPath;
XmlNamespaceManager nsmgr = new XmlNamespaceManager(new NameTable());
// If required, add your namespace prefixes to nsmgr here.
// Your query as XPath; the element and attribute names are placeholders.
XPathExpression expression = XPathExpression.Compile("//element[@date='20090101']", nsmgr);
foreach (string fileName in Directory.GetFiles("PathToXmlFiles", "*.xml")) {
    XPathDocument doc;
    // Read through the same name table that the expression was compiled with.
    using (XmlTextReader reader = new XmlTextReader(fileName, nsmgr.NameTable)) {
        doc = new XPathDocument(reader);
    }
    if (doc.CreateNavigator().SelectSingleNode(expression) != null) {
        // Matching document found; fileName identifies it.
    }
}

Note: while you can also load an XPathDocument directly from a URI/path, using the reader makes sure that the document uses the same name table as the one used to compile the XPath query. If a different name table were used, you would not get results from the query.
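To get back the file names that match today's date, the loop above can be wrapped in a method that collects them. The following is only a sketch that should compile against .NET 2.0. It assumes the date lives in the text of an element (the name "datetime" used in the usage line is a placeholder for whatever your files contain) and that the value is written in ISO 8601 form, for example 2009-01-01T13:45:00. The class and method names are made up for illustration, and the element name is assumed to be a plain name with no namespace prefix.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class XmlDateQuery
{
    // Returns the paths of all *.xml files in 'directory' that contain at least
    // one element named 'elementName' whose text starts with the given date.
    // Assumption: values are ISO 8601, so the date part is the first 10 characters.
    public static List<string> FindFilesByDate(string directory, string elementName, DateTime date)
    {
        string datePrefix = date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);

        // Compile the query once and share its name table with every reader.
        XmlNamespaceManager nsmgr = new XmlNamespaceManager(new NameTable());
        XPathExpression expression = XPathExpression.Compile(
            "//" + elementName + "[starts-with(normalize-space(.), '" + datePrefix + "')]",
            nsmgr);

        List<string> matches = new List<string>();
        foreach (string fileName in Directory.GetFiles(directory, "*.xml"))
        {
            XPathDocument doc;
            using (XmlTextReader reader = new XmlTextReader(fileName, nsmgr.NameTable))
            {
                doc = new XPathDocument(reader);
            }
            if (doc.CreateNavigator().SelectSingleNode(expression) != null)
            {
                matches.Add(fileName);
            }
        }
        return matches;
    }
}
```

Usage would look something like XmlDateQuery.FindFilesByDate("PathToXmlFiles", "datetime", DateTime.Today). If your files store the date in another format or in an attribute, the XPath string and the date format string would need to change accordingly.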

Lucero
Thanks this is a great answer... exactly what I needed...
JL
Thanks for accepting it. However, I wonder by whom and why I got downvoted...
Lucero
Amazingly, you got downvoted by me, and not on purpose either. I made a false click, and when I tried to correct the mistake I got the message "Vote too old to be changed". So I guess this is a Stack Overflow bug. Sorry, it was really not intentional. Thanks for all the help with this answer; my project should complete on time.
JL
+1  A: 

If it is at all possible to move to C# 3.0 / .NET 3.5, LINQ-to-XML would be by far the easiest option.

With .NET 2.0, you're stuck with either XML objects or XSL.
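For comparison only, here is roughly what the LINQ-to-XML version could look like if an upgrade ever becomes possible. It needs .NET 3.5 or later, so it will not compile on .NET 2.0. As a sketch it makes some assumptions: the date is the text of an element in no namespace, the value is in ISO 8601 form (for example 2009-01-01T13:45:00), and the class and method names are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class LinqXmlDateQuery
{
    // Returns the paths of all *.xml files in 'directory' that contain at least
    // one element named 'elementName' whose text starts with the given date.
    public static List<string> FindFilesByDate(string directory, string elementName, DateTime date)
    {
        string datePrefix = date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);

        return Directory.GetFiles(directory, "*.xml")
            .Where(fileName => XDocument.Load(fileName)
                .Descendants(elementName)
                .Any(e => e.Value.Trim().StartsWith(datePrefix, StringComparison.Ordinal)))
            .ToList();
    }
}
```

The whole loop collapses into one query expression, which is the main reason LINQ-to-XML reads as the easier option.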

Rik
Sorry, no. I'm stuck with .NET 2.0.
JL