Hi

I have a situation where I receive an XML (document) file from an external company. I need to filter the document to remove all data I am not interested in. The file is about 500KB but will be requested very often.

Let's say I have the following file:

<dvdlist>
  <dvd>
    <title>title 1</title>
    <director>directory 2</director>
    <price>1</price>
    <location>
      <city>denver</city>
    </location>
  </dvd>
  <dvd>
    <title>title 2</title>
    <director>directory 2</director>
    <price>2</price>
    <location>
      <city>london</city>
    </location>
  </dvd>
  <dvd>
    <title>title 3</title>
    <director>directory 3</director>
    <price>3</price>
    <location>
      <city>london</city>
    </location>
  </dvd>
</dvdlist>

What I need is simply to filter the document on city = london, so that I end up with this new XML document:

<dvdlist>
  <dvd>
    <title>title 2</title>
    <director>directory 2</director>
    <price>2</price>
    <location>
      <city>london</city>
    </location>
  </dvd>
  <dvd>
    <title>title 3</title>
    <director>directory 3</director>
    <price>3</price>
    <location>
      <city>london</city>
    </location>
  </dvd>
</dvdlist>

I have tried the following

XmlDocument doc = new XmlDocument();
doc.Load(@"C:\Development\Website\dvds.xml");
XmlNode node = doc.SelectSingleNode("dvdlist/dvd/location/city[text()='london']");

Any help or links would be appreciated.

Thanks

A: 

Here's an example using LINQ to XML.

//load the document
var document = XDocument.Load(@"C:\Development\Website\dvds.xml");
//get all dvd nodes
var dvds = document.Descendants().Where(node => node.Name == "dvd");
//get all dvd nodes that have a city node with a value of "london"
var londonDVDs = dvds.Where(dvd => dvd.Descendants().Any(child => child.Name == "city" && child.Value == "london"));
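
To turn the filtered sequence into an actual document, one option is to wrap it in a new root element. The following is only a sketch under stated assumptions: the XML is inlined as a stand-in for dvds.xml, and the names `input`, `output`, and `xmlDoc` are illustrative. The conversion to XmlDocument at the end is only needed if other code insists on the DOM type.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Inline sample standing in for dvds.xml (an assumption for this sketch).
var input = XDocument.Parse(@"<dvdlist>
  <dvd><title>title 1</title><location><city>denver</city></location></dvd>
  <dvd><title>title 2</title><location><city>london</city></location></dvd>
  <dvd><title>title 3</title><location><city>london</city></location></dvd>
</dvdlist>");

// Keep only the dvd elements whose location/city is "london".
// The (string) cast yields null instead of throwing when city is missing.
var londonDvds = input.Root
    .Elements("dvd")
    .Where(dvd => (string)dvd.Element("location")?.Element("city") == "london");

// Build a new document with the same root name; the matching elements are
// copied into it, so the input document is left unchanged.
var output = new XDocument(new XElement(input.Root.Name, londonDvds));

// Optional: load the result into an XmlDocument through an XmlReader.
var xmlDoc = new XmlDocument();
using (var reader = output.CreateReader())
{
    xmlDoc.Load(reader);
}

Console.WriteLine(output);
```

Using `Elements("dvd")` on the root rather than `Descendants()` only looks at direct children, which is enough for the structure shown in the question.
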
DoctaJonez
Thanks DoctaJonez. As I am new to LINQ, I went through this tutorial: http://download.microsoft.com/download/c/f/b/cfbbc093-f3b3-4fdb-a170-604db2e29e99/XLinq%20Overview.doc and now have the following code, which gets me the list I require: `string path = @"C:\Development\Website\sp.xml"; var dvds = from d in XElement.Load(path).Elements("dvd") where d.Element("location").Element("city").Value == "london" select d;` My question is how to get the data as a new XmlDocument.
Walid
If you use LINQ to XML (i.e. XDocument) then you should create a new XDocument, not an XmlDocument. Creating a new XDocument is easy: `XDocument input = XDocument.Load("input.xml"); XDocument output = new XDocument(new XElement(input.Root.Name, input.Root.Elements("dvd").Where(d => (string)d.Element("location").Element("city") == "london"))); output.Save("output.xml");`.
Martin Honnen
Thanks Martin. That was quite useful.
Walid
A: 

XPath is a selection expression language -- it never modifies the XML document(s) it operates on.

Therefore, in order to obtain the desired new XML document, you need to either use XML DOM (not recommended) or apply an XSLT transformation to the XML document. The latter is the recommended way to go, since XSLT is a language especially designed for tree transformations.

In .NET one can use the XslCompiledTransform class and its Transform() method. Read more about these in the relevant MSDN documentation.

The XSLT transformation itself is extremely simple:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="dvd[not(location/city='london')]"/>
</xsl:stylesheet>

Here, you can find a complete code example of how to obtain the result of the transformation as an XmlDocument (or, if desired, as an XDocument).
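
As a rough illustration of the XslCompiledTransform approach, the sketch below applies the stylesheet above and collects the result in a new XmlDocument. It makes a few assumptions: the stylesheet and input are inlined as strings rather than loaded from files, the sample data is abbreviated, and the variable names are illustrative only.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Inline stand-ins for the stylesheet and input files (assumptions for this sketch).
const string stylesheet = @"<xsl:stylesheet version=""1.0""
 xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
 <xsl:template match=""node()|@*"">
  <xsl:copy><xsl:apply-templates select=""node()|@*""/></xsl:copy>
 </xsl:template>
 <xsl:template match=""dvd[not(location/city='london')]""/>
</xsl:stylesheet>";

const string source = @"<dvdlist>
<dvd><title>title 1</title><location><city>denver</city></location></dvd>
<dvd><title>title 2</title><location><city>london</city></location></dvd>
<dvd><title>title 3</title><location><city>london</city></location></dvd>
</dvdlist>";

// Compile the stylesheet once; the loaded transform can be reused for later requests.
var transform = new XslCompiledTransform();
using (var xsltReader = XmlReader.Create(new StringReader(stylesheet)))
{
    transform.Load(xsltReader);
}

// Write the transformation output straight into a fresh XmlDocument
// through the XmlWriter returned by its navigator.
var result = new XmlDocument();
using (var inputReader = XmlReader.Create(new StringReader(source)))
using (var writer = result.CreateNavigator().AppendChild())
{
    transform.Transform(inputReader, writer);
}

Console.WriteLine(result.OuterXml);
```

Since the file is requested often, keeping one loaded `XslCompiledTransform` instance around and calling `Transform()` per request avoids recompiling the stylesheet each time.
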

Dimitre Novatchev
Thanks Dimitre. I'll have a go at it and will let you know. Cheers
Walid
Thanks again Dimitre. In the end I went with your recommendation. It's true that XSLT is easy once you know XPath.
Walid
@Walid: You are welcome.
Dimitre Novatchev