I'm a beginner when it comes to XML. I created a simple XML file, parsed it, and assigned the values to variables. It worked, but the method I used made me wonder whether there are better, more elegant ways to do this. Are there any?

Here's my XML file:

<start>
<record>
<var1>hello</var1>
<var2>world</var2>
</record>
<record>
<var1>another</var1>
<var2>one</var2>
</record>
</start>

Here's the method I used:

string var1 = "", var2 = "";

using (XmlReader r = XmlReader.Create(file))
{
    while (r.Read())
    {
        if (r.MoveToContent() == XmlNodeType.Element)
        {
            if (r.Name == "record")
            {
                // A new record starts: reset the values.
                var1 = "";
                var2 = "";
            }
            else if (r.Name == "var1")
                var1 = r.ReadElementString();
            else if (r.Name == "var2")
                var2 = r.ReadElementString();
        }
        else if (r.MoveToContent() == XmlNodeType.EndElement && r.Name == "record")
        {
            // The record is complete: print its values.
            Console.WriteLine(var1 + " " + var2);
        }
    }
}
+1  A: 

I prefer the XmlDocument route.

Here's an excellent article on CodeProject:

http://www.codeproject.com/KB/cpp/parsefilecode.aspx

tekBlues
+2  A: 

There are several ways to do it. http://stackoverflow.com/questions/55828/best-practices-to-parse-xml-files-with-c

Byron Whitlock
+3  A: 

One option would be to load the whole document into an XmlDocument and use XPath syntax to extract the values. This is good for small(ish) documents but not for large ones, because of the memory overhead and because you parse the entire document instead of perhaps just the data you want. Something like this would work (with all error checking removed for clarity):

XmlDocument doc = new XmlDocument();
doc.Load(filename);

XmlNodeList records = doc.SelectNodes("/start/record");
foreach (XmlNode n in records)
{
   string var1 = n.SelectSingleNode("var1").InnerText;
   string var2 = n.SelectSingleNode("var2").InnerText;
}
Alan Moore
+6  A: 

Have you tried LINQ to XML?

I am pretty new to it so this code could probably be done better, but I think this works:

XElement xmlData = XElement.Load("XmlFile.xml");
string var1, var2;
foreach (XElement element in xmlData.Elements("record"))
{
    var1 = element.Element("var1").Value;
    var2 = element.Element("var2").Value;

    Console.WriteLine(var1 + " " + var2);
}
orandov
+1  A: 

Well, I found XmlDocument super easy, but it has its own kind of overhead. Here is a very nice article that I found quite useful.

http://support.softartisans.com/kbview.aspx?ID=673

paradisonoir
Yes, of course it's not the fastest method; it's just simpler and cleaner to use, IMHO.
tekBlues
+4  A: 

Real men don't parse, they deserialize.

Wyatt Barnett
Good one! There are few real men out there anyway.
tekBlues
+3  A: 

I would suggest choosing the serialize/deserialize option. It would be more dynamic and less error-prone. With hand-written parsing, you need to rework your code for every single change to the format.

paradisonoir
+1  A: 

I really like the LINQ way, using XDocument or XElement. It is just brilliant, especially if you are writing XML too.

Moulde
+4  A: 

Here's a version using VB's XML literal features.

Dim doc = XDocument.Load(file)
For Each element In doc...<record>
    Dim var1 = element.<var1>.Single()
    Dim var2 = element.<var2>.Single()
    Console.WriteLine(var1.Value & var2.Value)
Next
JaredPar
+3  A: 

I'll second the call for XmlSerializer. Write an XML Schema for your format, generate serializer classes from it, and you get a convenient interface and automatic validation, all for free. Its performance is also pretty good.

Pavel Minaev
+3  A: 

Oh god, you should definitely consider XmlDocument, LINQ to XML, or XmlSerializer, as others have suggested. I've used XmlReader and it's not for the faint-hearted. It has the advantage of being the only one of the four options that can read a huge XML file without loading the whole thing into memory, but if that is not a concern, you would do yourself a huge favor by using one of the more intuitive APIs.

If you have a say over the format of the XML, I highly recommend the XmlSerializer approach, as it takes the whole XML-ness out of the equation. You have some level of control over how the objects will be formatted by using simple attributes on your properties and classes. Otherwise, LINQ to XML is the next easiest.
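To make the point about attributes concrete, here is a minimal sketch of how a few System.Xml.Serialization attributes shape the output. The Person class, its members, and the AttributeDemo helper are hypothetical names made up for illustration; they are not from the question.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type for illustration only.
[XmlRoot("person")]
public class Person
{
    [XmlAttribute("type")]   // written as an attribute on <person>
    public int Type { get; set; }

    [XmlElement("name")]     // written as a child element <name>
    public string Name { get; set; }

    [XmlIgnore]              // left out of the XML entirely
    public string Notes { get; set; }
}

public static class AttributeDemo
{
    // Serializes a Person into an XML string.
    public static string ToXml(Person p)
    {
        var serializer = new XmlSerializer(typeof(Person));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, p);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        var p = new Person { Type = 1, Name = "a", Notes = "not serialized" };
        Console.WriteLine(ToXml(p));
    }
}
```

The same attributes apply when reading: the serializer uses them to map the incoming XML back onto the properties.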

Josh Einstein
+2  A: 

I usually just deserialize the XML into a set of objects that I can then loop through and process. This requires creating serializable classes that match the schema of your XML structure. If you don't want to write them by hand, Visual Studio ships with a nice little tool called XSD (xsd.exe) that can generate the classes from your XML file. You can run XSD from the Visual Studio Command Prompt. If you're interested, this is how you'd go about it:

Run the following command to generate the schema for the XML:

XSD path_to_your_xml.xml /o:your_output_directory

Once you have that, generate the classes from the schema:

XSD path_to_your_schema.xsd /c /l:cs /o:your_output_directory

This produces a .cs file with the set of classes needed to deserialize your XML file.

The only thing about this method is that it uses arrays for collections. I usually change them to lists; just a personal preference. That should be it. All that's left to do is write your little function that deserializes your XML. You can type "xsd /?" to view the list of other parameters that may interest you.
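That little function could look roughly like the sketch below. XmlHelper and its method names are hypothetical, and Note is only a stand-in for whatever classes xsd.exe generated for you.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public static class XmlHelper
{
    // Deserializes XML from any TextReader into an instance of T.
    public static T Deserialize<T>(TextReader reader)
    {
        var serializer = new XmlSerializer(typeof(T));
        return (T)serializer.Deserialize(reader);
    }

    // Convenience overload that opens the file and closes it afterwards.
    public static T DeserializeFile<T>(string path)
    {
        using (var reader = new StreamReader(path))
        {
            return Deserialize<T>(reader);
        }
    }
}

// Stand-in for a generated class (hypothetical).
public class Note
{
    public string Text { get; set; }
}

public static class HelperDemo
{
    public static void Main()
    {
        // With no attributes, element names default to the class and property names.
        var note = XmlHelper.Deserialize<Note>(new StringReader("<Note><Text>hi</Text></Note>"));
        Console.WriteLine(note.Text);
    }
}
```

For a file on disk you would call XmlHelper.DeserializeFile with the root class generated from your schema as the type argument.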

Sergey
A: 

I've written an extension to ANTLR 2.x for XML parsing called ANTXR. Makes it really easy to pick out the parts of the XML that you care about, whether tag contents or attributes.

See http://javadude.com/tools/antxr/index.html.

Scott Stanchfield
A: 

Thanks for all the answers. I learned XmlDocument and LINQ to XML, but I'm kind of scratching my head about the deserialize method.

I used Xsd.exe to create a class file for my XML file. Here's the XML file:

<tmp>
    <person type="1">
     <name>a</name>
     <lastname>e</lastname>
    </person>
    <person type="1">
     <name>x</name>
     <lastname>y</lastname>
    </person>
    <person2 name="aa" lastname="ee" />
</tmp>

In the end I have tmp, person and person2 classes in the generated file. The only way I could think of to read values is this:

XmlSerializer s = new XmlSerializer(typeof(tmp));
tmp t;
using (StreamReader reader = new StreamReader(xmlfile))
{
    t = (tmp)s.Deserialize(reader);
}

foreach (object i in t.Items)
{
    Type type = i.GetType();
    string name = null; // declared here: a declaration cannot be the sole body of an if

    if (type == typeof(person))
        name = ((person)i).name;
    else if (type == typeof(person2))
        name = ((person2)i).name;
}

Am I doing it right or am I missing something as a deserialize newbie?

Armagan
+2  A: 

As Steve Ballmer would say: "Deserialize! Deserialize! Deserialize!"

Vidar
A: 

I like the serialization/deserialization option, which several people have suggested.

Each record should deserialize into something like this:

[Serializable]
public class record {
    // XmlSerializer only handles public types and public members.
    public string var1 { get; set; }
    public string var2 { get; set; }
}

And to deserialize an array of elements, you'll need XmlArrayItemAttribute; look this class up on MSDN.
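As a rough sketch of the whole round trip for the XML in the question: the class and member names below (Record, Start, RecordDemo) are made up for illustration. Note that this sketch maps the repeated record elements with XmlElement on a list rather than XmlArrayItem, on the assumption that they sit directly under the root with no wrapper element.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical classes mirroring the <start>/<record> layout from the question.
public class Record
{
    [XmlElement("var1")]
    public string Var1 { get; set; }

    [XmlElement("var2")]
    public string Var2 { get; set; }
}

[XmlRoot("start")]
public class Start
{
    // Each <record> directly under <start> becomes one list entry.
    [XmlElement("record")]
    public List<Record> Records { get; set; }
}

public static class RecordDemo
{
    // Deserializes an XML string into a Start object.
    public static Start Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Start));
        using (var reader = new StringReader(xml))
        {
            return (Start)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        string xml =
            "<start>" +
            "<record><var1>hello</var1><var2>world</var2></record>" +
            "<record><var1>another</var1><var2>one</var2></record>" +
            "</start>";

        foreach (Record r in Parse(xml).Records)
        {
            Console.WriteLine(r.Var1 + " " + r.Var2);
        }
    }
}
```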

HTH!

azheglov