I have the following XML file:

<?xml version="1.0" ?> 
<Persons>
<Person>
    <Id>1</Id>
    <Name>temp</Name>
    <Qlid>1234</Qlid>
    <Manager>3</Manager>
</Person>

<Person>
    <Id>2</Id>
    <Name>someone</Name>
    <Qlid>5678</Qlid>
    <Manager>1</Manager>
</Person>

</Persons>

I am trying to read it in using the following C# function:

protected void readXmlFile()
    {
        FileStream fs = new FileStream("C:/Documents and Settings/me/Desktop/chart.xml",FileMode.Open);
        XmlTextReader r = new XmlTextReader(fs);

        //debug
        StringWriter st = new StringWriter();

        List<Person> persons = new List<Person>();

        //Loop through persons in XML
        while (r.Read())
        {
            if (r.NodeType == XmlNodeType.Element && r.Name == "Person")
            {
                Person newPerson = new Person();
                while (r.NodeType != XmlNodeType.EndElement)
                {
                    r.Read();
                    if (r.Name == "Id")
                    {
                        st.Write("67");
                        while (r.NodeType != XmlNodeType.EndElement)
                        {
                            r.Read();
                            if (r.NodeType == XmlNodeType.Text)
                            {
                                newPerson.Id = Int32.Parse(r.Value);
                                st.Write(r.Value);
                            }
                        }
                    }

                    r.Read();
                    if (r.Name == "Name")
                    {
                        while (r.NodeType != XmlNodeType.EndElement)
                        {
                            r.Read();
                            if (r.NodeType == XmlNodeType.Text)
                            {
                                newPerson.Name = (r.Value);
                                st.Write("23");
                            }
                        }
                    }

                    r.Read();
                    if (r.Name == "Qlid")
                    {
                        while (r.NodeType != XmlNodeType.EndElement)
                        {
                            r.Read();
                            if (r.NodeType == XmlNodeType.Text)
                            {
                                newPerson.Qlid = (r.Value);
                                st.Write(r.Value);
                            }
                        }
                    }

                    r.Read();
                    if (r.Name == "Manager")
                    {
                        while (r.NodeType != XmlNodeType.EndElement)
                        {
                            r.Read();
                            if (r.NodeType == XmlNodeType.Text)
                            {
                                newPerson.Manager = Int32.Parse(r.Value);
                                st.Write(r.Value);
                            }
                        }
                    }

                    //add to list
                    persons.Add(newPerson);
                    st.Write(90);
                }


            }
        }

        fs.Close();

The if (r.Name == "Id") check and the similar ifs never become true for some reason, so I end up with empty Person objects.

+9  A: 

Unless your XML is very large, and can't be loaded into memory in one go, I would almost certainly not go down this route.

The trouble with using an XmlReader for this, rather than an XmlDocument or LINQ to XML, is that you miss out on all the power of XPath or LINQ, which are designed to do exactly what you are attempting here: pick specific nodes out of the XML.

Alternatively, it also looks like you are just deserializing some xml into a dto, your Person. This is exactly what the built in xml serialization does for you. You could achieve this with a few attributes on your dto if you wanted to.
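
As a rough illustration of the serialization route, here is a minimal sketch. The Person class is not shown in the question, so its shape (Id and Manager as int, Name and Qlid as string) is an assumption inferred from the asker's code, and PersonList and PersonLoader are hypothetical names introduced only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Assumed shape of the DTO; the question does not show this class.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Qlid { get; set; }
    public int Manager { get; set; }
}

// Hypothetical wrapper type that maps to the <Persons> root element.
[XmlRoot("Persons")]
public class PersonList
{
    // Each <Person> child element becomes one entry in this list.
    [XmlElement("Person")]
    public List<Person> Items { get; set; } = new List<Person>();
}

public static class PersonLoader
{
    // Deserializes the whole document in one call.
    public static List<Person> Load(TextReader input)
    {
        var serializer = new XmlSerializer(typeof(PersonList));
        var list = (PersonList)serializer.Deserialize(input);
        return list.Items;
    }
}
```

Calling PersonLoader.Load(new StreamReader(path)) on the file from the question should then yield one Person per <Person> element, without any manual Read() calls.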

However, with regard to why this specifically isn't working:

You check for a Person element, and then loop round while you look for something that isn't an end element. My guess is that you expected the next node to be the Id element. It isn't. Your next Read() gives you a whitespace node. This throws off all your subsequent tests for the elements you are expecting. This highlights the key problem with what you are doing: this approach is extremely brittle to subtle changes in the XML. Every time you do a Read() blindly, you are assuming you know what the next node is. If it isn't what you expect, your code may fail in ways that aren't easy to spot until it actually falls over.

Is there a compelling reason not to use another approach? My gut feeling is you would save yourself a lot of hassle if you did!

Rob Levine
+1, this is what makes XmlReaders so hard to use. You often end up having to design a state machine algorithm to get the right results.
John M Gant
That worked. I was assuming whitespace was ignored by XML.
anon2
@anon2 - actually it _can_ be ignored, but you need to specify it when you create the reader. Rather than using the XmlTextReader constructor overload you are using, try XmlReader.Create. This takes an XmlReaderSettings argument, which has an IgnoreWhitespace property.
Rob Levine
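
The XmlReaderSettings suggestion in the comment above can be sketched roughly as follows. WhitespaceDemo and NodeAfterPerson are hypothetical names used only for this illustration; the method reports which node type the reader lands on immediately after the first <Person> element, with and without IgnoreWhitespace.

```csharp
using System;
using System.IO;
using System.Xml;

public static class WhitespaceDemo
{
    // Returns the type of the node that follows the first <Person> start tag.
    public static XmlNodeType NodeAfterPerson(string xml, bool ignoreWhitespace)
    {
        var settings = new XmlReaderSettings { IgnoreWhitespace = ignoreWhitespace };
        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            reader.ReadToFollowing("Person"); // position on the first <Person>
            reader.Read();                    // advance exactly one node
            return reader.NodeType;
        }
    }
}
```

With indented XML like the file in the question, the default settings should report a whitespace node here, while IgnoreWhitespace = true should go straight to the Id element.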
+1  A: 

Your code assumes that the XML file is always going to be in the same format with the data in the following order:

  • Id
  • Name
  • Qlid
  • Manager

You cannot make this assumption.

If you need to use this approach you'll need to restructure your code to loop until you reach </Person>. Then, in your loop, switch on r.Name to set the appropriate property of newPerson.
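
A minimal sketch of that restructuring might look like the following. It is one possible reading of the suggestion, not a definitive implementation: it remembers the name of the most recent start tag and switches on it when a text node arrives, so child elements may appear in any order and whitespace nodes are simply ignored. The Person class shape is assumed from the question's code, and PersonReader and Assign are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Assumed shape of the DTO; the question does not show this class.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Qlid { get; set; }
    public int Manager { get; set; }
}

public static class PersonReader
{
    public static List<Person> Read(TextReader input)
    {
        var persons = new List<Person>();
        using (var r = XmlReader.Create(input))
        {
            Person current = null; // the <Person> being filled, if any
            string field = null;   // name of the most recent start tag
            while (r.Read())
            {
                switch (r.NodeType)
                {
                    case XmlNodeType.Element:
                        if (r.Name == "Person") current = new Person();
                        else field = r.Name;
                        break;
                    case XmlNodeType.Text:
                        if (current != null) Assign(current, field, r.Value);
                        break;
                    case XmlNodeType.EndElement:
                        if (r.Name == "Person") { persons.Add(current); current = null; }
                        field = null;
                        break;
                    // Whitespace and other node types are deliberately skipped.
                }
            }
        }
        return persons;
    }

    // Sets the property matching the element name; unknown names are ignored.
    private static void Assign(Person p, string field, string value)
    {
        switch (field)
        {
            case "Id": p.Id = int.Parse(value); break;
            case "Name": p.Name = value; break;
            case "Qlid": p.Qlid = value; break;
            case "Manager": p.Manager = int.Parse(value); break;
        }
    }
}
```

Because the person is added to the list only when </Person> is reached, each Person is added exactly once, regardless of how many child elements it has.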

ChrisF
+3  A: 

I'll reiterate what others have been suggesting: use LINQ to XML. It'll be much easier.

However, as to why your current code is failing:

r.Read();
if (r.Name == "Id") { ... }
r.Read();
if (r.Name == "Name") { ... }
// etc

This assumes that it will read the nodes (not just elements - don't forget text nodes etc) in exactly the expected order. You should really just read the nodes until you get to the end of the current element, and react to each node appropriately.

However, here's an example of what the LINQ to XML might look like, just for comparison:

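// Requires: using System.Linq; using System.Xml.Linq;
XDocument doc = XDocument.Load("foo.xml");
// Person elements are children of the root <Persons> element
List<Person> persons = doc.Root.Elements("Person")
                          .Select(x => new Person
                             {
                                 Id = (int) x.Element("Id"),
                                 Name = (string) x.Element("Name"),
                                 Qlid = (string) x.Element("Qlid"),
                                 Manager = (int) x.Element("Manager")
                             })
                          .ToList();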

Admittedly this currently assumes that there is an Id element, but there are ways of making it more sophisticated. As you can see, it's considerably simpler than using XmlReader.

Jon Skeet
A: 

As @Rob said, the reader is returning whitespace nodes, which is throwing off your tests. You can set WhitespaceHandling on the XmlTextReader, which will fix your current issue.

r.WhitespaceHandling = WhitespaceHandling.None;
Andy Robinson
A: 

You may need to use an XmlReader, perhaps because of memory limitations. In that situation, it may be beneficial to use the XmlReader.ReadSubtree() method, which eliminates the need to check for the EndElement node type. Moreover, since subtrees are smaller in size, they can also be loaded into XmlDocuments with lower memory consumption.
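
As a sketch of that idea, assuming the same Person shape as in the question (the class itself is not shown there) and a hypothetical SubtreeReader helper: each <Person> subtree is loaded into its own small XmlDocument, so only one person is held in memory at a time. The sketch also assumes every child element is present; a missing one would cause a NullReferenceException.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Assumed shape of the DTO; the question does not show this class.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Qlid { get; set; }
    public int Manager { get; set; }
}

public static class SubtreeReader
{
    public static List<Person> Read(TextReader input)
    {
        var persons = new List<Person>();
        using (var r = XmlReader.Create(input))
        {
            // Jump from one <Person> start tag to the next.
            while (r.ReadToFollowing("Person"))
            {
                // The subtree reader covers only the current <Person> element.
                using (var sub = r.ReadSubtree())
                {
                    var doc = new XmlDocument();
                    doc.Load(sub);
                    XmlElement e = doc.DocumentElement;
                    persons.Add(new Person
                    {
                        Id = int.Parse(e["Id"].InnerText),
                        Name = e["Name"].InnerText,
                        Qlid = e["Qlid"].InnerText,
                        Manager = int.Parse(e["Manager"].InnerText)
                    });
                }
            }
        }
        return persons;
    }
}
```

Looking children up by name on the small document also makes the code independent of element order and of whitespace between elements.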

tafa