Hi

I have an XmlDocument which I can traverse with XmlNode, or convert to an XDocument and traverse via LINQ.

<Dataset>
    <Person>
        <PayrollNumber>1234567</PayrollNumber>
        <Surname>Smith-Rodrigez</Surname>
        <Name>John-Jaime-Winston Junior</Name>
        <Skills>
            <Skill>ICP</Skill>
            <Skill>R</Skill>
        </Skills>
        <HomePhone>08 8888 8888</HomePhone> 
        <MobilePhone>041 888 999</MobilePhone>
        <Email>[email protected]</Email>
    </Person>
    <Person>
        <PayrollNumber>12342567</PayrollNumber>
        <Surname>Smith-Rodrigez</Surname>
        <Name>Steve</Name>
        <Skills>
            <Skill>Resus</Skill>
            <Skill>Air</Skill>
        </Skills>
        <HomePhone>08 8888 8888</HomePhone> 
        <MobilePhone>041 888 999</MobilePhone>
        <Email>[email protected]</Email>
    </Person>
</Dataset>

Question 1

I want to convert the Person records/nodes in the XML to a business entity object (POCO). Therefore I have to iterate through one Person node at a time and then parse the individual values. This last bit is interesting in itself, but first I have to get the actual Person records. The problem I have is that if I select by individual node name (using, say, an XmlNodeList from the XmlDocument), I end up aggregating all fields by name. I am reluctant to do this in case one of the Person nodes is incomplete, or a field is missing, because then I won't know which one is missing when I pass through and assemble the fields into business objects. I will try to validate - see Question 2.

I realize this can be done through reflection but I am interested.

I tried iterating through by Person object:

Option 1:

foreach (XObject o in xDoc.Descendants("Person"))
{
    Console.WriteLine("Name" + o);
    // [...]
}

This gets me 2 Person records (correct), each as a stringified, complete XML fragment - formatted as XML, but just a subset of the above document.

But how do I now split each record up into separate nodes or fields - preferably as painlessly as possible?

Option 2:

foreach (XElement element in xDoc.Descendants("Person"))
{
    // [...]
}

This gets me the XML nodes - values only - for each Person all in one string, e.g.

1234567Smith-RodrigezJohn-Jaime-Winston JuniorLevel 5, City Central Tower 2, 121 King William StNorth Adelaide 5000ICPR08 8888 8888041 888 [email protected]

Again, not much use.

Question 2

I can validate an XDocument quite easily, there are some good examples on MSDN, but I'd like to know how can I flag a wrong record. Ideally, I'd like to be able to filter the good records out to a new XDocument on the fly leaving the old ones behind. Is this possible?

+2  A: 

The problem is that you're just printing out the elements as strings. You need to write code to convert an XElement into your business object. Admittedly I'd expect the full XML to be written out instead - are you sure you're not printing out XElement.Value (which concatenates all the descendant text nodes)?

(I'm not sure of the answer to your second question - I suggest you ask it as a separate question here, so that we don't get a mixture of answers in one page.)

Jon Skeet
Apologies - yes, I am using XElement.Value - well, for the second example at least. (It is very late here and I am very tired.) The problem is that I don't want to concatenate the descendant values - I need to organise them separately, but I do want to select them all initially so I can easily organise them as a single object. It's then a matter of organising them and splitting them up further, but it seems I can't enumerate at this level.
MtTumbledown
I am using: foreach (XElement element in xDoc.Descendants("Person")) { Console.WriteLine(element.Value); } This gives me a concatenated string.
MtTumbledown
@MtTumbledown: Yes, it would do. Don't use the Value property - ask for the subelements separately. Accessing the Value property on XElement *will* concatenate the descendant text nodes, which is what it's meant to do.
Jon Skeet
Sure - I suppose there is no fast way of doing it - there is an element of hard graft in asking for each sub element - probably by node id. I was hoping there would be some 3.5 LINQ support that would fill a structure with the sub elements automatically. Thanks!
MtTumbledown
I can do this. It gives me a fairly useful list with each node description followed by its value: IEnumerable<XNode> nodeList = element.DescendantNodes(); foreach (XNode xn in nodeList) { Console.WriteLine("name:" + xn); }
MtTumbledown
You don't ask for it by node ID, you ask for it by element name. There's XML serialization to do this for you if you really want - but unless you use that, you'll have to do it yourself. LINQ to XML makes it pretty easy, to be honest.
Jon Skeet
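
To illustrate the "ask for the subelements by element name" approach discussed in the comments above, here is a minimal sketch. The Person class and the PersonParser helper are hypothetical names introduced only for this example; the property names are assumed to mirror the element names in the sample XML from the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical POCO: properties are assumed to mirror the sample XML.
public class Person
{
    public string PayrollNumber { get; set; }
    public string Surname { get; set; }
    public string Name { get; set; }
    public List<string> Skills { get; set; }
    public string HomePhone { get; set; }
    public string MobilePhone { get; set; }
    public string Email { get; set; }
}

// Hypothetical helper: projects each <Person> element into one Person object.
public static class PersonParser
{
    public static List<Person> Parse(XDocument xDoc)
    {
        return xDoc.Descendants("Person")
            .Select(p => new Person
            {
                // The explicit (string) conversion yields null when the child
                // element is missing, so an incomplete record does not throw.
                PayrollNumber = (string)p.Element("PayrollNumber"),
                Surname = (string)p.Element("Surname"),
                Name = (string)p.Element("Name"),
                // Collect every <Skill> under <Skills>; empty list if absent.
                Skills = p.Elements("Skills").Elements("Skill")
                          .Select(s => s.Value)
                          .ToList(),
                HomePhone = (string)p.Element("HomePhone"),
                MobilePhone = (string)p.Element("MobilePhone"),
                Email = (string)p.Element("Email")
            })
            .ToList();
    }
}
```

Because each Person element is handled on its own, a missing field shows up as a null property on that one object rather than shifting values between records.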
A: 

Why not use XML deserialization?

There are two ways to do that.

  • The first one is to modify the business object Person to match the given XML, by adding appropriate attributes to the Person class and its properties. The XML is quite simple, so you would probably only have to change the names where there is no 1:1 match between object properties and XML nodes. For example, you have to specify [XmlArray("Skills")] and [XmlArrayItem("Skill")] for the Skills collection.

  • The second one is to transform the given XML to the one which matches the default serialization of your Person object, then to deserialize.

The second solution will also give you the possibility to filter "bad" records very easily.
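
As a rough sketch of the first approach (attributes on the business object), something along these lines should work with XmlSerializer. The Dataset wrapper, the PersonRecord class and the DatasetLoader helper are hypothetical names chosen for this example; only the element names come from the XML in the question.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical wrapper for the <Dataset> root element.
[XmlRoot("Dataset")]
public class Dataset
{
    // Each <Person> child of the root becomes one list entry.
    [XmlElement("Person")]
    public List<PersonRecord> People { get; set; }
}

// Hypothetical business object; property names match the element names,
// so only the Skills collection needs explicit attributes.
public class PersonRecord
{
    public string PayrollNumber { get; set; }
    public string Surname { get; set; }
    public string Name { get; set; }

    [XmlArray("Skills")]
    [XmlArrayItem("Skill")]
    public List<string> Skills { get; set; }

    public string HomePhone { get; set; }
    public string MobilePhone { get; set; }
    public string Email { get; set; }
}

// Hypothetical helper that deserializes an XML string into a Dataset.
public static class DatasetLoader
{
    public static Dataset Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(Dataset));
        using (var reader = new StringReader(xml))
        {
            return (Dataset)serializer.Deserialize(reader);
        }
    }
}
```

With an existing XmlDocument, passing its OuterXml to DatasetLoader.Load is one simple way to try this out.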

MainMa