I'm currently working on a Silverlight app and need to convert XML data into appropriate objects to data bind to. The basic class definition for this discussion is:

public class TabularEntry
    {
     public string Tag { get; set; }
     public string Description { get; set; }
     public string Code { get; set; }
     public string UseNote { get; set; }
     public List<string> Excludes { get; set; }
     public List<string> Includes { get; set; }
     public List<string> Synonyms { get; set; }
     public string Flags { get; set; }
     public List<TabularEntry> SubEntries { get; set; }
    }

An example of the XML that might come in to feed this object follows:

<I4 Ref="1">222.2
    <DX>Prostate</DX>
    <EX>
     <I>adenomatous hyperplasia of prostate (600.20-600.21)</I>
     <I>prostatic:
      <I>adenoma (600.20-600.21)</I>
      <I>enlargement (600.00-600.01)</I>
      <I>hypertrophy (600.00-600.01)</I>
     </I>
    </EX>
    <FL>M</FL>
</I4>

So, various nodes map to specific properties. The key ones for this question are the <EX> and <I> nodes. The <EX> nodes will contain a collection of one or more <I> nodes and in this example matches up to the 'Excludes' property in the above class definition.

Here comes the challenge (for me). I don't have control over the web service that emits this XML, so changing it isn't an option. You'll notice that in this example one <I> node also contains another collection of one or more <I> nodes. I'm hoping to use a LINQ to XML query that consolidates both levels into a single collection and marks the lower-level items with a delimiter character. In this example, when the LINQ query returns a TabularEntry object, its Excludes collection would appear as follows:

  • adenomatous hyperplasia of prostate (600.20-600.21)
  • prostatic:
  • *adenoma (600.20-600.21)
  • *enlargement (600.00-600.01)
  • *hypertrophy (600.00-600.01)

So, in the XML the last 3 entries are actually child objects of the second entry, but in the object's Excludes property, they are all part of the same collection, with the former child objects containing an identifier character/string.

I have the beginnings of the LINQ query below, but I can't quite figure out the bit that will consolidate the child objects for me. The code as it exists right now is:

List<TabularEntry> GetTabularEntries(XElement source)
     {
      List<TabularEntry> result;

      result = (from tabularentry in source.Elements()
          select new TabularEntry()
          {
           Tag = tabularentry.Name.ToString(),
           Description = tabularentry.Element("DX").ToString(),
           Code = tabularentry.FirstNode.ToString(),
           UseNote = tabularentry.Element("UN") == null ? null : tabularentry.Element("UN").Value,
           Excludes = (from i in tabularentry.Element("EX").Elements("I")
               select i.Value).ToList()
          }).ToList();

      return result;
     }

I'm thinking that I need to nest a FROM statement inside the

Excludes = (from i...)

statement to gather up the child nodes, but can't quite work it through. Of course, that may be because I'm off in the weeds a bit on my logic.

If you need more info to answer, feel free to ask.

Thanks in advance,

Steve

A: 

Descendants will get you all of the I children. FirstNode will help separate the value of prostatic: from the values of its children. There's a return character in the value of prostatic:, which I removed with Trim.

XElement x = XElement.Parse(@"
<EX>
  <I>adenomatous hyperplasia of prostate (600.20-600.21)</I>
  <I>prostatic:
    <I>adenoma (600.20-600.21)</I>
    <I>enlargement (600.00-600.01)</I>
    <I>hypertrophy (600.00-600.01)</I>
  </I>
</EX>");
//
List<string> result = x
  .Descendants(@"I")
  .Select(i => i.FirstNode.ToString().Trim())
  .ToList();

Here's a hacky way to get those asterisks in. I don't have time to improve it.

List<string> result2 = x
  .Descendants(@"I")
  .Select(i =>
    new string('*', i.Ancestors(@"I").Count())
    + i.FirstNode.ToString().Trim())
  .ToList();
David B
+2  A: 

Try this:

    List<TabularEntry> GetTabularEntries(XElement source)
    {
        List<TabularEntry> result;

        result = (from tabularentry in source.Elements()
                  select new TabularEntry()
                  {
                      Tag = tabularentry.Name.ToString(),
                      Description = tabularentry.Element("DX").ToString(),
                      Code = tabularentry.FirstNode.ToString(),
                      UseNote = tabularentry.Element("UN") == null ? null : tabularentry.Element("UN").Value,
                      Excludes = (from i in tabularentry.Element("EX").Descendants("I")
                                  select (i.Parent.Name == "I" ? "*" + i.Value : i.Value)).ToList()

                  }).ToList();

        return result;
    }

(edit)

If you need the current nested level of "I" you could do something like:

    List<TabularEntry> GetTabularEntries(XElement source)
    {
        List<TabularEntry> result;

        result = (from tabularentry in source.Elements()
                  select new TabularEntry()
                  {
                      Tag = tabularentry.Name.ToString(),
                      Description = tabularentry.Element("DX").ToString(),
                      Code = tabularentry.FirstNode.ToString(),
                      UseNote = tabularentry.Element("UN") == null ? null : tabularentry.Element("UN").Value,
                      Excludes = (from i in tabularentry.Element("EX").Descendants("I")
                                  select (ElementWithPrefix(i, '*'))).ToList()

                  }).ToList();

        return result;
    }

    string ElementWithPrefix(XElement element, char c)
    {
        string prefix = "";
        for (XElement e = element.Parent; e != null && e.Name == "I"; e = e.Parent)
        {
            prefix += c;
        }
        return prefix + ExtractTextValue(element);
    }

    string ExtractTextValue(XElement element)
    {
        if (element.HasElements)
        {
            return element.Value.Split(new[] { '\n' })[0].Trim();
        }
        else
            return element.Value.Trim();
    }

Input:

<EX>
 <I>adenomatous hyperplasia of prostate (600.20-600.21)</I>
 <I>prostatic:
  <I>adenoma (600.20-600.21)</I>
  <I>enlargement (600.00-600.01)</I>
  <I>hypertrophy (600.00-600.01)
   <I>Bla1</I>
   <I>Bla2
    <I>BlaBla1</I>
   </I>
   <I>Bla3</I>
  </I>
 </I>
</EX>

Result:

  • adenomatous hyperplasia of prostate (600.20-600.21)
  • prostatic:
  • *adenoma (600.20-600.21)
  • *enlargement (600.00-600.01)
  • *hypertrophy (600.00-600.01)
  • **Bla1
  • **Bla2
  • ***BlaBla1
  • **Bla3
bruno conde
Thanks Bruno. I just got called in to deal with an issue in another product. I will try this ASAP.
Steve Brouillard
Bruno, I used your top example. One point of note: though it was not clear from my example, there could actually be zero or more <EX> nodes in this stream, so I had to use the .Elements (plural) syntax rather than .Element (singular) to make this work. Just an FYI for others.
Steve Brouillard
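
For anyone following up on the .Elements (plural) point above, here is a minimal sketch of how the Excludes projection might look when an entry can carry zero or more <EX> nodes. The names ExcludesSketch, GetExcludes and OwnText are hypothetical and not from the thread. It assumes the entry tags themselves are never named "I" (they are I4 in the sample), so counting "I" ancestors gives the nesting depth.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ExcludesSketch
{
    // Hypothetical helper: flattens every <I> found under any <EX> child
    // of 'entry' into a single list, prefixing one '*' per enclosing <I>.
    public static List<string> GetExcludes(XElement entry)
    {
        return entry.Elements("EX")      // empty sequence when there is no <EX>, so no null check is needed
            .Descendants("I")            // every <I> at any depth, in document order
            .Select(i => new string('*', i.Ancestors("I").Count()) + OwnText(i))
            .ToList();
    }

    // Returns only the text that sits directly inside the element,
    // ignoring the text of nested child elements.
    private static string OwnText(XElement element)
    {
        string[] parts = element.Nodes()
            .OfType<XText>()
            .Select(t => t.Value)
            .ToArray();
        return string.Join("", parts).Trim();
    }
}
```

Inside the query this would be used as Excludes = ExcludesSketch.GetExcludes(tabularentry). Because Elements returns an empty sequence rather than null, an entry without any <EX> node simply ends up with an empty list.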