I have two XmlDocuments that contain some of the same information, but each also has nodes that the other does not.

I am using XMLSerialization to put my data into a structure as shown here

I know you can merge XML files by using a DataSet, as shown here, but I want to deserialize the first document into my class and then append the second document to that class structure.

Any ideas how to do that, or is there a better approach? Where the two documents contain the same element, I am happy to overwrite it with the second document's data. For example, each document has a DATE, so my Date property can take the value from the second document.

Here is the data

<ROOT>
<ID>2</ID>
<PART>4a</PART>
<NAME>JEFF</NAME>
<ADDRESS>
    <ST>10001</ST>
    <ID>123456789</ID>
</ADDRESS>
<PARTNUMBER>001</PARTNUMBER>
<DATE>2009 -06-05T16.18.05</DATE>
</ROOT>


<ROOT>
<ID>2</ID>
<PART>4b</PART>
<NAME>JEFF</NAME>
<RELATIVE>
    <ST>10001</ST>
    <ID>1234567890QWERTYUIOP</ID>
</RELATIVE>
<PARTNUMBER>002</PARTNUMBER>
<DATE>2009 -06-05T16.17.41</DATE>
</ROOT>
+2  A: 

You can do something like this:

void Main()
{
 string xml1 = @"<ROOT>
 <ID>2</ID>
 <PART>4a</PART>
 <NAME>JEFF</NAME>
 <ADDRESS>
  <ST>10001</ST>
  <ID>123456789</ID>
 </ADDRESS>
 <PARTNUMBER>001</PARTNUMBER>
 <DATE>2009 -06-05T16.18.05</DATE>
 </ROOT>";


 string xml2 = @"<ROOT>
 <ID>2</ID>
 <PART>4b</PART>
 <NAME>JEFF</NAME>
 <RELATIVE>
  <ST>10001</ST>
  <ID>1234567890QWERTYUIOP</ID>
 </RELATIVE>
 <PARTNUMBER>002</PARTNUMBER>
 <DATE>2009 -06-05T16.17.41</DATE>
 </ROOT>";

 var doc1 = XDocument.Parse(xml1);
 var doc2 = XDocument.Parse(xml2);

 XDocument doc = MergeDocuments(doc1, doc2);
 doc.Dump();
}

static XDocument MergeDocuments(XDocument doc1, XDocument doc2)
{
 var root = MergeElements(doc1.Root, doc2.Root);
 return new XDocument(root);
}

static XElement MergeElements(XElement e1, XElement e2)
{
 var attrComparer = new XAttributeEqualityComparer();
 var nameComparer = new XNameComparer();

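 // Note: XAttribute is not an XNode, so this Cast<XNode>() throws InvalidCastException
 // when enumerated (at ToArray below) if either element has attributes; it is harmless when there are none.
 var attributes = e2.Attributes().Union(e1.Attributes(), attrComparer).Cast<XNode>();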

 var elements1 = e1.Elements().OrderBy(e => e.Name, nameComparer).ToArray();
 var elements2 = e2.Elements().OrderBy(e => e.Name, nameComparer).ToArray();
 var elements = new List<XNode>();
 int i1 = 0, i2 = 0;
 while (i1 < elements1.Length && i2 < elements2.Length)
 {
  XElement e = null;
  int compResult = nameComparer.Compare(elements1[i1].Name, elements2[i2].Name);
  if (compResult < 0)
  {
   e = elements1[i1];
   i1++;
  }
  else if (compResult > 0)
  {
   e = elements2[i2];
   i2++;
  }
  else
  {
   e = MergeElements(elements1[i1], elements2[i2]);
   i1++;
   i2++;
  }
  elements.Add(e);
 }
 while (i1 < elements1.Length)
 {
  elements.Add(elements1[i1]);
  i1++;
 }
 while (i2 < elements2.Length)
 {
  elements.Add(elements2[i2]);
  i2++;
 }

 var nodes = attributes.Concat(elements).ToArray();
 string value = null;
 if (elements.Count == 0)
 {
  if (!string.IsNullOrEmpty(e1.Value))
   value = e1.Value;
  if (!string.IsNullOrEmpty(e2.Value))
   value = e2.Value;
 }
 if (value != null)
  return new XElement(e1.Name, nodes, value);
 else
  return new XElement(e1.Name, nodes);
}

class XNameComparer : IComparer<XName>
{
 public int Compare(XName x, XName y)
 {
  int result = string.Compare(x.Namespace.NamespaceName, y.Namespace.NamespaceName);
  if (result == 0)
   result = string.Compare(x.LocalName, y.LocalName);
  return result;
 }
}

class XAttributeEqualityComparer : IEqualityComparer<XAttribute>
{
 public bool Equals(XAttribute x, XAttribute y)
 {
  return x.Name == y.Name;
 }

 public int GetHashCode(XAttribute x)
 {
  return x.Name.GetHashCode();
 }
}
Thomas Levesque
Wow! Did you write that or use LinqPad? It works, although I don't fully understand it!
Jon
I used LinqPad at first, then debugged with VS ;). Basically, to merge two elements, I merge all their attributes, then all their child elements. Child elements with the same name are merged recursively.
Thomas Levesque
Thanks, I may need to look into LinqPad. Luckily I don't have attributes, so I don't need to check for them or concat them with the elements, and I think I can take that part out. Thanks for the help.
Jon
Out of interest, does LinqPad have some sort of merge functionality, so that you give it the two XML files and it writes the code?
Jon
No, absolutely not... LinqPad doesn't generate code, it's just a tool for running quick tests without creating a full Visual Studio project.
Thomas Levesque
Don't forget to accept the answer if it solves your problem ;)
Thomas Levesque
You're a genius then!
Jon
After I do a merge I want to serialize it. Is it best to call doc.ToString() and serialize that text, or is it more efficient to serialize the XDocument directly?
Jon
Use the Save method of the XDocument object.
Thomas Levesque
It's all in memory, not files. I have tried MyType Data = (MyType)new XmlSerializer(typeof(MyType)).Deserialize(new StringReader(doc.ToString())); and MyType Data = (MyType)new XmlSerializer(typeof(MyType)).Deserialize(doc.CreateReader());
Jon
Save is not only for saving to a file... You can also save to a TextWriter or XmlWriter, which in turn can write to any stream (MemoryStream for instance)
Thomas Levesque
Oh OK, I just want to know the optimal way of passing the XDocument to my Deserialize method.
Jon
You don't need to use XmlSerializer for an XDocument, since it already handles its own serialization... To load the document from a string, use XDocument.Parse. To load it from a file, TextReader or XmlReader, use XDocument.Load.
Thomas Levesque
I think you're getting confused! I am using your code above to merge two XML strings into one, and then I am deserializing the result into a class whose properties represent the elements.
Jon
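
On the deserialization question raised in the comments above, here is a minimal sketch of feeding an in-memory XDocument straight to XmlSerializer through doc.CreateReader(), without going through a string first. The Record class and the MergedDocumentReader helper are hypothetical stand-ins (the asker's real MyType is not shown), and only three of the elements are mapped.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.Serialization;

// Hypothetical target type: only a few of the ROOT children are mapped.
[XmlRoot("ROOT")]
public class Record
{
    [XmlElement("ID")] public int Id { get; set; }
    [XmlElement("PART")] public string Part { get; set; }
    [XmlElement("NAME")] public string Name { get; set; }
}

public static class MergedDocumentReader
{
    // Deserializes directly from the XDocument's node tree via an XmlReader,
    // so no intermediate string or file is needed.
    public static Record Deserialize(XDocument doc)
    {
        var serializer = new XmlSerializer(typeof(Record));
        using (var reader = doc.CreateReader())
        {
            return (Record)serializer.Deserialize(reader);
        }
    }
}
```

Elements that have no matching property (PARTNUMBER, for instance) should simply be skipped. The merge code emits children sorted by name; as far as I know that does not matter here, because XmlSerializer matches elements by name unless an explicit Order is set on the attributes.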
+1  A: 

Thomas's answer is really great. However, although it worked perfectly well on the given XML, I found it had trouble with XML that has attributes (even though the code is written to handle them).

Specifically, this line would throw an InvalidCastException trying to convert from XAttribute to XNode:

var nodes = attributes.Concat(elements).ToArray();

Nonetheless I found that the following changes worked for me. Instead of

var attributes = e2.Attributes().Union(e1.Attributes(), attrComparer).Cast<XNode>();
...
var nodes = attributes.Concat(elements).ToArray();
...
if (value != null)
    return new XElement(e1.Name, nodes, value);
else
    return new XElement(e1.Name, nodes);

Try this:

var attributes = e2.Attributes().Union(e1.Attributes(), attrComparer);
...
// var nodes = attributes.Concat(elements).ToArray();
...
if (value != null)
    return new XElement(e1.Name, attributes, elements, value);
else
    return new XElement(e1.Name, attributes, elements);

Seems to work for me, though I'm no expert on these matters. This is just an FYI for anyone else who comes across this.
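
For anyone wondering why the cast fails: XAttribute and XNode both derive from XObject, but XAttribute is not an XNode, so Cast<XNode>() throws once the sequence actually contains an attribute. The XElement constructor takes arbitrary content objects, so attributes and elements can be handed over as separate arguments. A minimal sketch of both points follows; the ContentDemo class and its method names are made up for illustration and are not part of either answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ContentDemo
{
    // Returns true if casting a sequence of attributes to XNode fails.
    public static bool CastToXNodeThrows(IEnumerable<XAttribute> attributes)
    {
        try
        {
            attributes.Cast<XNode>().ToArray(); // Cast is lazy; ToArray forces it
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // Builds an element by passing attributes and child elements to the
    // constructor as two separate content arguments, with no common cast.
    public static XElement Build(XName name,
                                 IEnumerable<XAttribute> attributes,
                                 IEnumerable<XElement> elements)
    {
        return new XElement(name, attributes, elements);
    }
}
```

With a single attribute id="2" and a single child <PART>4b</PART>, Build("ROOT", ...) produces a ROOT element carrying both, while CastToXNodeThrows returns true for that attribute list and false for an empty one. That matches the original code working on the question's attribute-free XML.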

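EDIT: In addition, note that doc.Dump() doesn't exist for me and breaks when compiling. I'm using .NET 3.5. Dump() is an extension method supplied by LinqPad (which Thomas says he used), not part of the .NET Framework, so outside LinqPad it needs replacing with something like Console.WriteLine(doc). It is unrelated to the InvalidCastException above.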

Gavin Schultz-Ohkubo