I need to union two sets of XElements into a single, unique set of elements. Using the .Union() extension method, I just get a "union all" instead of a union. Am I missing something?

var elements = xDocument.Descendants(w + "sdt")
                   .Union(otherDocument.Descendants(w + "sdt")
                   .Select(sdt =>
                       new XElement(
                           sdt.Element(w + "sdtPr")
                               .Element(w + "tag")
                               .Attribute(w + "val").Value,
                           GetTextFromContentControl(sdt).Trim())
                   )
               );
+1  A: 

It's hard to troubleshoot your "left join" observation without seeing what led you to that conclusion. Here's my shot in the dark.

XDocument doc1 = XDocument.Parse(@"<XML><A/><C/></XML>");
XDocument doc2 = XDocument.Parse(@"<XML><B/><C/></XML>");

// Plain Union: no element is treated as a duplicate, so all six come back.
var query1 = doc1.Descendants().Union(doc2.Descendants());
Console.WriteLine(query1.Count());
foreach (XElement e in query1) Console.WriteLine("--{0}", e.Name);

// Output:
// 6
// --XML
// --A
// --C
// --XML
// --B
// --C

// Concat, then group by element name and keep the first element of each group.
var query2 = doc1.Descendants().Concat(doc2.Descendants())
  .GroupBy(x => x.Name)
  .Select(g => g.First());
Console.WriteLine(query2.Count());
foreach (XElement e in query2) Console.WriteLine("--{0}", e.Name);

// Output:
// 4
// --XML
// --A
// --C
// --B

In LINQ to Objects (which is what LINQ to XML really is), Union uses the default equality comparer to detect duplicates. For a reference type that does not override Equals and GetHashCode, that means reference equality, and XElement is such a type.
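
To make the reference-equality point concrete, here is a minimal sketch that reuses the two sample documents from above. The class and method names (`ReferenceEqualityDemo`, `Counts`) are made up for illustration. It contrasts the default comparison with `XNode.EqualityComparer`, the structural comparer that ships with LINQ to XML; that comparer looks at the whole element, which may or may not match the equality you actually need.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ReferenceEqualityDemo
{
    // Returns the size of the union of the root's children, first with the
    // default comparer and then with the structural (deep) comparer.
    public static (int ByReference, int ByValue) Counts()
    {
        XDocument doc1 = XDocument.Parse("<XML><A/><C/></XML>");
        XDocument doc2 = XDocument.Parse("<XML><B/><C/></XML>");

        // Default comparer: each XElement instance is its own value,
        // so the two <C/> elements are both kept.
        int byReference = doc1.Root.Elements()
            .Union(doc2.Root.Elements())
            .Count();

        // XNode.EqualityComparer compares name, attributes and content,
        // so the second <C/> is recognised as a duplicate.
        int byValue = doc1.Root.Elements()
            .Union<XElement>(doc2.Root.Elements(), XNode.EqualityComparer)
            .Count();

        return (byReference, byValue);
    }

    public static void Main()
    {
        var c = new XElement("C");
        var otherC = new XElement("C");

        Console.WriteLine(c.Equals(otherC));            // False
        Console.WriteLine(XNode.DeepEquals(c, otherC)); // True
        Console.WriteLine(Counts());                    // (4, 3)
    }
}
```

Deep comparison only helps when duplicates are identical in every detail. If two elements should count as equal because one nested value matches, a comparer built around that value is the better fit.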

David B
Turns out I was wrong about my "left join" issue. That was my fault. However, the Union operator returns a "union all" (edited my original question to reflect this). I'll try your solution.
Ryan Riley
This isn't working for me as the difference in the elements is the attribute value of grandchild elements (see my response below).
Ryan Riley
A: 

I was able to get the following to work, but it is quite ugly:

var elements = xDocument.Descendants(w + "sdt")
                   .Concat(otherDocument.Descendants(w + "sdt")
                               .Where(e => !xDocument.Descendants(w + "sdt")
                                               .Any(x => x.Element(w + "sdtPr")
                                                             .Element(w + "tag")
                                                             .Attribute(w + "val").Value ==
                                                         e.Element(w + "sdtPr")
                                                             .Element(w + "tag")
                                                             .Attribute(w + "val").Value)))
                   .Select(sdt =>
                       new XElement(
                           sdt.Element(w + "sdtPr")
                               .Element(w + "tag")
                               .Attribute(w + "val").Value,
                           GetTextFromContentControl(sdt).Trim())
                   );

Surely there must be a better way.

Ryan Riley
A: 

What about something like this?

var xDoc = from f in xDocument.Descendants(w + "sdt")
 select new {xNode = f, MatchOn = f.Element(w + "sdtPr").Element(w + "tag").Attribute(w + "val").Value };

var oDoc = from o in otherDocument.Descendants(w + "sdt")
 select new {MatchOn = o.Element(w + "sdtPr").Element(w + "tag").Attribute(w + "val").Value };

var elements = from x in xDoc.Where(f => !oDoc.Any(o => o.MatchOn == f.MatchOn))
 select new XElement(x.MatchOn, GetTextFromContentControl(x.xNode).Trim());
Dave Markle
+3  A: 

Your first impulse was almost correct. :) As David B says, if you hand LINQ a bunch of XElements without telling it how you define equality, it compares them by reference. Fortunately, you can tell it to use different criteria by passing an IEqualityComparer<XElement>: an object whose Equals method returns true if and only if two XElements are equal by your definition, and whose GetHashCode method returns a hash code derived from the same criteria.

For example:

var elements = xDocument.Descendants(w + "sdt")
               .Union(otherDocument.Descendants(w + "sdt"), new XElementComparer())
               .RestOfYourCode

...

Somewhere else in your project

public class XElementComparer : IEqualityComparer<XElement>
{
    public bool Equals(XElement x, XElement y)
    {
        // Return true when x and y are equal according to your standards.
        throw new NotImplementedException();
    }

    public int GetHashCode(XElement obj)
    {
        // Return a hash code based on whatever you used to determine Equals.
        // For example, if you determine equality based on the ID attribute,
        // return the hash code of the ID attribute's value.
        throw new NotImplementedException();
    }
}

Note: I do not have the framework at home, so the exact code is not tested and the IEqualityComparer code is from here (scroll down to second post).
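
For the case in the question, where two sdt elements count as the same when their tag values match, a filled-in comparer might look like the sketch below. Several things here are assumptions and not part of the original posts: `KeyedXElementComparer`, `Demo`, `TagValue`, `UnionTags` and `Doc` are hypothetical names, `w` is assumed to be the standard WordprocessingML namespace, and the sample documents are invented. The final `Select` returns the tag value only, because `GetTextFromContentControl` is not shown in the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: compares elements by a caller-supplied string key.
public sealed class KeyedXElementComparer : IEqualityComparer<XElement>
{
    private readonly Func<XElement, string> _key;

    public KeyedXElementComparer(Func<XElement, string> key)
    {
        _key = key ?? throw new ArgumentNullException(nameof(key));
    }

    public bool Equals(XElement x, XElement y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return string.Equals(_key(x), _key(y), StringComparison.Ordinal);
    }

    // Must agree with Equals: elements with equal keys get equal hash codes.
    public int GetHashCode(XElement obj)
    {
        string key = obj is null ? null : _key(obj);
        return key is null ? 0 : StringComparer.Ordinal.GetHashCode(key);
    }
}

public static class Demo
{
    // Assumption: w is the WordprocessingML main namespace.
    static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Reads sdt/sdtPr/tag/@val; yields null if any part is missing.
    public static string TagValue(XElement sdt) =>
        (string)sdt.Element(w + "sdtPr")?.Element(w + "tag")?.Attribute(w + "val");

    // Unions the sdt elements of both documents, treating elements with
    // the same tag value as duplicates, and returns the surviving tag values.
    public static List<string> UnionTags(XDocument a, XDocument b) =>
        a.Descendants(w + "sdt")
         .Union(b.Descendants(w + "sdt"), new KeyedXElementComparer(TagValue))
         .Select(e => TagValue(e))
         .ToList();

    // Builds an invented sample document with one sdt per tag value.
    public static XDocument Doc(params string[] tags) => new XDocument(
        new XElement(w + "body",
            tags.Select(t => new XElement(w + "sdt",
                new XElement(w + "sdtPr",
                    new XElement(w + "tag", new XAttribute(w + "val", t)))))));

    public static void Main()
    {
        var tags = UnionTags(Doc("Name", "Date"), Doc("Date", "Total"));
        Console.WriteLine(string.Join(", ", tags)); // Name, Date, Total
    }
}
```

Taking the key as a delegate keeps the comparer reusable with Distinct, Except and Intersect as well. When both documents contain the same key, Union keeps the element from the first sequence.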

Ria
This was perfect. Thanks!
Ryan Riley