Hello,
I have a complex LINQ query (using LINQ to Entities) that can return duplicate results, so I'm using the .Distinct() method to remove them. Here's the skeleton:
var subQuery1 = // one query...
var subQuery2 = // another query...
var result = subQuery1.Distinct().Union( subQuery2.Distinct() ).ToArray();
Each subquery joins a common user table with another table and applies a 'where' filter; the results are then combined by the .Union(...) call. This worked fine until the table was modified to include an XML column, which results in this exception:
the xml data type cannot be selected as distinct because it is not comparable
In this case I don't care whether the XML column is equivalent across the results. I only need to be sure that the primary key UserId is distinct in the results.
Is there a way to use Distinct() but ignore the XML column, or a simpler way to make sure that records with the same UserId are removed from the result efficiently? Ideally this would not retrieve duplicate records from the database and would not require post-processing to remove the duplicates.
Update: I've found out that if I materialize my queries into arrays ahead of time, there is no need for any kind of comparer, since LINQ to Objects doesn't have the XML distinct selection issue. For example, I can do this:
var subQuery1 = // one query...
var subQuery2 = // another query...
var result =
    subQuery1.Distinct().ToArray().Union(
        subQuery2.Distinct().ToArray() )
    .ToArray();
So what I'm really looking for is a way to avoid materializing the intermediate queries and to make a single LINQ to Entities call that will not fetch records with duplicate UserIds. Thanks for all the answers thus far.
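To make the goal concrete, here is a minimal sketch of one possible direction: union only the UserId keys, which are plain comparable values, and then select each user once by key. It is an illustration, not a verified fix. The User class, its Name and ProfileXml properties, the UserQueries class, the UnionByUserId method, and the users parameter are all hypothetical names introduced for the example, and it runs against in-memory data through AsQueryable(). Whether LINQ to Entities translates the same shape into a single SQL statement that avoids DISTINCT on the XML column is an assumption that would need to be checked against the generated SQL.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity; only UserId comes from the question.
public class User
{
    public int UserId { get; set; }
    public string Name { get; set; }
    public string ProfileXml { get; set; } // stands in for the XML column
}

public static class UserQueries
{
    // Union the keys only (ints are comparable), then pick each user
    // exactly once by key, so no row comparison touches the XML column.
    public static User[] UnionByUserId(
        IQueryable<User> users,      // the full user set (hypothetical parameter)
        IQueryable<User> subQuery1,
        IQueryable<User> subQuery2)
    {
        // Union removes duplicate keys across both subqueries.
        var ids = subQuery1.Select(u => u.UserId)
                           .Union(subQuery2.Select(u => u.UserId));

        // Each user appears once in 'users', so the result has unique UserIds.
        return users.Where(u => ids.Contains(u.UserId))
                    .OrderBy(u => u.UserId)
                    .ToArray();
    }
}
```

With in-memory data this can be called as UserQueries.UnionByUserId(list.AsQueryable(), q1, q2). Against a real context, the first argument would presumably be the user entity set, and the subqueries would stay as unmaterialized IQueryable values.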