tags:

views:

778

answers:

5

Is there a built in linq method thing I can use to find out if two sequences contains the same items, not taking the order into account?

For example:

{1, 2, 3} == {2, 1, 3}
{1, 2, 3} != {2, 1, 3, 4}
{1, 2, 3} != {1, 2, 4}

You have the SequenceEquals, but then I would have to Order both sequences first, wouldn't I?

A: 

I think ordering the sequence is the fastest way you can achieve this.

Mehrdad Afshari
For SequenceEquals it has to be the same order.
leppie
"The SequenceEqual<(Of <(TSource>)>)(IEnumerable<(Of <(TSource>)>), IEnumerable<(Of <(TSource>)>)) method enumerates the two source sequences in parallel and compares corresponding elements by using the default equality comparer for TSource, Default."
Svish
Yeah, I misread the question. This is O(nlogn) which is the fastest asymptotic time an algorithm can achieve for this purpose.
Mehrdad Afshari
Even if it is two sequences?
Svish
probably is, when I think about it a bit more....
Svish
Theoretically, the fastest algorithm is to sort one and binary search every element of the other in the sorted version. This does require writing code manually. The other fast algorithm is to build a dictionary of one of them and look up elements of the other through it.
Mehrdad Afshari
Except is an O(n) op, where sorting is at least O(n log n)
leppie
Except works like the latter algorhitm you described :)
leppie
+5  A: 

There are quite a few ways. Assume A and B is IEnumerable.

A.Except(B).Count == 0 && B.Except(A).Count == 0
A.Count == B.Count && A.Intersect(B).Count == B.Count
etc
leppie
using Intersect there, do you need to do it both ways?
Svish
I think this is much slower than ordering the sequences. Unless you are dealing with expression trees.
Mehrdad Afshari
@Svish: Yes, you *do* need to do it both ways A = B iff A Δ B = Φ.
Mehrdad Afshari
What does "A = B iff A Δ B = Φ" mean?
Svish
Set theory. A, B are two sets. A Δ B means (A - B) U (B - A).
Mehrdad Afshari
How can it be slower? Except is an O(n) op, where sorting is at least O(n log n).
leppie
@svish: Updated, you just need to make sure the count is the same, then 1 intersect is sufficient. For except, you need both ways, always good to determine differences (added/removed).
leppie
How does Except work if it's only O(n)? In order for it to be O(n) wouldn't the lists need to be sorted?
John
By adding the elements of the first list to a dictionary.
leppie
query.Count() == 0 is the same as !query.Any(), but the any call is faster because it doesn't evaluate the entire list.
Cameron MacFarland
+1  A: 

I did this for merging new items into a collection without duplicates, it takes two collections and returns all the items with out any duplicates

List<Campaign> nonMatching = (from n in newCampaigns 
where !(from e in Existing select e.Id).Contains<int>(n.Id) 
select n).ToList<Campaign>();

Now by removing the ! for the contains statement

List<Campaign> nonMatching = (from n in newCampaigns 
where (from e in Existing select e.Id).Contains<int>(n.Id) 
select n).ToList<Campaign>();

it will return the duplicates

Bob The Janitor
A: 

If you're really just testing to see if there are duplicates, then leppie's suggestion should work:

if (A.Except(B).Count == 0 && B.Except(A).Count == 0) {...}

But if you just need to arrive at an IEnumerable with no duplicates:

var result = A.Union(B).Distinct();
Daniel
+1  A: 

Try the HashSet class:

var enumA = new[] { 1, 2, 3, 4 };
var enumB = new[] { 4, 3, 1, 2 };

var hashSet = new HashSet<int>(enumA);
hashSet.SymmetricExceptWith(enumB);
Console.WriteLine(hashSet.Count == 0); //true => equal

But that does only work correctly if the values are distinct.

For example

var enumA = new[] { 1, 1, 1, 2 };
var enumB = new[] { 1, 2, 2, 2 };

are also considered as "equal" wit the mentioned method.

Rauhotz