views:

644

answers:

3

Does anyone have a good and efficient extension method for finding if a sequence of items has any duplicates?

Guess I could put return subjects.Distinct().Count() == subjects.Count() into an extension method, but kind of feels that there should be a better way. That method would have to count elements twice and sort out all the distict elements. A better implementation should return true on the first duplicate it finds. Any good suggestions?

I imagine the outline could be something like this:

public static bool HasDuplicates<T>(this IEnumerable<T> subjects)
{
    return subjects.HasDuplicates(EqualityComparer<T>.Default);
}

public static bool HasDuplicates<T>(this IEnumerable<T> subjects, IEqualityComparer<T> comparer)
{
    ...
}

But not quite sure how a smart implementation of it would be...

+13  A: 
public static bool HasDuplicates<T>(this IEnumerable<T> subjects)
{
    return HasDuplicates(subjects, EqualityComparer<T>.Default);
}

public static bool HasDuplicates<T>(this IEnumerable<T> subjects, IEqualityComparer<T> comparer)
{
    HashSet<T> set = new HashSet<T>(comparer);
    foreach (T item in subjects)
    {
        if (!set.Add(item))
            return true;
    }

    return false;
}
280Z28
Now see *that* is pretty clever... will have to try that one out!
Svish
This could be handy... I don't think HashSet is supported in the compact framework though... grrrrr.
Jason Down
You can do the same thing with `Dictionary<T,K>`, but it's a bit less memory efficient. `Dictionary<T,K>.Add` doesn't return a `bool`, but its `Count` property would be trivially fast to check at the expense of a couple more lines of code.
280Z28
+1  A: 

I think the simplest extension method is the following.

public static bool HasDuplicates<T>(this IEnumerable<T> enumerable) {
  var hs = new HashSet<T>();
  foreach ( var cur in enumerable ) {
    if ( !hs.Add(cur) ) {
      return false;
    }
  }
}
JaredPar
Think that one is missing a return statement, as well returning the oposite of what it should? Anyways, very similar to the other answer and you have enough rep so accepted the other answer ;p
Svish
+4  A: 

This is in production code. Works great:

public static bool HasDuplicates<T>(this IEnumerable<T> sequence) {
    var set = new HashSet<T>();
    return !sequence.All(item => set.Add(item));
}
Rich Armstrong
Looks to be the pretty much the same as my accepted answer, except quite a bit shorter. Interesting :)
Svish
Yes, same approach as the first answer (from 280Z28); didn't mean to suggest otherwise.
Rich Armstrong