views:

970

answers:

4
 class Program
{
    static void Main(string[] args)
    {
        List<Book> books = new List<Book> 
        {
            new Book
            {
                Name="C# in depth",
                Authors = new List<Author>
                {
                    new Author 
                    {
                        FirstName = "Jon", LastName="Skeet"
                    },
                     new Author 
                    {
                        FirstName = "Jon", LastName="Skeet"
                    },                       
                }
            },
            new Book
            {
                Name="LINQ in Action",
                Authors = new List<Author>
                {
                    new Author 
                    {
                        FirstName = "Fabrice", LastName="Marguerie"
                    },
                     new Author 
                    {
                        FirstName = "Steve", LastName="Eichert"
                    },
                     new Author 
                    {
                        FirstName = "Jim", LastName="Wooley"
                    },
                }
            },
        };


        var temp = books.SelectMany(book => book.Authors).Distinct();
        foreach (var author in temp)
        {
            Console.WriteLine(author.FirstName + " " + author.LastName);
        }

        Console.Read();
    }

}
public class Book
{
    public string Name { get; set; }
    public List<Author> Authors { get; set; }
}
public class Author
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public override bool Equals(object obj)
    {
        return true;
        //if (obj.GetType() != typeof(Author)) return false;
        //else return ((Author)obj).FirstName == this.FirstName && ((Author)obj).FirstName == this.LastName;
    }

}

This is based on an example in LINQ in action. Listing 4.16.

This prints Jon Skeet twice. Why? I have even tried overriding Euals method in Author class. Still Distinct does not seem to work. What am I missing?

Edit: I have added == and != operator overload too. Still no help.

 public static bool operator ==(Author a, Author b)
    {
        return true;
    }
    public static bool operator !=(Author a, Author b)
    {
        return false;
    }
+6  A: 

The Distinct() method checks reference equality for reference types. This means it is looking for literally the same object duplicated, not different objects which contain the same values.

There is an overload which takes an IEqualityComparer, so you can specify different logic for determining whether a given object equals another.

If you want Author to normally behave like a normal object (i.e. only reference equality), but for the purposes of Distinct check equality by name values, use an IEqualityComparer. If you always want Author objects to be compared based on the name values, then override GetHashCode and Equals, or implement IEquatable.

The two members on the IEqualityComparer interface are Equals and GetHashCode. Your logic for determining whether two Author objects are equal appears to be if the First and Last name strings are the same.

public class AuthorEquals : IEqualityComparer<Author>
{
    public bool Equals(Author left, Author right)
    {
        if((object)left == null && (object)right == null)
        {
            return true;
        }
        if((object)left == null || (object)right == null)
        {
            return false;
        }
        return left.FirstName == right.FirstName && left.LastName == right.LastName;
    }

    public int GetHashCode(Author author)
    {
        return (author.FirstName + author.LastName).GetHashCode();
    }
}
Rex M
thnx. It also works.
Tanmoy
+1 nice explanation
blu
+6  A: 

LINQ Distinct is not that smart when it comes to custom objects.

All it does is look at your list and see that it has two different objects (it doesn't care that they have the same values for the member fields).

One workaround is to implement the IEquatable interface as shown here.

If you modify your Author class like so it should work.

public class Author : IEquatable<Author>
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public bool Equals(Author other)
    {
        if (FirstName == other.FirstName && LastName == other.LastName)
            return true;

        return false;
    }

    public override int GetHashCode()
    {
        int hashFirstName = FirstName == null ? 0 : FirstName.GetHashCode();
        int hashLastName = LastName == null ? 0 : LastName.GetHashCode();

        return hashFirstName ^ hashLastName;
    }
}
skalburgi
thanks. it workes. :)
Tanmoy
IEquatable is fine but incomplete; you should *always* implemement Object.Equals() and Object.GetHashCode() together; IEquatable<T>.Equals does not override Object.Equals, so this will fail when making non-strongly typed comparisons, which occurs often in frameworks and always in non-generic collections.
AndyM
So is it better to use the override of Distinct that takes IEqualityComparer<T> as Rex M has suggested? I mean what I should be doing if I dont want to fall into the trap.
Tanmoy
@Tanmoy it depends. If you want Author to normally behave like a normal object (i.e. only reference equality) but check the name values for the purpose of Distinct, use an IEqualityComparer. If you *always* want Author objects to be compared based on the name values, then override GetHashCode and Equals, or implement IEquatable.
Rex M
+2  A: 

You've overriden Equals(), but make sure you also override GetHashCode()

Eric King
A: 

Distinct() performs the default equality comparison on objects in the enumerable. If you have not overridden Equals() and GetHashCode(), then it uses the default implementation on object, which compares references.

The simple solution is to add a correct implementation of Equals() and GetHashCode() to all classes which participate in the object graph you are comparing (ie Book and Author).

The IEqualityComparer interface is a convenience that allows you to implement Equals() and GetHashCode() in a separate class when you don't have access to the internals of the classes you need to compare, or if you are using a different method of comparison.

AndyM