This will probably be an extremely simple question. I'm simply trying to remove duplicate byte[]s from a collection.

Since the default behaviour is to compare references, I thought that creating an IEqualityComparer would work, but it doesn't.

I've tried using a HashSet and LINQ's Distinct().

Sample code:

using System;
using System.Collections.Generic;
using System.Linq;

namespace cstest
{
    class Program
    {
        static void Main(string[] args)
        {
            var l = new List<byte[]>();
            l.Add(new byte[] { 5, 6, 7 });
            l.Add(new byte[] { 5, 6, 7 });
            Console.WriteLine(l.Distinct(new ByteArrayEqualityComparer()).Count());
            Console.ReadKey();
        }
    }

    class ByteArrayEqualityComparer : IEqualityComparer<byte[]>
    {
        public bool Equals(byte[] x, byte[] y)
        {
            return x.SequenceEqual(y);
        }

        public int GetHashCode(byte[] obj)
        {
            return obj.GetHashCode();
        }
    }
}

Output:

2
+3  A: 

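Distinct checks GetHashCode first and only calls Equals on items whose hash codes match. Your GetHashCode returns obj.GetHashCode(), which for an array is reference-based, so it won't work "as is"; try something like: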

int result = 13 * obj.Length;
for(int i = 0 ; i < obj.Length ; i++) {
    result = (17 * result) + obj[i];
}
return result;

which guarantees that arrays with the same contents produce the same hash code, as the comparer contract requires.

Personally, I would also write out the equality test by hand for performance, rather than going through SequenceEqual:

if(ReferenceEquals(x,y)) return true;
if(x == null || y == null) return false;
if(x.Length != y.Length) return false;
for(int i = 0 ; i < x.Length; i++) {
    if(x[i] != y[i]) return false;
}
return true;
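
For reference, the two fragments above can be dropped into the comparer from the question. The following is only a minimal sketch: the class name ContentByteArrayComparer is hypothetical, and the null check in GetHashCode and the unchecked block are assumptions added here, not part of the fragments.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical name: compares byte arrays by content instead of by reference.
sealed class ContentByteArrayComparer : IEqualityComparer<byte[]>
{
    public bool Equals(byte[] x, byte[] y)
    {
        if (ReferenceEquals(x, y)) return true;    // same instance, or both null
        if (x == null || y == null) return false;  // exactly one is null
        if (x.Length != y.Length) return false;    // different lengths cannot match
        for (int i = 0; i < x.Length; i++)
        {
            if (x[i] != y[i]) return false;
        }
        return true;
    }

    public int GetHashCode(byte[] obj)
    {
        if (obj == null) return 0; // assumption: hash a null array as 0
        unchecked // assumption: let the arithmetic wrap even in a checked build
        {
            int result = 13 * obj.Length;
            for (int i = 0; i < obj.Length; i++)
            {
                result = (17 * result) + obj[i];
            }
            return result;
        }
    }
}
```

With this in place, l.Distinct(new ContentByteArrayComparer()).Count() on the list from the question should give 1, and the same comparer can be passed to the HashSet<byte[]> constructor.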
Marc Gravell
Thanks, it works. Looks like Array.GetHashCode() does not depend on the contents of the array (I get different results every time I launch the app).
GameZelda
@GameZelda - indeed; arrays simply use reference equality until you supply your own comparer.
Marc Gravell
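
To illustrate the reference-equality point from the comments, here is a small sketch that compares two separate arrays with identical contents. The class and method names (ArrayEqualityDemo, Compare) are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Linq;

static class ArrayEqualityDemo
{
    // Compares two distinct arrays that hold the same bytes, three different ways.
    public static (bool DefaultEquals, bool SameReference, bool SameContents) Compare()
    {
        var a = new byte[] { 5, 6, 7 };
        var b = new byte[] { 5, 6, 7 };
        return (
            a.Equals(b),           // Object.Equals: reference comparison for arrays
            ReferenceEquals(a, b), // two separate instances
            a.SequenceEqual(b)     // element-by-element comparison
        );
    }
}
```

Compare() returns (false, false, true): only the element-wise check treats the two arrays as equal. The default GetHashCode follows the same identity-based logic, which is why it cannot be reused inside a content-based comparer.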