I have an array of arrays that describes the current selection in Excel (obtained through VSTO). Each element holds the start position of a selected range and, when the range spans several columns, its end position.

For example,

int[][] selection = {
new int[] { 1 }, // column A
new int[] { 6 }, // column F
new int[] { 6 }, // column F
new int[] { 8, 9 }, // columns H:I
new int[] { 8, 9 }, // columns H:I
new int[] { 12, 15 } // columns L:O
};

Could you please help me find a way, maybe using LINQ or extension methods, to remove the duplicated elements? I mean F and F, H:I and H:I, and so on.

+2  A: 

If you want to use a pure LINQ/extension method solution, then you'll need to define your own implementation of IEqualityComparer for arrays/sequences. (Unless I'm missing something obvious, there's no pre-existing array or sequence comparer in the BCL). This isn't terribly hard however - here's an example of one that should do the job pretty well:

using System.Collections.Generic;
using System.Linq;

public class SequenceEqualityComparer<T> : IEqualityComparer<IEnumerable<T>>
{
    public bool Equals(IEnumerable<T> x, IEnumerable<T> y)
    {
        return Enumerable.SequenceEqual(x, y);
    }

    // Probably not the best hash function for an ordered list, but it should do the job in most cases.
    public int GetHashCode(IEnumerable<T> obj)
    {
        int hash = 0;
        int i = 0;
        foreach (var element in obj)
            hash = unchecked((hash * 37 + hash) + ((element == null ? 0 : element.GetHashCode()) << (i++ % 16)));
        return hash;
    }
}

The advantage of this is that you can then simply call the following to remove any duplicate arrays.

var result = selection.Distinct(new SequenceEqualityComparer<int>()).ToArray();
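
For reference, here is a minimal self-contained sketch of how the pieces might fit together. The names SelectionDemo and RemoveDuplicates are made up for illustration, and the comparer is repeated with a simpler hash so that the sample compiles on its own. The explicit Distinct<int[]> type argument relies on IEqualityComparer<in T> being contravariant, so this assumes C# 4 / .NET 4 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Comparer repeated here (with a simpler hash) so the sketch is self-contained.
public class SequenceEqualityComparer<T> : IEqualityComparer<IEnumerable<T>>
{
    public bool Equals(IEnumerable<T> x, IEnumerable<T> y)
    {
        return Enumerable.SequenceEqual(x, y);
    }

    public int GetHashCode(IEnumerable<T> obj)
    {
        int hash = 17;
        foreach (var element in obj)
            hash = unchecked(hash * 31 + (element == null ? 0 : element.GetHashCode()));
        return hash;
    }
}

public static class SelectionDemo
{
    // Hypothetical helper: returns the selection with duplicate ranges removed.
    public static int[][] RemoveDuplicates(int[][] selection)
    {
        // Distinct<int[]> accepts the IEnumerable<int> comparer through contravariance.
        return selection.Distinct<int[]>(new SequenceEqualityComparer<int>()).ToArray();
    }

    public static void Main()
    {
        int[][] selection = {
            new int[] { 1 },      // column A
            new int[] { 6 },      // column F
            new int[] { 6 },      // column F
            new int[] { 8, 9 },   // columns H:I
            new int[] { 8, 9 },   // columns H:I
            new int[] { 12, 15 }  // columns L:O
        };

        // Print each remaining range, joining start and end with ':'.
        foreach (int[] range in RemoveDuplicates(selection))
            Console.WriteLine(string.Join(":", range));
    }
}
```

Because the comparer is written against IEnumerable<T>, the same class should also work for List<int> or any other sequence type, not only arrays.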

Hope that helps.

Noldorin
A: 

First you need a way to compare the integer arrays. To use it with the classes in the framework, you do that by implementing an IEqualityComparer<int[]>. If the arrays are always sorted, that is rather easy to implement:

public class IntArrayComparer : IEqualityComparer<int[]> {

 public bool Equals(int[] x, int[] y) {
  if (x.Length != y.Length) return false;
  for (int i = 0; i < x.Length; i++) {
   if (x[i] != y[i]) return false;
  }
  return true;
 }

 public int GetHashCode(int[] obj) {
  int code = 0;
  foreach (int value in obj) code ^= value;
  return code;
 }

}

Now you can use an integer array as the key in a HashSet to get the unique arrays:

int[][] selection = {
 new int[] { 1 }, // column A
 new int[] { 6 }, // column F
 new int[] { 6 }, // column F
 new int[] { 8, 9 }, // columns H:I
 new int[] { 8, 9 }, // columns H:I
 new int[] { 12, 15 } // columns L:O
};

HashSet<int[]> arrays = new HashSet<int[]>(new IntArrayComparer());
foreach (int[] array in selection) {
 arrays.Add(array);
}

The HashSet just throws away duplicate values, so it now contains four integer arrays: { 1 }, { 6 }, { 8, 9 } and { 12, 15 }.
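
If you need the result back as an int[][] in the original order, one possible variation is to use the return value of HashSet.Add, which is true only the first time a value is seen. This is just a sketch: the names UniqueSelection and KeepFirstOccurrences are made up for illustration, and the comparer is repeated so that the sample compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Comparer repeated from above so the sketch is self-contained.
public class IntArrayComparer : IEqualityComparer<int[]>
{
    public bool Equals(int[] x, int[] y)
    {
        if (x.Length != y.Length) return false;
        for (int i = 0; i < x.Length; i++)
        {
            if (x[i] != y[i]) return false;
        }
        return true;
    }

    public int GetHashCode(int[] obj)
    {
        int code = 0;
        foreach (int value in obj) code ^= value;
        return code;
    }
}

public static class UniqueSelection
{
    // Hypothetical helper: keeps the first occurrence of each array, in input order.
    public static int[][] KeepFirstOccurrences(int[][] selection)
    {
        var seen = new HashSet<int[]>(new IntArrayComparer());
        var unique = new List<int[]>();
        foreach (int[] array in selection)
        {
            // Add returns false when an equal array is already in the set.
            if (seen.Add(array)) unique.Add(array);
        }
        return unique.ToArray();
    }

    public static void Main()
    {
        int[][] selection = {
            new int[] { 1 }, new int[] { 6 }, new int[] { 6 },
            new int[] { 8, 9 }, new int[] { 8, 9 }, new int[] { 12, 15 }
        };

        foreach (int[] range in KeepFirstOccurrences(selection))
            Console.WriteLine(string.Join(":", range));
    }
}
```

The XOR hash is cheap but ignores element order and collides easily (for example, { 8, 9 } and { 1 } both hash to 1). That only costs an extra Equals call here, so it should be fine for a handful of selection ranges.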

Guffa