views:

79

answers:

4

When comparing two key-value dictionary sets in C#: set A and set B, what is the best way to enumerate keys present in set A but missing from set B and vice-versa?

For example:

A = { 1, 2, 5 }
B = { 2, 3, 5 }

Comparing B with A, missing keys = { 1 } and new keys = { 3 }.

Using Dictionary<...,...> objects, one can enumerating all values in B and test against set A using A.ContainsKey(key);, but it feels like there should be a better way that might involve a sorted set?

+3  A: 

I'm aware of two built-in ways of doing set differences.

1) Enumerable.Except

Produces the set difference of two sequences by using the default equality comparer to compare values.

Example:

IEnumerable<int> a = new int[] { 1, 2, 5 };
IEnumerable<int> b = new int[] { 2, 3, 5 };

foreach (int x in a.Except(b))
{
    Console.WriteLine(x);  // prints "1"
}

2a) HashSet<T>.ExceptWith

Removes all elements in the specified collection from the current HashSet<T> object.

HashSet<int> a = new HashSet<int> { 1, 2, 5 };
HashSet<int> b = new HashSet<int> { 2, 3, 5 };

a.ExceptWith(b);

foreach (int x in a)
{
    Console.WriteLine(x);  // prints "1"
}

2b) HashSet<T>.SymmetricExceptWith

Modifies the current HashSet<T> object to contain only elements that are present either in that object or in the specified collection, but not both.

HashSet<int> a = new HashSet<int> { 1, 2, 5 };
HashSet<int> b = new HashSet<int> { 2, 3, 5 };

a.SymmetricExceptWith(b);

foreach (int x in a)
{
    Console.WriteLine(x);  // prints "1" and "3"
}

If you need something more performant, you'll probably need to roll your own collection type.

dtb
+1 for suggesting the correct data type: `HashSet<T>`.
Steve Guidi
@Steve Guidi - how would you use HashSet<T> for key-value data?
Steve Townsend
A: 

you can use the Except method.

Dictionary<string, string> dic1 = new Dictionary<string, string>() { { "rabbit", "hat" }, { "frog", "pond" }, { "cat", "house" } };
Dictionary<string, string> dic2 = new Dictionary<string, string>() { { "rabbit", "hat" }, { "dog", "house"}, {"cat", "garden"}};

    var uniqueKeys = dic1.Keys.Except(dic2.Keys);

    foreach (var item in uniqueKeys)
    {
        Console.WriteLine(item);
    }
BrokenGlass
OP wants [`A\B`](http://en.wikipedia.org/wiki/Complement_(set_theory\)) and [`B\A`](http://en.wikipedia.org/wiki/Complement_(set_theory\)), or maybe [`A△B`](http://en.wikipedia.org/wiki/Symmetric_difference), but not [`A∩B`](http://en.wikipedia.org/wiki/Intersection_(set_theory\)).
dtb
ahh I misunderstood, in that case Except would work, fixing.
BrokenGlass
A: 

So there are several answers here that will work. But your original question is best addressed in two parts:

Q) When comparing two key-value dictionary sets in C#: set A and set B, what is the best way to enumerate keys present in set A but missing from set B and vice-versa? Using Dictionary<...,...> objects, one can enumerating all values in B and test against set A using A.ContainsKey(key);, ...

If your starting with two dictionaries this may be the best approach. To do anything else would require creating a copy of the keys from both sets, thus making most alternatives more expensive.

Q) ...but it feels like there should be a better way that might involve a sorted set?

Yes this can be done with a sorted list easily enough. Create two List insertion sorted with BinarySearch, then walk set 1 while searching set 2 ect.

See this SetList Complement and Subtract operations: http://code.google.com/p/csharptest-net/source/browse/trunk/src/Library/Collections/SetList.cs#234

csharptest.net
+1  A: 

Use SortedDictionary : logic is A.Except(A.Intersect(B)).

Don't worry overmuch about performance until you've determined it is an issue to your datasets.

Steve Townsend
+1 easier to optimise a correct program than to correct an optimised program :)
FreshCode