I have a Customer[] array, and I want to use it to create a Dictionary<Customer, string>. What is the easiest way to check the array for duplicates before I load the Dictionary? I want to avoid "ArgumentException: An item with the same key has already been added". Thanks.

+6  A: 

Just call Dictionary.ContainsKey(key) before you add your Customers.

Jan Bannister
+5  A: 

You could use LINQ to do both:

Customer[] customers; // initialized somehow...
var customerDictionary = customers.Distinct().ToDictionary( cust => cust, cust => cust.SomeKey );

If you build the dictionary in a less straightforward fashion, you can just use the Distinct() extension method to get an array of unique items like so:

Customer[] uniqueCustomers = customers.Distinct().ToArray();

If you need to be aware of potential duplicates, you could use GroupBy( c => c ) first to identify which items have duplicates.
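A minimal sketch of the GroupBy idea, assuming Customer defines value equality. The Customer class, its Ssn property, and the DuplicateFinder helper below are hypothetical stand-ins, not part of the original question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the asker's Customer type; equality here is by Ssn.
public sealed class Customer : IEquatable<Customer>
{
    public string Ssn { get; }
    public Customer(string ssn) { Ssn = ssn; }

    public bool Equals(Customer other) => other != null && Ssn == other.Ssn;
    public override bool Equals(object obj) => Equals(obj as Customer);
    public override int GetHashCode() => Ssn.GetHashCode();
}

public static class DuplicateFinder
{
    // Returns one representative of every customer that occurs more than once.
    public static List<Customer> FindDuplicates(IEnumerable<Customer> customers)
    {
        return customers
            .GroupBy(c => c)            // groups via Customer.Equals/GetHashCode
            .Where(g => g.Count() > 1)  // keep only groups with repeats
            .Select(g => g.Key)
            .ToList();
    }
}
```

An empty result means the array is safe to load; otherwise the returned list tells you which customers were passed in more than once.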

Finally, if you don't want to use LINQ, you can build the dictionary on the fly and use a precondition check when adding each item:

var customerDictionary = new Dictionary<Customer,string>();
foreach( var cust in customers )
{
    if( !customerDictionary.ContainsKey(cust) )
        customerDictionary.Add( cust, cust.SomeKey ); 
}
LBushkin
I like the look of this approach, and I'm reluctant to talk about performance in this case, but wouldn't calling Distinct() on the array involve doing a lot of comparisons? Dictionary.ContainsKey is roughly O(1).
Josh Smeaton
@Josh: My understanding is that LINQ's Distinct() operator internally builds a hashset structure to optimize its performance. So it should perform better than just iteratively searching through a list for duplicates. Read this SO question for more: http://stackoverflow.com/questions/146358/efficiently-merge-string-arrays-in-net-keeping-distinct-values
LBushkin
+2  A: 

How big is the array, and how likely is it that there will be duplicates?

Checking each element of the array against all the others is quite an expensive operation.

It would be quicker to call Dictionary.ContainsKey(key) before adding each item.
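If you only need to know up front whether any duplicate exists, the same hash-lookup idea can be sketched with a HashSet<T> instead of the dictionary itself. This is a swapped-in variation rather than the ContainsKey call above, and the DuplicateCheck helper name is made up for illustration:

```csharp
using System.Collections.Generic;

public static class DuplicateCheck
{
    // Returns true as soon as any item repeats: one hash lookup per item
    // instead of comparing every element against all the others.
    public static bool HasDuplicates<T>(IEnumerable<T> items)
    {
        var seen = new HashSet<T>();
        foreach (var item in items)
        {
            // HashSet<T>.Add returns false when the item is already present.
            if (!seen.Add(item))
                return true;
        }
        return false;
    }
}
```

Like the dictionary, the set relies on the element type's Equals and GetHashCode, so it finds exactly the duplicates that Dictionary.Add would reject.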

NOTE: If duplicates are rare then you could use exception handling, but that's bad programming practice.

ChrisF
Small arrays. This was the direction I was headed, but I would like to know up front if dupes were passed in before I begin processing. The string in Dictionary<Customer, string> is some response XML from a web service associated with the Customer object.
Jack T. Colton
Using exception handling for flow control is not a desirable practice.
LBushkin
I agree, LBushkin.
Jack T. Colton
Bad practice to use exceptions for process flow!
Tor Haugen
A: 

Why not this?

customers.Distinct().ToDictionary(o => o, o => GenerateString(o));
Jason Punyon
+1  A: 

What is your definition of duplicate in this case?

If it's simply the same object instance (the same reference), then that's simple: you can use any of the methods in the other answers given here.

Sometimes, though, the concept of equality is not so straightforward: is a different object instance with the same data equal? In that case you probably want an implementation of IEqualityComparer<T> to help you.
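As a rough sketch of such a comparer, assume for illustration that two customers count as equal when they share an SSN. The Customer shape, the Ssn and Name properties, and the class names below are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Customer type; only Ssn matters for equality in this sketch.
public sealed class Customer
{
    public string Ssn { get; set; }
    public string Name { get; set; }
}

// Treats two Customer instances as equal when their Ssn values match.
public sealed class CustomerSsnComparer : IEqualityComparer<Customer>
{
    public bool Equals(Customer x, Customer y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return string.Equals(x.Ssn, y.Ssn, StringComparison.Ordinal);
    }

    public int GetHashCode(Customer obj)
    {
        return obj == null || obj.Ssn == null ? 0 : obj.Ssn.GetHashCode();
    }
}

public static class ComparerDemo
{
    public static Dictionary<Customer, string> Build(Customer[] customers)
    {
        var comparer = new CustomerSsnComparer();
        // Pass the same comparer to Distinct and ToDictionary so both
        // agree on what counts as a duplicate.
        return customers
            .Distinct(comparer)
            .ToDictionary(c => c, c => c.Name, comparer);
    }
}
```

Passing the comparer to the dictionary as well matters: without it, later lookups would fall back to reference equality and miss a different instance with the same SSN.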

Tim Jarvis
Let's say two Customer objects are dupes if they have the same SSN.
Jack T. Colton
Ah, well in this case you will need to specify the Equality, as a simple comparison of the pointer is not going to tell you that.
Tim Jarvis
So, you can use the SSN as a key in a dictionary, or for a more complete solution you can implement an IEqualityComparer<T> that you can use in a bunch of LINQ extension methods.
Tim Jarvis
+2  A: 

The most efficient way of doing that, from BOTH PERFORMANCE and CODE points of view, is this:

dict[key] = value

This way the exception you mention will never be thrown, and the key lookup will not happen twice. Note that a duplicate key silently overwrites the value stored earlier.
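A small sketch of what the indexer does with a repeated key, using string keys for brevity (the IndexerDemo helper is only illustrative):

```csharp
using System;
using System.Collections.Generic;

public static class IndexerDemo
{
    // Loads key/value pairs with the indexer; a repeated key replaces the
    // earlier value instead of throwing ArgumentException.
    public static Dictionary<string, string> Load(IEnumerable<KeyValuePair<string, string>> pairs)
    {
        var dict = new Dictionary<string, string>();
        foreach (var pair in pairs)
        {
            dict[pair.Key] = pair.Value; // add-or-overwrite
        }
        return dict;
    }
}
```

So this fits when the last value for a key is the one you want to keep; it will not tell you that duplicates were passed in.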

Alexander
+1 because he didn't say the value was unique for each object.
Will