I see that over on this question http://stackoverflow.com/questions/312024/linqy-way-to-check-if-any-objects-in-a-collection-have-the-same-property-value there is a request for a way to use LINQ to check whether any objects in a collection share the same property value. However, is this the fastest reasonable way to do it? I will be deploying something that requires a certain amount of resource management, and I want the application to be as responsive as possible, without making the code terribly hard to decipher when someone else, or I, come back to it later.
LINQ is almost never the fastest way (in terms of raw execution time) to do anything.
It's usually "fast enough," though. When you have a working app with unit tests, you can profile it to see if you need to optimize.
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." -Donald Knuth
Indeed, LINQ will work fine. Of course, if you know you can optimise the situation in a given case, you can always write your own LINQ extension method for the more specific type. Since the type is more specific, your own method should be used in preference to the default Enumerable one. Which is nice ;-p
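As a rough sketch of that idea (IntRange and IntRangeExtensions are hypothetical names made up for illustration, not anything from the question), here is a collection type for which Contains can be answered without enumerating. Because the extension method's parameter is the concrete IntRange rather than IEnumerable<int>, overload resolution should pick it over Enumerable.Contains whenever the static type of the variable is IntRange:

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection type: a contiguous run of integers.
public sealed class IntRange : IEnumerable<int>
{
    public int Start { get; }
    public int Length { get; }

    public IntRange(int start, int length)
    {
        Start = start;
        Length = length;
    }

    public IEnumerator<int> GetEnumerator()
    {
        for (int i = 0; i < Length; i++)
            yield return Start + i;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class IntRangeExtensions
{
    // Same name as Enumerable.Contains, but declared on the more specific
    // type, so it can answer with arithmetic instead of walking the sequence.
    public static bool Contains(this IntRange range, int value)
    {
        // Widen to long so the subtraction cannot overflow.
        long offset = (long)value - range.Start;
        return offset >= 0 && offset < range.Length;
    }
}
```

Note that the specialised method is only chosen when the compiler can see the specific type; if the same object is held in an IEnumerable<int> variable, the ordinary Enumerable.Contains is used instead.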
However, is this the fastest reasonable process by which to do this?
I'd guess that a fast way (perhaps the fastest way) to do it might be to add all the objects into a Dictionary or a HashSet, using the property as the key: HashSet<T>.Add returns a bool that tells you whether the value was actually added (false means this property value has already been seen). For example:
static bool containsDuplicate(IEnumerable<Foo> fooCollection)
{
    //create the hash set
    HashSet<Bar> hashSet = new HashSet<Bar>();
    //for each object to be tested
    foreach (Foo foo in fooCollection)
    {
        //get the interesting object property
        Bar propertyValue = foo.bar;
        //see whether we've already seen this property value
        //(Add returns false if the value was already in the set)
        if (!hashSet.Add(propertyValue))
        {
            //duplicate detected
            return true;
        }
    }
    //no duplicate detected
    return false;
}
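The same technique can be written once for any item type by passing in a function that extracts the property. This is only a sketch: the names DuplicateExtensions, ContainsDuplicate and keySelector are invented here, and it assumes the property type has sensible Equals/GetHashCode behaviour (the default comparer is used).

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateExtensions
{
    // Returns true as soon as two items yield the same key.
    // keySelector picks out the property to compare on.
    public static bool ContainsDuplicate<TItem, TKey>(
        this IEnumerable<TItem> items, Func<TItem, TKey> keySelector)
    {
        var seen = new HashSet<TKey>();
        foreach (TItem item in items)
        {
            // HashSet<T>.Add returns false when the key is already present.
            if (!seen.Add(keySelector(item)))
                return true;
        }
        return false;
    }
}
```

A call would then look like fooCollection.ContainsDuplicate(foo => foo.bar). It stops at the first repeat, so in the best case it touches only a few items; in the worst case it makes one pass and holds one key per item in memory.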
It really depends on the amount of data in your collection and how often this operation is performed. A LINQ search by property has to read through the items in the collection one by one, so its cost grows with the size of the collection.
If your collection will only ever have 10 items in it and this operation is only done once a second, then a forward-only scan to find an item by a property is pretty likely to be fast enough.
If you have 10 million items in your collection, or need to perform this kind of operation 100 times a second, then a forward-only scan will probably be too slow and you will need some kind of index on this property.
If it turns out you need to index this, I would suggest encapsulating the logic in an object. So, for example, adding an item would add it to the main collection and also add its property value to an index, such as a hash set of sorts.
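One possible shape for such an object is sketched below. IndexedCollection and its members are hypothetical names, and the sketch makes a few assumptions worth stating: keys are never null, an item's property does not change while the item is in the collection, and a Dictionary of counts is used rather than a plain HashSet so that removing one of several items with the same key leaves the index correct.

```csharp
using System;
using System.Collections.Generic;

public sealed class IndexedCollection<TItem, TKey>
{
    private readonly List<TItem> items = new List<TItem>();
    // Index: property value -> number of items currently holding it.
    private readonly Dictionary<TKey, int> keyCounts = new Dictionary<TKey, int>();
    private readonly Func<TItem, TKey> keySelector;

    public IndexedCollection(Func<TItem, TKey> keySelector)
    {
        this.keySelector = keySelector;
    }

    public int Count => items.Count;

    // Adds to the main collection and updates the index in one place.
    public void Add(TItem item)
    {
        TKey key = keySelector(item);
        keyCounts.TryGetValue(key, out int count); // count stays 0 if key is new
        keyCounts[key] = count + 1;
        items.Add(item);
    }

    public bool Remove(TItem item)
    {
        if (!items.Remove(item))
            return false;

        TKey key = keySelector(item);
        int count = keyCounts[key];
        if (count == 1)
            keyCounts.Remove(key);
        else
            keyCounts[key] = count - 1;
        return true;
    }

    // Hash lookup instead of a scan over every item.
    public bool ContainsKey(TKey key) => keyCounts.ContainsKey(key);
}
```

The lookup is then a single hash probe regardless of collection size, at the cost of extra memory and slightly slower adds. Remove here still scans the underlying list; if removals are frequent, a different main container would be worth considering.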