I have a class Agent with a property Id.
Given a collection of Agents, I need to check whether any of them have duplicate Ids.
I am currently doing this with a hash table but am trying to get LINQ-ified. What's a good way of doing this?
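// Prints the ID once for every agent that shares it, so a duplicated ID is reported more than once.
foreach (var agent in Agents)
{
    if (Agents.Count(a => a.ID == agent.ID) > 1)
        Console.WriteLine("Found: {0}", agent.ID);
}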
bool b = list.Any(i => list.Any(j => j.ID == i.ID && j != i));
That's a bit of a brute-force approach (every item is compared against every other item, so it's O(n²)), but it works. There might be a smarter way to do it using the Except() extension method.
Edit: You didn't actually say that you needed to know which items are "duplicated", only that you needed to know whether any were. This'll do the same thing, except it gives you a sequence you can iterate over:
list.Where(i => list.Any(j => j.ID == i.ID && j != i))
I like the grouping approach too (group by ID and find the groups with count > 1).
Similar to Y Low's approach (edited):
var duplicates = agents.GroupBy(a => a.ID).Where(g => g.Count() > 1);
foreach (var duplicateGroup in duplicates)
{
    // Each element is a group of agents sharing one ID; Key is that ID.
    Console.WriteLine(duplicateGroup.Key);
}
My take (no counting!):
var duplicates = agents
    .GroupBy(a => a.ID)
    .Where(g => g.Skip(1).Any());
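To show how that query might be wrapped up for reuse, here is a minimal sketch. The `Agent` class is a stand-in for the one in the question (assuming an `int` ID), and `DuplicateFinder` / `DuplicateIds` are names made up for illustration. `Skip(1).Any()` just asks "is there a second element in this group?"; whether that is measurably cheaper than `Count() > 1` depends on the LINQ provider, so treat it as a style choice rather than a guaranteed speed-up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Agent class from the question (ID type assumed to be int).
public class Agent
{
    public int ID { get; set; }
}

// Hypothetical helper; the name is for illustration only.
public static class DuplicateFinder
{
    // Returns each ID that is shared by two or more agents.
    public static List<int> DuplicateIds(IEnumerable<Agent> agents)
    {
        return agents
            .GroupBy(a => a.ID)             // one group per distinct ID
            .Where(g => g.Skip(1).Any())    // keep groups that have a second element
            .Select(g => g.Key)             // project each group to its ID
            .ToList();
    }
}
```

Calling `DuplicateFinder.DuplicateIds(agents).Any()` then answers the original yes/no question, and the returned list tells you which IDs are affected.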
For what it's worth, I just compared the two methods we've struck upon in this thread. First I defined a helper class:
public class Foo
{
    public int ID;
}
... and then made a big list of instances with a random ID:
var list = new List<Foo>();
var r = new Random();
for (int i = 0; i < 10000; i++) list.Add(new Foo { ID = r.Next() });
... and lastly, timed the code:
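var sw = new Stopwatch(); // Stopwatch lives in System.Diagnostics
sw.Start();
// Method 1: brute-force "nested Any" comparison.
bool b = list.Any(i => list.Where(j => i != j).Any(j => j.ID == i.ID));
Console.WriteLine(b);
Console.WriteLine(sw.ElapsedTicks);
sw.Reset();
sw.Start();
// Method 2: group by ID and compare the number of groups to the number of items.
b = (list.GroupBy(i => i.ID).Count() != list.Count);
Console.WriteLine(b);
Console.WriteLine(sw.ElapsedTicks);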
Here's one output:
False
59392129
False
168151
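So I think it's safe to say that grouping and then comparing the count of groups to the count of items is way, way faster than doing a brute-force "nested Any" comparison: roughly 350 times faster in this run (168,151 ticks versus 59,392,129). Note that both results were False, i.e. this list had no duplicates, which is the worst case for the nested Any because it can never stop early.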
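For comparison, the hash-table approach mentioned in the question can also be written with LINQ. This is only a sketch and was not part of the timing above; `DuplicateCheck` and `HasDuplicates` are made-up names. It relies on `HashSet<T>.Add` returning false when the value is already in the set, so `All` stops at the first repeated key, whereas `GroupBy` has to consume the whole sequence before it yields anything.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; the names are for illustration only.
public static class DuplicateCheck
{
    // Returns true if at least two items produce the same key.
    public static bool HasDuplicates<T, TKey>(IEnumerable<T> items, Func<T, TKey> keySelector)
    {
        var seen = new HashSet<TKey>();
        // Add returns false for a key that is already present,
        // so All short-circuits on the first duplicate.
        return !items.All(item => seen.Add(keySelector(item)));
    }
}
```

With the `Foo` list from the test above, this would be called as `DuplicateCheck.HasDuplicates(list, f => f.ID)`.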