views:

195

answers:

4

How can I remove duplicates from a StringCollection in C#? I am looking for an efficient approach. The StringCollection is returned from an API.

+8  A: 

Just use a HashSet<string> as your collection, rather than StringCollection. It is designed to reject duplicate elements, which it detects by looking items up via their hash codes and then checking equality, so each insertion is very cheap.

Edit: Since it seems you're handed a StringCollection in the first place, the solution is to loop over all the items in the StringCollection and add them to a HashSet<string>, thereby eliminating duplicates. The Enumerable.Distinct extension method would also do the job, and since it tracks the items it has already seen in a hash-based set internally, it should perform comparably. Something like this:

var noDuplicatesItems = stringCollection.Cast<string>().Distinct().ToArray();
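
For comparison, the loop-and-add approach described above could look roughly like this. It is only a minimal sketch: the class name `StringCollectionDedup` and the method `ToSet` are made up for illustration and are not part of any library.

```csharp
using System.Collections.Generic;
using System.Collections.Specialized;

// Hypothetical helper; the names here are illustrative only.
public static class StringCollectionDedup
{
    // Copies every item into a HashSet<string>. Add returns false and
    // leaves the set unchanged when the value is already present,
    // so duplicates simply drop out.
    public static HashSet<string> ToSet(StringCollection source)
    {
        var set = new HashSet<string>();
        foreach (string item in source)
        {
            set.Add(item);
        }
        return set;
    }
}
```

Note that a HashSet<string> does not promise any particular ordering, so if the original order matters, the Distinct() version is probably the safer choice.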
Noldorin
The OP did say the StringCollection was coming from an API.
Rowland Shaw
@Rowland: Thanks, but really worth the down-vote? My answer still largely applies.
Noldorin
Good solution but... he could use it only with the framework 3.5 or above.
Claudio Redi
@Claudio: Yes, true... however with .NET 4.0 out now, it is likely most people are at least using .NET 3.5 for new programs. It wouldn't be too hard to implement your own `HashSet<T>` also.
Noldorin
@Noldorin `StringCollection` doesn't appear to support `IEnumerable<string>`, so `Enumerable.Distinct()` wouldn't be available?
Rowland Shaw
@Rowland: It supports `IEnumerable`. So `Cast<string>().Distinct()` will work.
Jeff Yates
@Rowland: See my response on the other post.
Noldorin
@Jeff that cast wasn't there when I wrote my comment...
Rowland Shaw
I have used `HashSet<string> tt = new HashSet<string>(possibleEntities.Cast<string>().ToList());` How should I convert back to `StringCollection`?
Taz
@Rowland: It wasn't there when I wrote mine either. :)
Jeff Yates
@Taz: Do you really need to convert back to `StringCollection`? If so, simply initialise a new instance and call `newStringCollection.AddRange(tt)`.
Noldorin
@Noldorin Thanks. Actually I have already written some code and then realized I need to remove duplicates. So that is why I have to convert back.
Taz
@Noldorin Actually it works like this `newStringCollection.AddRange(tt.Cast<string>().ToArray());`
Taz
@Taz: Ah, glad it's solved. You're right that `AddRange` only accepts a `string[]`, so the `ToArray()` call has to stay. The `Cast<string>()` is the redundant part (`tt` is already a `HashSet<string>`), but whatever does the job.
Noldorin
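
To summarise the round trip discussed in the comments above: `StringCollection.AddRange` takes a `string[]`, so the set has to be turned into an array before it is handed back. A minimal sketch follows, assuming LINQ is available (.NET 3.5 or later); the class name `StringCollectionRoundTrip` and the method `WithoutDuplicates` are hypothetical names chosen for this example.

```csharp
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Linq;

// Hypothetical helper; the names here are illustrative only.
public static class StringCollectionRoundTrip
{
    public static StringCollection WithoutDuplicates(StringCollection source)
    {
        // Cast<string>() is needed once, because StringCollection only
        // implements the non-generic IEnumerable.
        var unique = new HashSet<string>(source.Cast<string>());

        var result = new StringCollection();
        // AddRange expects a string[], so materialise the set as an array.
        result.AddRange(unique.ToArray());
        return result;
    }
}
```

The intermediate `ToList()` from the earlier comment is not needed, since the HashSet<string> constructor accepts any IEnumerable<string>.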
A: 

Using LINQ: myCollection.Cast<string>().Distinct().ToList(); or you can use a HashSet as Noldorin proposed.

PierrOz
No, because it's not generic. A simple `Cast<string>` will convert it into a generic `IEnumerable<string>` however, then you can use whatever you wish.
Noldorin
Yes you're right, I have edited my answer !! thx
PierrOz
+1  A: 
    StringCollection s = new StringCollection();
    s.Add("s");
    s.Add("s");
    s.Add("t");

    var uniques = s.Cast<IEnumerable>();
    var unique = uniques.Distinct();

    foreach (var x in unique)
    {
        Console.WriteLine(x);
    }

    Console.WriteLine("Done");
    Console.Read();

Not tested for efficiency.
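
For anyone wondering why `Cast<IEnumerable>()` works at all: `string` implements the non-generic `IEnumerable` (it enumerates its characters), so the cast succeeds, and `Distinct()` then falls back to the virtual `Equals`/`GetHashCode` that `string` overrides. The sketch below puts that variant next to the more conventional `Cast<string>()`. The class and method names are hypothetical, and it assumes .NET 4.0 or later for the `Cast<object>()` step.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Linq;

// Hypothetical helper; the names here are illustrative only.
public static class CastComparison
{
    // Treats each string as a non-generic IEnumerable. Distinct still
    // compares by string value, because Equals and GetHashCode are virtual.
    public static List<object> ViaEnumerableCast(StringCollection s)
    {
        return s.Cast<IEnumerable>().Distinct().Cast<object>().ToList();
    }

    // The strongly typed variant: the result is a List<string>.
    public static List<string> ViaStringCast(StringCollection s)
    {
        return s.Cast<string>().Distinct().ToList();
    }
}
```

Both variants should yield the same unique values; the second simply keeps the static type as `string`, which avoids casting again later.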

RandomNoob
Does the job, though I'm not a fan of the lack of generics here.
Noldorin
Surely that should be `var uniques = s.Cast<string>();`?
Rowland Shaw
Try compiling it and running it.
RandomNoob
+1  A: 

If you're on v3.5 of the Framework (or later), you can first convert to an IEnumerable<string> and then call the Distinct() method on that, i.e.:

// where foo is your System.Collections.Specialized.StringCollection
IEnumerable<string> distinctList = foo.OfType<string>().Distinct();
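
One difference from the `Cast<string>()` approach is worth knowing: `OfType<string>()` skips null entries, whereas `Cast<string>()` keeps them, and StringCollection does accept null values. A small sketch of the difference, with hypothetical class and method names:

```csharp
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Linq;

// Hypothetical helper; the names here are illustrative only.
public static class OfTypeVersusCast
{
    // OfType<string>() filters by type, and null is never "a string",
    // so null entries are dropped before Distinct() runs.
    public static List<string> DistinctViaOfType(StringCollection source)
    {
        return source.OfType<string>().Distinct().ToList();
    }

    // Cast<string>() passes null through, so Distinct() keeps a single null.
    public static List<string> DistinctViaCast(StringCollection source)
    {
        return source.Cast<string>().Distinct().ToList();
    }
}
```

Which behaviour is preferable depends on whether null entries from the API are meaningful in your case.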
Rowland Shaw