views:

568

answers:

5

Hi all,

I'm trying to write an extension method in .NET that will operate on a generic collection, and remove all items from the collection that match a given criteria.

This was my first attempt:

public static void RemoveWhere<T>(this ICollection<T> Coll, Func<T, bool> Criteria){
    foreach (T obj in Coll.Where(Criteria))
        Coll.Remove(obj);
}

However this throws an InvalidOperationException, "Collection was modified; enumeration operation may not execute". Which does make sense, so I made a second attempt with a second collection variable to hold the items that need to be removed and iterate through that instead:

public static void RemoveWhere<T>(this ICollection<T> Coll, Func<T, bool> Criteria){
    List<T> forRemoval = Coll.Where(Criteria).ToList();

    foreach (T obj in forRemoval)
        Coll.Remove(obj);
}

This throws the same exception; I'm not sure I really understand why as 'Coll' is no longer the collection being iterated over, so why can't it be modified?

If anyone has any suggestions as to how I can get this to work, or a better way to achieve the same, that'd be great.

Thanks.

+8  A: 

For List<T>, this exists already, as RemoveAll(Predicate<T>). As such, I'd suggest that you keep the name (allowing familiarity, and precedence).

Basically, you can't remove while iterating. There are two common options:

  • use indexer based iteration (for) and removal
  • buffer the items to remove, and remove after the foreach (as you've already done)

So perhaps:

public static void RemoveAll<T>(this IList<T> list, Func<T, bool> predicate) {
    for (int i = 0; i < list.Count; i++) {
        if (predicate(list[i])) {
            list.RemoveAt(i--);
        }
    }
}

This approach (untested, note) has the advantage of avoiding lots of extra copies of the list.

Marc Gravell
Nice solution, thanks!
Lee D
When I remove from a collection using your first common option I usually store Count in a int n variable and let the loop count back back from the end to the beginning, not needing the i--. Saves a Count() call inside the loop. (Premature optimalisation, I know...)
peSHIr
+1  A: 

I just tested it, and your second method works fine (as it should). Something else must be going wrong, can you provide a bit of sample code that shows the problem?

List<int> ints = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

ints.RemoveWhere(i => i > 5);
foreach (int i in ints)
{
    Console.WriteLine(i);
}

Gets:

1
2
3
4
5
Simon Steele
+2  A: 

As Marc said, List<T>.RemoveAll() is the way to go for lists.

I'm surprised your second version didn't work though, given that you've got the call to ToList() after the Where() call. Without the ToList() call it would certainly make sense (because it would be evaluated lazily), but it should be okay as it is. Could you show a short but complete example of this failing?

EDIT: Regarding your comment in the question, I still can't get it to fail. Here's a short but complete example which works:

using System;
using System.Collections.Generic;
using System.Linq;

public class Staff
{
    public int StaffId;
}

public static class Extensions
{
    public static void RemoveWhere<T>(this ICollection<T> Coll,
                                      Func<T, bool> Criteria)
    {
        List<T> forRemoval = Coll.Where(Criteria).ToList();

        foreach (T obj in forRemoval)
        {
            Coll.Remove(obj);
        }
    }
}

class Test
{
    static void Main(string[] args)
    {
        List<Staff> mockStaff = new List<Staff>
        {
            new Staff { StaffId = 3 },
            new Staff { StaffId = 7 }
        };

       Staff newStaff = new Staff{StaffId = 5};
       mockStaff.Add(newStaff);
       mockStaff.RemoveWhere(s => s.StaffId == 5);

       Console.WriteLine(mockStaff.Count);
    }
}

If you could provide a similar complete example which fails, I'm sure we can work out the reason.

Jon Skeet
+1  A: 

Hi,

I just tried your second example and it seems to work fine:

Collection<int> col = new Collection<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
col.RemoveWhere(x => x % 2 != 0);

foreach (var x in col)
    Console.WriteLine(x);
Console.ReadLine();

I didn't get an exception.

Dustin Campbell
A: 

Another version of Marcs RemoveAll:

public static void RemoveAll<T>(this IList<T> list, Func<T, bool> predicate)
{
    int count = list.Count;
    for (int i = count-1; i > -1; i--)
    {
        if (predicate(list[i]))
        {
            list.RemoveAt(i);
        }
    }
}
Bengt
Right. Like I said in comments. Looks like Bengt beat me to even that suggestion. ;-)
peSHIr
A) You shouldn't store the Count outside, the compiler cannot remove the bounds check if you do that.
Samuel
B) This overload is almost useless, you would have to specifically save your predicate as Func<T, bool> before passing it in otherwise it will satisfy the original RemoveAll<T>(Predicate<T>) and not call your extension.
Samuel
A) I know, but can the compiler optimize it even when you remove items in the loop and depend on the count changing?
Bengt
More importantly, since it's an interface you have no idea how the Count is implemented, perhaps it's a very expensive call to some database, so I wouldn't trust the compiler to optimize this for me. I agree with B though.
Bengt