views: 236
answers: 10

I have a list of User objects, and I have to remove ONE item from the list with a specific UserID.

This method has to be as fast as possible. Currently I loop through each item and check whether its ID matches the UserID; if not, I add the row to my filteredItems collection.

List<User> allItems = GetItems();
List<User> filteredItems = new List<User>();

for (int x = 0; x < allItems.Count; x++)
{
    if (specialUserID == allItems[x].ID)
        continue;

    filteredItems.Add(allItems[x]);
}
+6  A: 

If it really has to be as fast as possible, use a different data structure. List<T> isn't known for efficiency of deletion. How about a Dictionary<int, User> that maps ID to User?

Ken
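A minimal sketch of that idea; the `User` type and sample data here are stand-ins, not from the question:

```csharp
using System;
using System.Collections.Generic;

class User
{
    public int ID { get; set; }
    public string Name { get; set; }
}

class Program
{
    static void Main()
    {
        // Key the users by ID: removal becomes an O(1) hash lookup
        // instead of an O(n) scan of a list.
        var usersById = new Dictionary<int, User>
        {
            [1] = new User { ID = 1, Name = "Alice" },
            [2] = new User { ID = 2, Name = "Bob" },
            [3] = new User { ID = 3, Name = "Carol" }
        };

        int specialUserID = 2;
        usersById.Remove(specialUserID); // returns false if the key is absent

        Console.WriteLine(usersById.Count); // 2
    }
}
```

The trade-off is that you give up list ordering and pay a little extra memory for the hash table.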
+1  A: 

I like Ken's idea. Sparing that, consider RemoveAll.

lance
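For reference, a small sketch of the RemoveAll route (`User` and the sample IDs are placeholders). RemoveAll mutates the list in place, compacting it in a single O(n) pass rather than shifting elements on every removal:

```csharp
using System;
using System.Collections.Generic;

class User
{
    public int ID { get; set; }
}

class Program
{
    static void Main()
    {
        var allItems = new List<User>
        {
            new User { ID = 1 },
            new User { ID = 2 },
            new User { ID = 3 }
        };

        int specialUserID = 2;

        // Removes every match in one pass; returns the number removed.
        int removed = allItems.RemoveAll(u => u.ID == specialUserID);

        Console.WriteLine(removed);        // 1
        Console.WriteLine(allItems.Count); // 2
    }
}
```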
+1  A: 

I know it's not the fastest, but what about a generic list and Remove()? (MSDN). Does anybody know how it performs compared to, e.g., the example in the question?

Ezombort
Remove() requires that you have the object you want to remove. If the question is to be believed, he has the ID of the user he wants removed, but not the actual object (presumably of type User) that he wants removed from the list.
lance
Yes, a generic list would be faster.
Gary Willoughby
@Ezombort, @Gary: I'm guessing that the OP is already using `List<T>` despite there being no `<T>` in the declaration code in the question. If they were using `ArrayList` or some other non-generic collection then they'd need to cast each element to a specific type before checking its `ID` property, and they're not!
LukeH
@Luke: Ah, that's true. And a for loop is much faster than foreach, isn't it?
Ezombort
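Since Remove() needs the object itself, one way around that is FindIndex plus RemoveAt, which locates the item by its ID in a single pass (a sketch; the `User` type here is a placeholder):

```csharp
using System;
using System.Collections.Generic;

class User
{
    public int ID { get; set; }
}

class Program
{
    static void Main()
    {
        var users = new List<User>
        {
            new User { ID = 1 },
            new User { ID = 2 },
            new User { ID = 3 }
        };

        // FindIndex scans once; RemoveAt then shifts the tail.
        // Find + Remove would scan the list twice instead.
        int i = users.FindIndex(u => u.ID == 2);
        if (i >= 0)
            users.RemoveAt(i);

        Console.WriteLine(users.Count); // 2
    }
}
```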
+2  A: 

Use a hashtable. Lookup time is O(1) for everything, assuming a good hash algorithm with minimal collision potential. I would recommend something that implements IDictionary.

Jeff Tucker
+4  A: 

Well, if you want to create a new collection to leave the original untouched, you have to loop through all the items.

Create the new list with the right capacity from the start, that minimises allocations.

Your program logic with the continue seems a bit backwards... just use the != operator instead of the == operator:

List<User> allItems = GetItems();

List<User> filteredItems = new List<User>(allItems.Count - 1);

foreach (User u in allItems) {
   if(u.ID != specialUserID) {
      filteredItems.Add(u);
   }
}

If you want to change the original collection instead of creating a new, storing the items in a Dictionary<int, User> would be the fastest option. Both locating the item and removing it are close to O(1) operations, so that would make the whole operation close to an O(1) operation instead of an O(n) operation.

Guffa
+1 for recommendation of dictionary and creating a new list instead of removing from existing. clever.
Steve
+1  A: 

Here's a thought: how about you don't remove it, per se? What I mean is something like this:

public static IEnumerable<T> LoopWithExclusion<T>(this IEnumerable<T> list, Func<T,bool> excludePredicate)
{
   foreach(var item in list)
   {
      if(excludePredicate(item))
      {
         continue;
      }

      yield return item;
   }
}

The point being, whenever you need a "filtered" list, just call this extension method, which loops through the original list, returns all of the items, EXCEPT the ones you don't want.

Something like this:

List<User> users = GetUsers();

//later in the code when you need the filtered list:

foreach(var user in users.LoopWithExclusion(u => u.Id == myIdToExclude))
{
   //do what you gotta do
}
BFree
+1  A: 

Assuming the count of the list is even, I would:

(a) get the number of processors;

(b) divide the list into equal chunks, one per processor;

(c) spawn a thread for each processor over its chunk, with a shared boolean flag set once the predicate matches so the remaining threads can terminate.

Darknight
This is exactly what Parallel.For does in .NET 4.0. Using that would save you a lot of hassle.
Mez
Mez, thanks didn't know about that!
David
All very well **if** you have .net 4.0!
Darknight
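The chunk-per-processor scheme above is roughly what PLINQ does for you (also .NET 4.0+). A hedged sketch, with `User` again a placeholder type:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class User
{
    public int ID { get; set; }
}

class Program
{
    static void Main()
    {
        var allItems = Enumerable.Range(0, 1000000)
                                 .Select(i => new User { ID = i })
                                 .ToList();

        int specialUserID = 9999;

        // AsParallel partitions the list across the available cores,
        // much like the manual chunk-per-processor plan above.
        // AsOrdered preserves the original order at some extra cost.
        var filtered = allItems.AsParallel()
                               .AsOrdered()
                               .Where(u => u.ID != specialUserID)
                               .ToList();

        Console.WriteLine(filtered.Count); // 999999
    }
}
```

For a single-item filter the parallel overhead may well outweigh the gain; it pays off when the predicate is expensive.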
A: 

Here is some code that is efficient if you have hundreds or thousands of items:

List<User> allItems = GetItems();
List<User> filteredItems = new List<User>();

//Choose the correct unrolled loop based on the list size
int count = allItems.Count;

if ((count % 5) == 0 && (count >= 5))
{
     for (int x = 0; x < count; x += 5)
     {
         if (specialUserID != allItems[x].ID)
             filteredItems.Add(allItems[x]);
         if (specialUserID != allItems[x + 1].ID)
             filteredItems.Add(allItems[x + 1]);
         if (specialUserID != allItems[x + 2].ID)
             filteredItems.Add(allItems[x + 2]);
         if (specialUserID != allItems[x + 3].ID)
             filteredItems.Add(allItems[x + 3]);
         if (specialUserID != allItems[x + 4].ID)
             filteredItems.Add(allItems[x + 4]);
      }
 }

Start by testing whether the size of the list is divisible by the largest unroll factor, then work down to the smallest. If you want 10 if statements in the loop, first test whether the list has at least ten items and is divisible by ten, then go down from there. For example, with 99 items you can use 9 if statements in the loop: it will iterate 11 times instead of 99.

"if" statements are cheap and fast.

Can you explain this one some more?
David
A: 

If you must transfer from one list to another, here is the fastest approach I've found:

var filtered = new List<SomeClass>(allItems);
for (int i = 0; i < filtered.Count; i++)
    if (filtered[i].id == 9999)
        filtered.RemoveAt(i);

I tried comparing your method, the method above, and a LINQ Where statement:

var allItems = new List<SomeClass>();
for (int i = 0; i < 10000000; i++)
    allItems.Add(new SomeClass() { id = i });

Console.WriteLine("Tests Started");
var timer = new Stopwatch();

timer.Start();
var filtered = new List<SomeClass>();
foreach (var item in allItems)
    if (item.id != 9999)
        filtered.Add(item);
var y = filtered.Last();
timer.Stop();
Console.WriteLine("Transfer to filtered list: {0}", timer.Elapsed.TotalMilliseconds);

timer.Reset();
timer.Start();
filtered = new List<SomeClass>(allItems);
for (int i = 0; i < filtered.Count; i++)
    if (filtered[i].id == 9999)
        filtered.RemoveAt(i);
var s = filtered.Last();
timer.Stop();
Console.WriteLine("Removal from filtered list: {0}", timer.Elapsed.TotalMilliseconds);

timer.Reset();
timer.Start();
var linqresults = allItems.Where(x => (x.id != 9999));
var m = linqresults.Last();
timer.Stop();
Console.WriteLine("linq list: {0}", timer.Elapsed.TotalMilliseconds);

The results were as follows:

Tests Started
Transfer to filtered list: 610.5473
Removal from filtered list: 207.5675
linq list: 379.4382

Copy-constructing the filtered list from the existing one and then using RemoveAt was a good deal faster.

Also, subsequent RemoveAt calls are pretty cheap.

DataDink
A: 
public static void RemoveSingle<T>(this List<T> items, Predicate<T> match)
{
    int i = items.FindIndex(match);
    if (i >= 0)
    {
        // Overwrite the match with the last element, then drop the last:
        // O(1) instead of the O(n) shift RemoveAt(i) would cost,
        // at the price of not preserving the list's order.
        items[i] = items[items.Count - 1];
        items.RemoveAt(items.Count - 1);
    }
}
Hasan Khan