views:

34

answers:

2

I have a list testList that contains a bunch of strings. I would like to add a new string into the testList only if it doesn't already exist in the list. Therefore, I need to do a case-insensitive search of the list and make it efficient. I can't use Contains because that doesn't take into account the casing. I also don't want to use ToUpper/ToLower for performance reasons. I came across this method, which works:

    if(testList.FindAll(x => x.IndexOf(keyword, 
                       StringComparison.OrdinalIgnoreCase) >= 0).Count > 0)
       Console.WriteLine("Found in list");

This works, but it also matches partial words. If the list contains "goat", I can't add "oat" because it claims that "oat" is already in the list. Is there a way to efficiently search lists in a case insensitive manner, where words have to match exactly? thanks

+4  A: 

Instead of String.IndexOf, use String.Equals to ensure you don't have partial matches. Also don't use FindAll as that goes through every element, use FindIndex (it stops on the first one it hits).

if(testList.FindIndex(x => x.Equals(keyword,  
    StringComparison.OrdinalIgnoreCase) ) != -1) 
    Console.WriteLine("Found in list"); 

Alternately use some LINQ methods (which also stops on the first one it hits)

if( testList.Any( s => s.Equals(keyword, StringComparison.OrdinalIgnoreCase) ) )
    Console.WriteLine("found in list");
Adam Sills
Perfect, thank you! (Have to wait a few mins to accept answer)
Brap
Just to add, in a few quick tests, it seems that the first method is around 50% faster. Maybe someone else can confirm/deny that.
Brap
A: 

You're checking if the result of IndexOf is larger or equal 0, meaning whether the match starts anywhere in the string. Try checking if it's equal to 0:

if (testList.FindAll(x => x.IndexOf(keyword, 
                   StringComparison.OrdinalIgnoreCase) >= 0).Count > 0)
   Console.WriteLine("Found in list");

Now "goat" and "oat" won't match, but "goat" and "goa" will. To avoid this, you can compare the lenghts of the two strings.

To avoid all this complication, you can use a dictionary instead of a list. They key would be the lowercase string, and the value would be the real string. This way, performance isn't hurt because you don't have to use ToLower for each comparison, but you can still use Contains.

Ilya Kogan