If I want to have a case-insensitive string-keyed dictionary, which version of StringComparer should I use given these constraints:

  • The keys in the dictionary come from either C# code or config files written in English locales only (either US or UK)
  • The software is internationalized and will run in different locales

I normally use StringComparer.InvariantCultureIgnoreCase but wasn't sure whether that is the correct choice. Here is example code:

Dictionary<string, object> stuff = new Dictionary<string, object>(StringComparer.InvariantCultureIgnoreCase);
+1  A: 

Since the keys are known, fixed values, either InvariantCultureIgnoreCase or OrdinalIgnoreCase should work. Avoid the culture-specific comparers, or you could hit some of the more "fun" things like the "Turkish i" problem. Obviously, you'd use a culture-aware comparer if you were comparing culture-specific values... but it sounds like you aren't.
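As a rough sketch of that advice (the key name "file", the stored value, and the choice of the tr-TR culture are illustrative assumptions, not part of the question), an ordinal case-insensitive dictionary finds a key regardless of casing, while a comparer built for Turkish can treat "file" and "FILE" as different keys:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class TurkishIDemo
{
    // Looks up a key with different casing using the ordinal,
    // culture-independent, case-insensitive comparer.
    public static bool OrdinalLookup()
    {
        var stuff = new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase);
        stuff["file"] = 42; // hypothetical key and value
        return stuff.ContainsKey("FILE");
    }

    // Same lookup, but with a case-insensitive comparer created for a
    // specific culture (assumes that culture's data is installed).
    public static bool CultureLookup(string cultureName)
    {
        var comparer = StringComparer.Create(new CultureInfo(cultureName), ignoreCase: true);
        var stuff = new Dictionary<string, object>(comparer);
        stuff["file"] = 42;
        return stuff.ContainsKey("FILE");
    }

    public static void Main()
    {
        Console.WriteLine(OrdinalLookup());        // True
        // Typically False where Turkish culture data is available,
        // because Turkish pairs "i" with "İ" and "ı" with "I".
        Console.WriteLine(CultureLookup("tr-TR"));
    }
}
```

The second result depends on the globalization data available at runtime, so treat it as an illustration of the risk rather than a guaranteed output.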

Marc Gravell
A: 

The concept of "case insensitive" is a linguistic one, and so it doesn't make sense without a culture.

See this blog for more information.

That said, if you are just talking about strings using the Latin alphabet, then you will probably get away with the InvariantCulture.

It is probably best to create the dictionary with StringComparer.CurrentCultureIgnoreCase, though. Depending on the platform's culture data, this can allow "ß" to match "ss" in your dictionary under a German culture, for example.

Oliver Hallam
A: 

StringComparer.OrdinalIgnoreCase is slightly faster than InvariantCultureIgnoreCase, FWIW ("An ordinal comparison is fast, but culture-insensitive," according to MSDN).

You'd have to be doing a lot of comparisons to notice the difference of course.

Joe
A: 

The Invariant Culture exists specifically to deal with strings that are internal to the program and have nothing to do with user data or UI. It sounds like this is the case for this situation.

Michael Burr
+5  A: 

There are three kinds of comparers:

  • Culture-aware
  • Culture invariant
  • Ordinal

Each comparer has a case-sensitive as well as a case-insensitive version.

An ordinal comparer uses the ordinal (numeric) values of characters. This is the fastest comparer and should be used for internal purposes.

A culture-aware comparer considers aspects that are specific to the culture of the current thread. It knows about problems such as the "Turkish i" and the "Spanish LL". It should be used for UI strings.

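The culture-invariant comparer still performs a linguistic comparison, just under a fixed set of rules that is not tied to any real culture. It is therefore neither as predictable as an ordinal comparison nor linguistically correct for any particular user, and it should be avoided in most cases.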
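A minimal sketch of the three kinds, each in its case-sensitive or case-insensitive form, using the static properties on StringComparer (the helper `Same` and the sample strings are assumptions made for illustration):

```csharp
using System;

public static class ComparerKinds
{
    // Hypothetical helper: reports whether a comparer treats two strings as equal.
    public static bool Same(StringComparer comparer, string a, string b)
    {
        return comparer.Equals(a, b);
    }

    public static void Main()
    {
        // Ordinal: compares the numeric values of the characters.
        Console.WriteLine(Same(StringComparer.Ordinal, "key", "KEY"));           // False
        Console.WriteLine(Same(StringComparer.OrdinalIgnoreCase, "key", "KEY")); // True

        // Culture-aware: follows the culture of the current thread.
        Console.WriteLine(Same(StringComparer.CurrentCultureIgnoreCase, "key", "KEY"));

        // Culture-invariant: linguistic rules that do not depend on the thread's culture.
        Console.WriteLine(Same(StringComparer.InvariantCultureIgnoreCase, "key", "KEY")); // True
    }
}
```

For a dictionary keyed by internal identifiers, any of these can be passed to the Dictionary constructor; the sketch only shows how the comparers are obtained and called.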

References

  1. New Recommendations for Using Strings in Microsoft .NET 2.0
Michael Damatov
+3  A: 

This MSDN article covers everything you could possibly want to know in great depth, including the Turkish-I problem.

It's been a while since I read it, so I'm off to do so again. See you in an hour!

Greg Beech
A: 

System.Collections.Specialized includes StringDictionary. The Remarks section of its MSDN documentation states:

"A key cannot be null, but a value can.

The key is handled in a case-insensitive manner; it is translated to lowercase before it is used with the string dictionary.

In .NET Framework version 1.0, this class uses culture-sensitive string comparisons. However, in .NET Framework version 1.1 and later, this class uses CultureInfo.InvariantCulture when comparing strings. For more information about how culture affects comparisons and sorting, see Comparing and Sorting Data for a Specific Culture and Performing Culture-Insensitive String Operations."
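A small sketch of the behavior described in those remarks (the key and value are made up for illustration): keys are lowercased on the way in, so lookups succeed with any casing, and enumerating the keys shows the lowercased form.

```csharp
using System;
using System.Collections.Specialized;

public static class StringDictionaryDemo
{
    // Builds a dictionary with one hypothetical entry.
    public static StringDictionary Build()
    {
        var settings = new StringDictionary();
        settings.Add("ConnectionString", "server=example");
        return settings;
    }

    public static void Main()
    {
        StringDictionary settings = Build();

        Console.WriteLine(settings["CONNECTIONSTRING"]);             // server=example
        Console.WriteLine(settings.ContainsKey("connectionstring")); // True

        // The stored key is the lowercased form of what was added.
        foreach (string key in settings.Keys)
            Console.WriteLine(key);                                  // connectionstring
    }
}
```

Note that StringDictionary maps strings to strings only, so it does not directly replace the Dictionary<string, object> from the question.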

GregUzelac