views:

85

answers:

4

Disclaimer: Maybe be micro-YAGNI-optimizing but hear me out ..

The problem is to implement a case-insensitive lookup table.

  • My old-skool way: While populating the dictionary, upper-case the key before inserting. Upper-case the key when someone gives you a key to lookup.
  • The new way (I learned about it today): Dictionary takes in an IComparer implementation, so I could pass in StringComparer.InvariantCultureIgnoreCase. I think it would delegate to String.Compare(x, y, SomeIgnoreCaseEnum)

The new way has an advantage in that I don't need to ensure that a .ToUpper() is performed at each of the n places where a lookup is done against the dictionary .

My question is which one is more efficient ? Just curious I guess...

Update: Note I don't need to know the original key that was inserted. Also the keys that are used are culture agnostic.

+1  A: 

Check this entry. It's still valid today.

Excerpt: From MSDN's "New Recommendations for Using Strings in Microsoft .NET 2.0"

ntziolis
The 4th and 5th DO on that page confuse me - what are they recommending ? What does 'linguistically relevant' mean ?
Gishu
They explain it in detail a little further down: http://msdn.microsoft.com/en-us/library/ms973919.aspx#stringsinnet20_topic5
ntziolis
+1  A: 

It's possible that upper-casing would be more efficient, as it could then do an ordinal comparison... but I very much doubt that this is a performance bottleneck for you. As ever, profile before you commit to a code change on the basis of performance.

Ultimately, specifying the string comparison:

  • Means you don't need to be as careful about how you use the dictionary
  • Means the original casing is maintained in the key, which can help diagnostically in some cases
  • Explicitly says up-front how you want the keys to be handled. You end up stating it just once, at the point of creation - which leads to clearer code IMO.
Jon Skeet
yup. Not worth the hassle of doing ToUpper() all over the place.
Gishu
+1  A: 

I don't know about the performance but I prefer the StringComparer option. With ToUpper you lose information (the original casing). True you might not need it, but one day you might and it doesn't feel like any more work to keep it (thus safe from the YAGNI principle).

I would also one day forget to call ToUpper and be in a world of hurt. But my unit tests would save me of course.

Mike Two
exactly. I have unit tests - so forgetting it in one place isn't gonna hurt for too long. Only till the next test run.
Gishu
A: 

I'd have accepted Oded's 'Go profile!' comment :) which was a Master-of-the-obvious suggestion.

From my test (source code here)

For 1M ContainsKey lookups,

  • ToUpper approach : 1123 msecs
  • Dictionary(OrdinalIgnoreCaseComparer) : 971 msecs

I find Dictionary with an injected Comparer option to perform better and less hassle. So no more ToUpper() for me.

Gishu