Hi,
I'm writing a C# application that will process some text and provide basic query functions. To give the best possible support for other languages, I let users of the application specify the System.Globalization.CultureInfo (via an "en-GB" style code) and also the full range of collation options via the System.Globalization.CompareOptions flags enum.
For regular string comparison I'm then using a combination of:
a) The String.Compare overload that accepts the culture and options.
b) For some bulk processes, caching the byte data (KeyData) from CompareInfo.GetSortKey (the overload that accepts the options) and doing a byte-by-byte comparison of the KeyData.
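To make (b) concrete, this is roughly the shape of it. It is a simplified sketch rather than my actual code, and the class and method names (SortKeyBytes, GetKeyData) are made up for this post:

```csharp
using System;
using System.Globalization;

// Illustrative helper; the names here are hypothetical, not framework types.
public static class SortKeyBytes
{
    // Build the sort key bytes once so they can be cached and reused.
    public static byte[] GetKeyData(string text, CultureInfo culture, CompareOptions options)
    {
        return culture.CompareInfo.GetSortKey(text, options).KeyData;
    }

    // Byte-by-byte comparison: the first differing byte decides;
    // if one array is a prefix of the other, the shorter one sorts first.
    public static int Compare(byte[] a, byte[] b)
    {
        int n = Math.Min(a.Length, b.Length);
        for (int i = 0; i < n; i++)
        {
            if (a[i] != b[i]) return a[i] < b[i] ? -1 : 1;
        }
        return a.Length.CompareTo(b.Length);
    }
}
```

I'm aware the framework also has a static SortKey.Compare(SortKey, SortKey) if you keep the SortKey objects instead of the raw bytes.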
This seemed fine (although please comment if you think these two methods shouldn't be mixed), but then I had reason to use the HashSet<> class, whose constructors only accept an IEqualityComparer<> for custom comparison.
The Microsoft documentation seems to suggest that I should use StringComparer (which implements both IEqualityComparer<> and IComparer<>), but this only seems to support the "IgnoreCase" option from CompareOptions and not "IgnoreKanaType", "IgnoreSymbols", "IgnoreWidth" etc.
I'm assuming that a StringComparer that doesn't honour these other options could produce different hash codes for two strings that my other comparison methods would consider equal. The HashSet<> would then treat them as distinct entries and I'd get incorrect results from my application.
My only thought at the moment is to create my own IEqualityComparer<> that generates a hash code from the SortKey.KeyData and compares equality by using the String.Compare overload.
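Something along these lines is what I have in mind. This is an untested sketch: the class name CultureOptionsComparer is my own invention, I use CompareInfo.Compare directly instead of String.Compare (as far as I can tell they do the same job), and the hash is just a simple multiply-and-add over the key bytes:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical comparer; not a framework class.
public sealed class CultureOptionsComparer : IEqualityComparer<string>
{
    private readonly CompareInfo _compareInfo;
    private readonly CompareOptions _options;

    public CultureOptionsComparer(CultureInfo culture, CompareOptions options)
    {
        _compareInfo = culture.CompareInfo;
        _options = options;
    }

    // Equality uses the same culture and options as the rest of the application.
    public bool Equals(string x, string y)
    {
        return _compareInfo.Compare(x, y, _options) == 0;
    }

    // Hash the sort key bytes, so strings with identical sort keys
    // under these options get the same hash code.
    public int GetHashCode(string s)
    {
        if (s == null) return 0;
        byte[] key = _compareInfo.GetSortKey(s, _options).KeyData;
        unchecked
        {
            int hash = 17;
            foreach (byte b in key) hash = hash * 31 + b;
            return hash;
        }
    }
}
```

I'd then pass it in with something like new HashSet<string>(new CultureOptionsComparer(culture, options)). My worries are the cost of building a sort key on every GetHashCode call, and that I believe GetSortKey rejects the Ordinal options, so those would need separate handling.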
Any suggestions?