Is there a way to get .NET to treat strings as matching even when some characters are not exactly the same? Examples of characters that should be considered equivalent are 'a'/'á' and 'í'/'i'. The Chrome browser's find-as-you-type treats these characters as equivalent.

A: 

Sure, it's possible if you write the algorithm yourself. The closest thing the out-of-the-box Regex.Match() overloads offer is in RegexOptions: the CultureInvariant option. But unless you are switching cultures, that's not going to be of any use.
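
To illustrate the point, here is a small sketch showing that RegexOptions.CultureInvariant on its own does not make 'á' match 'a'. The class and method names are made up for this example; only Regex.IsMatch and the RegexOptions values come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class CultureInvariantDemo
{
    // Returns true if the pattern matches anywhere in the input
    // under the given regex options.
    public static bool Matches(string input, string pattern, RegexOptions options)
    {
        return Regex.IsMatch(input, pattern, options);
    }
}

// Usage ("\u00E1" is the precomposed character á):
//   CultureInvariantDemo.Matches("\u00E1", "a", RegexOptions.CultureInvariant)
//       -> false
//   CultureInvariantDemo.Matches("\u00E1", "a",
//       RegexOptions.IgnoreCase | RegexOptions.CultureInvariant)
//       -> false
//   CultureInvariantDemo.Matches("A", "a",
//       RegexOptions.IgnoreCase | RegexOptions.CultureInvariant)
//       -> true
```

CultureInvariant only affects how case-insensitive comparison is done; it does nothing about accents.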

P.Brian.Mackey
A: 

Maybe you want to look into Soundex/Metaphone functions, to first normalise strings, and then perform your regex operations on the results of that?
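
As far as I know, .NET's base class library has no built-in Soundex, so this is a hand-written minimal sketch of classic American Soundex, not a library API. The class name SoundexSketch is invented for the example. It assumes English-style input and simply skips anything outside A-Z, so an accented vowel like 'á' drops out, but an accented consonant like 'ñ' would be lost as well.

```csharp
using System;
using System.Text;

// Hypothetical, hand-rolled Soundex for illustration only.
public static class SoundexSketch
{
    // Soundex digit for each letter A..Z; '0' marks letters that
    // carry no code (vowels, H, W and Y).
    private const string Codes = "01230120022455012623010202";

    public static string Encode(string word)
    {
        var result = new StringBuilder(4);
        char lastCode = '0';

        foreach (char raw in word.ToUpperInvariant())
        {
            if (raw < 'A' || raw > 'Z') continue; // skip everything outside A-Z

            char code = Codes[raw - 'A'];
            if (result.Length == 0)
            {
                // The first letter is kept as-is; its code is remembered so
                // that an immediately following duplicate is not emitted.
                result.Append(raw);
                lastCode = code;
            }
            else
            {
                if (code != '0' && code != lastCode) result.Append(code);
                // H and W do not separate letters with the same code; vowels do.
                if (raw != 'H' && raw != 'W') lastCode = code;
            }

            if (result.Length == 4) break;
        }

        // Pad short codes with zeros, e.g. "M42" becomes "M420".
        return result.Length == 0 ? string.Empty : result.ToString().PadRight(4, '0');
    }
}
```

You would then compare or regex-match the encoded strings rather than the originals. Bear in mind that Soundex is far fuzzier than accent folding: "Robert" and "Rupert" both encode to "R163".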

Peter Boughton
+2  A: 

Take a look at this blog post by Michael Kaplan. The code here uses standard .NET class library methods for

  1. Normalising Unicode strings, in this case, using a decomposed normalisation form (NormalizationForm.FormD), which ensures that a character like á is represented by separate code points for a and its diacritic(s);
  2. Identifying the diacritics using classes that expose databases of information about Unicode characters, and stripping them out.
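
The two steps above can be sketched roughly as follows. This is my own minimal version and not necessarily the exact code from the blog post; the class and method names are invented for the example, while string.Normalize, NormalizationForm, CharUnicodeInfo.GetUnicodeCategory and UnicodeCategory.NonSpacingMark are standard class library members.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper names; only the framework calls are real.
public static class DiacriticStripper
{
    public static string RemoveDiacritics(string text)
    {
        // Step 1: FormD (canonical decomposition) splits a character
        // like á into 'a' followed by a combining acute accent.
        string decomposed = text.Normalize(NormalizationForm.FormD);

        // Step 2: drop the combining marks, keep everything else.
        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }

        // Recompose whatever is left so the result is in a conventional form.
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    // Substring search that ignores both diacritics and case.
    public static bool ContainsIgnoringDiacritics(string haystack, string needle)
    {
        return RemoveDiacritics(haystack)
            .IndexOf(RemoveDiacritics(needle), StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

To use this with Regex, strip the diacritics from the input (and from any literal text in the pattern) first, then match as usual. Characters that have no canonical decomposition, such as 'ø' or 'ł', are not affected by this approach.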
shambulator