From what I understand, when MySQL compares strings stored with the utf8_general_ci collation, it first converts their characters to their ASCII equivalents. In other words, ḩ = h, ţ = t, ā = a, í = i, etc.

Is there a mapping table I could use to implement a similar comparison function in PHP or JavaScript? I know there are alternatives in PHP, such as iconv, but their transliteration is slightly different, e.g. í = 'i.

Thank you.

+1  A: 

The usual approach is to normalise your string to Unicode Normalization Form D (NFD), which splits each accented letter into its base letter followed by separate combining marks, and then remove all characters in the Unicode ‘combining diacritical’ class.

See normalizer_normalize to get normalisation in PHP. I'm not aware of a solution for JavaScript: there's nothing built in and you'd have to force the client to suck down some large Unicode character data tables.
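
For what it's worth, engines that implement ECMAScript 2015 or later do expose String.prototype.normalize, so the decompose-then-strip approach can be sketched in JavaScript without shipping any tables. This is only a rough sketch: the function names stripDiacritics and looselyEqual are made up for illustration, the range U+0300 to U+036F covers only the Combining Diacritical Marks block, and the result is not guaranteed to match MySQL's collation exactly (letters such as ß or ø have no decomposition and pass through unchanged).

```javascript
// Hypothetical helper: remove accents by decomposing, then dropping combining marks.
function stripDiacritics(text) {
  // NFD turns e.g. "í" (U+00ED) into "i" followed by U+0301 (combining acute accent).
  // The regex then deletes anything in the Combining Diacritical Marks block.
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Hypothetical helper: accent-insensitive, case-insensitive equality.
// This only approximates a *_ci collation; it is not MySQL's actual mapping.
function looselyEqual(a, b) {
  return stripDiacritics(a).toLowerCase() === stripDiacritics(b).toLowerCase();
}
```

With this, looselyEqual("ḩţāí", "HTAI") should come out true, since each of those four letters decomposes into a plain ASCII letter plus one combining mark.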

bobince