What's the difference between utf8_general_ci and utf8_unicode_ci in MySQL?

For a while now, I've used phpMyAdmin to manage my local MySQL databases. One thing I'm starting to learn is which character set is right for my databases. I've decided UTF-8 is the best for compatibility (as my XHTML templates are served as UTF-8), but one thing that confuses me is the variety of UTF-8 options I'm presented with in the phpMyAdmin interface.

The two I've isolated are:

utf8_general_ci
utf8_unicode_ci

So my question is this: what is the difference between the general and unicode variants of utf8 in MySQL? (I've come to learn that ci is shorthand for case-insensitive)

Any help in this matter would be most appreciated.

+1 A:

From the MySQL manual on Unicode Character Sets:

For any Unicode character set, operations performed using the _general_ci collation are faster than those for the _unicode_ci collation. For example, comparisons for the utf8_general_ci collation are faster, but slightly less correct, than comparisons for utf8_unicode_ci. The reason for this is that utf8_unicode_ci supports mappings such as expansions; that is, when one character compares as equal to combinations of other characters. For example, in German and some other languages “ß” is equal to “ss”. utf8_unicode_ci also supports contractions and ignorable characters. utf8_general_ci is a legacy collation that does not support expansions, contractions, or ignorable characters. It can make only one-to-one comparisons between characters.

See the referenced page for further information and examples.
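
To make the "one-to-one" versus "expansion" distinction in the quoted passage concrete, here is a minimal Python sketch. It is a toy model only: the two lookup tables and the function names are assumptions made up for illustration, not MySQL's actual weight tables or API. It shows why a one-to-one collation can treat "ß" as equal to a single "s" at best, while a collation with expansions can treat it as equal to "ss".

```python
# Toy model only: these tables are illustrative assumptions,
# not the real weight tables MySQL uses for either collation.

# One-to-one folding (in the spirit of utf8_general_ci): each character
# maps to exactly one comparison key, so "ß" can match one character only.
ONE_TO_ONE = {"ß": "s"}

# Expansion (in the spirit of utf8_unicode_ci): one character may map to
# a sequence of keys, so "ß" can compare equal to "ss".
EXPANSIONS = {"ß": "ss"}


def general_like_key(text: str) -> str:
    """Build a comparison key using case folding plus one-to-one mapping."""
    return "".join(ONE_TO_ONE.get(ch, ch) for ch in text.lower())


def unicode_like_key(text: str) -> str:
    """Build a comparison key using case folding plus expansions."""
    return "".join(EXPANSIONS.get(ch, ch) for ch in text.lower())


def general_like_equal(a: str, b: str) -> bool:
    return general_like_key(a) == general_like_key(b)


def unicode_like_equal(a: str, b: str) -> bool:
    return unicode_like_key(a) == unicode_like_key(b)


if __name__ == "__main__":
    print(general_like_equal("ß", "s"))           # True
    print(general_like_equal("Straße", "STRASSE"))  # False
    print(unicode_like_equal("Straße", "STRASSE"))  # True
```

The real collations also differ in contractions and ignorable characters, which this sketch does not attempt to model.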

Gumbo 2010-07-26 18:41:42

Thanks. I'll settle with `utf8_unicode_ci` in light of this.

Martin Bean 2010-07-26 18:45:52

The #@%!ing manual discusses this... :)

The trade-off is between the speed and the accuracy of certain operations, such as comparisons.

Assaf Lavie 2010-07-26 18:41:58
