I'm trying to switch my site over to UTF-8 completely, so I don't have to deal with utf8_encode() & utf8_decode() functions.

I have the collation of my tables set properly, and I'm temporarily using the query SET NAMES utf8 to override the my.cnf file.

My question is — there are a ton of character set and collation variables in my.cnf, and I suspect that some ought to be left alone... which ones should I change to achieve the effect of SET NAMES utf8?

(The collation of my tables is utf8_unicode_ci.)

character_set_client | latin1 |
character_set_connection | latin1 |
character_set_database | latin1 |
character_set_filesystem | binary |
character_set_results | latin1 |
character_set_server | latin1 |
character_set_system | utf8 |

collation_connection | latin1_swedish_ci |
collation_database | latin1_swedish_ci |
collation_server | latin1_swedish_ci |
A: 

Well, collation governs sorting and string comparison, so unless you're storing a language with specific sorting needs, utf8_unicode_ci should be fine.

The character_set_* values are used for all other string operations internally: value checks in places like WHERE clauses or IF/CASE statements, and string functions like CHAR_LENGTH(), REPLACE(), SUBSTRING(). SET NAMES utf8 itself only touches three of them: character_set_client, character_set_connection and character_set_results.

Generally speaking, they should all be the same (in this case, utf8) except for character_set_filesystem; I'd recommend keeping that at binary unless you have a specific need to move away from it.
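
As a rough sketch of the my.cnf side, assuming a typical layout with [mysqld] and [client] groups, something like the following should get the server and connection defaults onto utf8. The option names are standard MySQL ones, but the exact layout of your file is an assumption, and not every client library reads the [client] group, so you may still need SET NAMES utf8 from application code.

```ini
[mysqld]
# Server-wide defaults: sets character_set_server and collation_server.
character-set-server = utf8
collation-server     = utf8_unicode_ci
# character_set_filesystem is deliberately not set, so it stays at binary.
# character_set_database follows the default database's own character set,
# so it is not set here either.

[client]
# Default connection character set for clients that read this group
# (for example the mysql command-line client).
default-character-set = utf8
```

After restarting the server, running SHOW VARIABLES LIKE 'character_set%' and SHOW VARIABLES LIKE 'collation%' from a fresh connection should show whether the new defaults took effect.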

Peter Bailey
It was definitely the filesystem/binary setting that inspired this question; that, and I couldn't find any documentation about what specifically each of these did. I mean, my instinct was just to change them all to utf8/utf8_unicode_ci — but the binary setting sapped my confidence. I was aware that collation is different, but at the same time, having latin1_swedish_ci didn't make sense to me at all, so I switched it to UTF-8 as well — if only for consistency's sake. Anyway, thanks for the answer... I think I'll change all except `character_set_filesystem`.
JKS