I have a couple of tables that are set to the latin1 character set, but I suspect that some values actually encoded as utf8 have been erroneously inserted into them.

MySQL makes this a little more complicated because it silently converts everything based on your connection settings.

How can I test my hypothesis that there are some utf8-encoded bytes in a latin1 column in MySQL?

+1  A: 

If you find strings of 2 bytes which match the following bit pattern:

110xxxxx 10xxxxxx

chances are that these are utf-8 characters. It is possible that they are two consecutive non-ASCII latin-1 characters (such as 'Ä' followed by a symbol or something unprintable), but that is unlikely.
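
As a rough sketch of that check, assuming you have already pulled the raw column bytes out of MySQL unconverted (for example by selecting `HEX(col)` and decoding the hex on the client), something like the following Python could flag suspect values. The function names are made up for illustration. The lead-byte range is narrowed to 0xC2-0xDF, which is slightly stricter than the 110xxxxx pattern above, because 0xC0 and 0xC1 never occur in valid utf-8.

```python
import re
from typing import List

# Lead byte 110xxxxx (restricted to 0xC2-0xDF, since 0xC0/0xC1 are never
# valid in utf-8) followed by one continuation byte 10xxxxxx (0x80-0xBF).
TWO_BYTE_UTF8 = re.compile(rb"[\xC2-\xDF][\x80-\xBF]")


def find_suspect_sequences(raw: bytes) -> List[bytes]:
    """Return every 2-byte run in raw that matches the utf-8 bit pattern."""
    return TWO_BYTE_UTF8.findall(raw)


def looks_like_utf8_in_latin1(raw: bytes) -> bool:
    """True if raw contains at least one 2-byte utf-8-shaped sequence."""
    return TWO_BYTE_UTF8.search(raw) is not None


if __name__ == "__main__":
    # 'Ä' stored as utf-8 is two bytes (0xC3 0x84) and is flagged.
    print(looks_like_utf8_in_latin1("Ä".encode("utf-8")))
    # 'Ä' stored as real latin-1 is a single byte (0xC4) and is not flagged.
    print(looks_like_utf8_in_latin1("Äpfel".encode("latin-1")))
```

This is only a heuristic: a genuine latin-1 value such as 'Ä©' (bytes 0xC4 0xA9) would also match, which is the unlikely false positive mentioned above. It also only looks for 2-byte sequences, so 3- and 4-byte utf-8 characters are not covered.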

KenE