What is the difference between utf8 and latin1?
+2
A:
In latin1 each character is exactly one byte long. In utf8 a character can consist of more than one byte. Consequently utf8 can represent more characters than latin1 (and the characters they do have in common aren't necessarily represented by the same byte or byte sequence).
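To make the byte-level difference concrete, here is a minimal Python sketch. It is only an illustration: the sample characters and variable names are my own choice, and Python's "latin-1" codec is assumed to stand in for latin1 (ISO-8859-1).

```python
# "é" (U+00E9) exists in both encodings but is stored differently.
text = "é"
latin1_bytes = text.encode("latin-1")  # one byte
utf8_bytes = text.encode("utf-8")      # two bytes

print(latin1_bytes)  # b'\xe9'
print(utf8_bytes)    # b'\xc3\xa9'

# Plain ASCII characters are encoded identically in both.
ascii_same = "A".encode("latin-1") == "A".encode("utf-8")

# "€" (U+20AC) has no latin1 representation at all.
try:
    "€".encode("latin-1")
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False

print(ascii_same, euro_fits_latin1)  # True False
```

So the two encodings only agree byte-for-byte on the ASCII range; everything above that either differs or is missing from latin1.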
sepp2k
2010-04-25 16:42:23
+1
A:
UTF-8 is prepared for world domination, Latin1 isn't.
If you're trying to store non-Latin characters like Chinese, Japanese, Hebrew, Cyrillic, etc. using the Latin1 encoding, they will end up as mojibake. You may find the introductory text of this article useful (even more so if you know a bit of Java).
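Note that MySQL's utf8 character set doesn't support UTF-8 fully. It only goes up to 3 bytes per character, not 4, so it can't store characters outside the Basic Multilingual Plane. If you want full UTF-8 support, use MySQL's utf8mb4 character set (available since MySQL 5.5.3) or go for another RDBMS like PostgreSQL.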
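As a rough illustration of both points (mojibake, and characters that need 4 bytes), here is a small Python sketch. It does not talk to any database; the sample strings and names are assumptions picked for the demo, and Python's "latin-1" codec stands in for Latin1.

```python
original = "日本語"

# Non-Latin text cannot be encoded as Latin1 at all.
try:
    original.encode("latin-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False

# Mojibake: UTF-8 bytes misread as Latin1 turn into unrelated characters.
utf8_bytes = original.encode("utf-8")    # 3 bytes per character here
mojibake = utf8_bytes.decode("latin-1")  # 9 garbage characters
print(repr(mojibake))

# The damage is reversible only if the bytes were never altered.
recovered = mojibake.encode("latin-1").decode("utf-8")

# UTF-8 length varies from 1 to 4 bytes depending on the code point.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in ["A", "é", "€", "😀"]}
print(byte_lengths)  # {'A': 1, 'é': 2, '€': 3, '😀': 4}
```

The last entry is the kind of character a 3-byte-only UTF-8 implementation cannot store.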
BalusC
2010-04-25 16:54:47
It is 4 bytes per code point, not per character.
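A quick Python sketch of the distinction, assuming "character" means what the user sees on screen. The example string is my own: "é" written as a base letter plus a combining accent.

```python
import unicodedata

composed = "\u00e9"     # "é" as a single code point
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT

# Both render as one visible character, but the code point counts differ.
composed_code_points = len(composed)      # 1
decomposed_code_points = len(decomposed)  # 2

# Each code point is encoded separately (1 to 4 bytes each in UTF-8).
composed_bytes = len(composed.encode("utf-8"))      # 2
decomposed_bytes = len(decomposed.encode("utf-8"))  # 1 + 2 = 3

# NFC normalization folds the pair back into the single code point.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed
print(composed_code_points, decomposed_code_points, same_after_nfc)  # 1 2 True
```

So the 4-byte limit applies to each code point, and one visible character may be made of several of them.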
Evan Carroll
2010-06-15 17:28:06