Regarding MySQL, is there a character set that supports all, or the vast majority of, languages?
UTF-8 is an *encoding*, not a character set. Unicode is the character set.
Greg Hewgill
2009-10-07 18:49:54
Sure enough, but then again, “character set” is often used mistakenly instead of “encoding”. See what it's called in HTTP!
Arthur Reutenauer
2009-10-07 19:13:32
Yes, you're right. I didn't know there was a difference, because the terms are usually used synonymously, though it seems that's only because most people don't know the difference. ;-) Sorry.
arno
2009-10-08 06:19:47
+8
A:
Unicode. It has several encodings: UTF-8, UTF-16 and UTF-32.
From http://en.wikipedia.org/wiki/UTF-8
UTF-8 (8-bit UCS/Unicode Transformation Format) is a variable-length character encoding for Unicode. It is able to represent any character in the Unicode standard, yet is backwards compatible with ASCII.
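To make the "variable-length" and "backwards compatible with ASCII" points concrete, here is a minimal Python sketch (not MySQL-specific). The sample characters and the helper name `encoded_lengths` are my own choices for illustration; the little-endian codec variants are used only so that no byte-order mark is counted.

```python
# Compare how many bytes one character occupies in each Unicode encoding.
# `encoded_lengths` and the sample characters are illustrative choices.

def encoded_lengths(ch):
    """Return the byte length of a single character in UTF-8/16/32."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        # The "-le" variants skip the byte-order mark, so only the
        # character itself is measured.
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }

samples = ["A", "é", "€", "😀"]  # U+0041, U+00E9, U+20AC, U+1F600

for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_lengths(ch))

# Pure ASCII text is byte-for-byte identical when encoded as UTF-8.
assert "MySQL".encode("utf-8") == "MySQL".encode("ascii")
```

UTF-8 uses one to four bytes depending on the character, while UTF-32 always uses four. One MySQL caveat worth knowing: the character set MySQL calls `utf8` stores at most three bytes per character, so four-byte characters such as the emoji above need `utf8mb4` (available in later MySQL versions).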
S.Lott
2009-10-07 18:44:06
Yes. It's optimized for ASCII compatibility, which makes it compact for Western European languages, but it can represent any valid Unicode character.
DaveE
2009-10-07 19:11:55
+1
A:
As others have said, UTF-8. Go read Joel Spolsky's blog post about Unicode and you'll understand why.
Esko
2009-10-07 18:48:51