Which database should I use if my application is going to support multiple languages (including Chinese, Japanese, etc.)? In other words, is MySQL better or worse than Postgres at handling Unicode? (These are the only two databases my hosting company offers.)

Also, which language is better for handling Unicode: PHP or Ruby/Rails?

+1  A: 

Databases:

In my experience, both MySQL and Postgres have handled Unicode well enough.

Languages:

  • PHP <= 5.3 is somewhat restricted with regard to Unicode, while the upcoming PHP 6 promises to fix that.
  • Ruby 1.8 had poor Unicode support, but 1.9 is said to handle Unicode nicely (I have no first-hand experience with it).

Restrictions of PHP 5.3:

It is possible to use PHP 5 with Unicode, but then one has to be careful with strings. Many string functions, for instance the one that returns the length of a string, count bytes rather than characters. There are additional functions available, though, which return the number of characters.
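
To make the byte-versus-character difference concrete, here is a minimal sketch. It is written in Python rather than PHP, purely as an illustration; the variable names are made up for this example. In PHP the same contrast shows up between `strlen` (bytes) and `mb_strlen` (characters).

```python
# Byte length versus character length for a UTF-8 encoded string.
# Illustration only: the variable names are hypothetical.
text = "日本語"  # three characters

# Encode to UTF-8 to see what a byte-oriented length function would count.
encoded = text.encode("utf-8")

char_count = len(text)     # number of characters (code points)
byte_count = len(encoded)  # number of bytes in the UTF-8 encoding

print(char_count, byte_count)  # prints "3 9"
```

Each of these three characters takes three bytes in UTF-8, so a byte-counting function reports 9 where a character-counting one reports 3.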

If this is a new project you should be able to use the right functions from the start.

For our existing PHP project, I decided to leave it as it was (Windows-1252), because it is an internal project and Unicode would have added limited value (sadly, as I really love Unicode).

thomastiger
If your database driver doesn't support Unicode, you can read and write raw byte arrays and convert them to strings using your desired encoding.
Seun Osewa
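
A rough sketch of the raw-bytes approach from the comment above, in Python for illustration. The helper names `decode_column` and `encode_column` are hypothetical, and the byte string stands in for whatever a driver would actually return; no real database API is used here.

```python
# Converting between raw bytes and text with an explicit encoding.
# decode_column / encode_column are hypothetical helper names, and
# raw_from_db is a stand-in for bytes read from a binary column.

def decode_column(raw: bytes, encoding: str = "utf-8") -> str:
    """Turn raw bytes read from storage into a text string."""
    return raw.decode(encoding)

def encode_column(text: str, encoding: str = "utf-8") -> bytes:
    """Turn a text string into raw bytes before writing it to storage."""
    return text.encode(encoding)

# UTF-8 bytes for the two characters "中文".
raw_from_db = b"\xe4\xb8\xad\xe6\x96\x87"

decoded = decode_column(raw_from_db)
round_trip = encode_column(decoded)

print(decoded, round_trip == raw_from_db)
```

The important part is that the encoding is chosen explicitly on both the read and the write path, so it never depends on the driver's defaults.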
+5  A: 

MySQL does not have full support for Unicode (e.g. Chinese characters outside the BMP). It uses a crippled 'utf8' encoding (not really UTF-8).

PostgreSQL seems to have full support for the UTF-8 encoding (and hence for all Unicode characters).

leonbloy
Totally true. MySQL stores 1-3 bytes and PostgreSQL 1-4 bytes per UTF-8 char. Go for PostgreSQL.
BalusC
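
To show what the 1-3 versus 1-4 byte remark means in practice, here is a small Python sketch that checks whether a string would fit in an encoding limited to three bytes per character. The function names are invented for this example, and it only measures UTF-8 lengths; it does not talk to either database.

```python
# Which characters need four bytes in UTF-8?
# utf8_length and fits_in_three_bytes are hypothetical helper names.

def utf8_length(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))

def fits_in_three_bytes(text: str) -> bool:
    """True if every character encodes to at most three UTF-8 bytes."""
    return all(utf8_length(ch) <= 3 for ch in text)

bmp_char = "中"               # U+4E2D, inside the BMP
supplementary = "\U00020000"  # U+20000, CJK Extension B, outside the BMP

print(utf8_length(bmp_char), utf8_length(supplementary))  # prints "3 4"
print(fits_in_three_bytes(bmp_char + supplementary))      # prints "False"
```

Characters inside the BMP fit in three bytes, while those outside it, such as the rarer CJK ideographs, need four, which is where a three-byte limit would fall short.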
A: 

In my case, PostgreSQL was better than MySQL; MySQL does not completely support Unicode.

Nasser Hadjloo
That was already answered. Just upvote the answer you agree with instead of reposting it.
BalusC