views:

492

answers:

1

I am converting a website from ISO to UTF-8, so I need to convert the MySQL database too.

On the Internet, I read various solutions, I don't know wich one to choose.

Do I really need to convert my varchar columns to binary, then to UTF-8 like that:

ALTER TABLE t MODIFY col BINARY(150);
ALTER TABLE t MODIFY col CHAR(150) CHARACTER SET utf8;

It takes a long time to do that for each column, of each table, of each database.

I have 10 databases, with 20 tables each, wich around 2 - 3 varchar colums (2 queries each column), this gives me around 1000 queries to write! How come? How to do?

Resolved : I post the code that I have used:

PASSWORD=""
db=$1

mysqldump --password=$PASSWORD --set-charset --skip-set-charset --add-drop-table --databases "$db" > /home/dev/backup/bdd.sql

QUERY="ALTER DATABASE \`$db\` DEFAULT CHARACTER SET utf8;"
mysql --password=$PASSWORD --database "$db" -e "$QUERY"

mysql --password=$PASSWORD --default-character-set=utf8 < /home/dev/backup/bdd.sql

See answer below for more information.

+3  A: 

You can do that very easily using a dump. Make a dump using

mysqldump --skip-opt --set-charset --skip-set-charset 

Then create another database, set its default character set to UTF-8 and then load your dump back with:

mysql --default-character-set=<your iso encoding>

The main idea is to make a dump without any sign of data encoding.
So, at create time, the table's encoding would be inherit from the database encoding and set to UTF-8. And with --default-character-set we tell MySQL to recode our data automatically.

Col. Shrapnel
Oh this seems nice ! Why isn't that on the documentation, or nobody talks about that anywhere, thanks I will test it
Matthieu
@Matthieu as it seems you're going to make it with `--all-databases`, please make it with precautions. Make a backup and test whole algorithm on one database first.
Col. Shrapnel
Well I just tested (on one database only, to be sure) and this doesn't change the encoding of the columns (they are still in "latin"). Just to be sure, for the mysqldump, the 3 parameters don't need values ?
Matthieu
@Matthieu this is actually 2 parameters, as far as I understand. `--skip-opt --set-charset` tell database not to add default charset clause to the table definitions. And `--skip-set-charset` remove `SET NAMES` query from the top of the dump. How do you know of data encoding? utf-8 is very similar to latin1 What does it say of table encoding if you run `SHOW CREATE TABLE tablename` for the new table? is there any character sets in the dump? Do you name `utf8` properly (as it differ from usual `utf-8`)
Col. Shrapnel
Thanks I've got it working. What I had to do is dump, empty the database (drop tables, BUT KEEP DATABASE), set its encoding to UTF8, then re-import the content. Then the content is fully UTF8, else it doesn't work fully. I've posted the code above, in my question.
Matthieu
WARNING : I found out (the hard way) that the "--skip-opt" parameter makes the create table queries not include the auto_increment statement ! (http://groups.google.com/group/comp.databases.mysql/browse_thread/thread/01ccba8f07dc65fc?pli=1) (I don't blame you though, I should have tested more)
Matthieu