I would like to stop new entries in my database from being stored as Latin-1 and allow only UTF-8. I plan to alter the table and make the following changes:

Charset: latin1 -> utf8

Collation: latin1_swedish_ci -> utf8_general_ci

The table in question has 1 million rows. Is this a good idea, and what are the risks? What happens to new data that I try to insert that is not UTF-8? What happens to previously entered data that is not UTF-8?

+3  A: 

You should create a duplicate of the table, start a transaction, insert all the rows from the old table, and then commit. That is the safest way.

To summarize:

CREATE TABLE duplicate ... (with the charset you like, etc)

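-- The transaction only protects the copy on a transactional engine such as InnoDB.
SET AUTOCOMMIT=0;
START TRANSACTION;
INSERT INTO duplicate (...field-list...)
   SELECT ...field-list... FROM original_table;
COMMIT;

-- Swap the tables, keeping the old one as a backup.
ALTER TABLE original_table RENAME TO original_backup;
ALTER TABLE duplicate RENAME TO original_table;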
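
Before running the two renames, it is probably worth confirming that the copy is complete. The following is only a sketch, assuming the table names used in the summary (original_table, duplicate, original_backup). It also shows MySQL's RENAME TABLE statement as an optional alternative to the two ALTER statements, since it performs both renames in one atomic step, so there is no moment at which the original name is missing.

```sql
-- Run before the swap: the two counts should match.
SELECT COUNT(*) FROM original_table;
SELECT COUNT(*) FROM duplicate;

-- Optional alternative to the two ALTER TABLE ... RENAME statements:
-- a single RENAME TABLE statement performs both renames atomically.
RENAME TABLE original_table TO original_backup,
             duplicate      TO original_table;
```

If the counts differ, leave the original table in place and investigate before swapping anything.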

Be careful with unique indexes and auto-increment fields. Create the duplicate table without its secondary indexes so that the inserts run quickly, then add the indexes once the data has been copied.
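
As a rough illustration of the point about indexes, here is a minimal sketch. The column names (id, email, name) and index names are hypothetical and only stand in for your real schema. Note that MySQL requires an AUTO_INCREMENT column to be part of a key, so the primary key has to stay in the CREATE TABLE; only the other indexes can be deferred.

```sql
-- Hypothetical columns, for illustration only.
CREATE TABLE duplicate (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
    email VARCHAR(255) NOT NULL,
    name  VARCHAR(255) NOT NULL,
    PRIMARY KEY (id)   -- an AUTO_INCREMENT column must be defined as a key
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;

-- ... copy the rows with INSERT ... SELECT inside the transaction ...

-- Add the remaining indexes after the load, in a single ALTER.
ALTER TABLE duplicate
    ADD UNIQUE INDEX uq_email (email),
    ADD INDEX idx_name (name);
```

One thing to watch: the two collations do not compare characters identically, so values that were distinct under latin1_swedish_ci may compare equal under utf8_general_ci. If that happens, adding the unique index fails with a duplicate-key error, which is another reason to keep the original table around as a backup until everything checks out.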

santiagobasulto
Appreciate the answer santiagobasulto. I'll do exactly as you suggested. :)
brant