I have a database that's currently running on a 5.0.27 server. I want to move to a new 5.1.41 server.

I mysqldump'd all the files. When restoring, I get an error

ERROR 1062 (23000) at line 21: Duplicate entry 'weiÃ' for key 'title'

I've narrowed the failure down to this script, which I can run and it fails:

--
-- Table structure for table `word`
--

set names utf8;

DROP TABLE IF EXISTS `word`;
CREATE TABLE `word`
(
  `wordid` int (10) unsigned NOT NULL auto_increment,
  `title` char (50) NOT NULL default '',
  PRIMARY KEY  (`wordid`),
  UNIQUE KEY `title` (`title`)
) ENGINE=MyISAM AUTO_INCREMENT=280707 DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;

--
-- Dumping data for table `word`
--

LOCK TABLES `word` WRITE;
INSERT INTO `word` VALUES
(198036,'weis'),
(241473, unhex('776569C39F'));
UNLOCK TABLES;

EDIT - changed to UNHEX.

I've checked and rechecked all the charset and collation variables between the two servers, and they look identical. Even if they weren't, I'm specifying the collation myself.

Any clues as to what I'm doing wrong here?

EDIT: here's the command I'm using to dump the database:

mysqldump --add-drop-table --add-locks --disable-keys --lock-tables --quick  -uusername -ppassword database > filename

and to load

mysql -D$MYSQL_DB -u$MYSQL_USER -p$MYSQL_PASSWD < filename

How can I check the collations for the client connections?

A: 

Are you using the mysqldump and mysql clients from 5.1 or from 5.0?
You could try different combinations.

Could the sourceDB contain duplicate values in the "unique" column?
Remove the "UNIQUE KEY" constraint and check which records are duplicate in the targetDB.
This could give some insight into the problem.
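
A minimal sketch of that check, assuming the `word` table from the question. The `ALTER TABLE` would have to run after the `CREATE TABLE` and before the `INSERT` in the dump (or the `UNIQUE KEY` line could be deleted from the dump by hand). Grouping on `title` uses the column's collation, so values the server treats as equal end up in the same group:

```sql
-- Drop the unique index so that both rows can be loaded.
ALTER TABLE `word` DROP INDEX `title`;

-- After the data is loaded: list titles the column collation considers equal.
SELECT `title`,
       COUNT(*)               AS copies,
       GROUP_CONCAT(`wordid`) AS wordids
FROM `word`
GROUP BY `title`
HAVING COUNT(*) > 1;
```

The `wordids` column then points at the rows that collide, which can be inspected individually with `SELECT HEX(title)`.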

Bob Fanger
+1  A: 

The specific problem is that in utf8_general_ci, 'weis' and 'weiß' are equivalent. If you want 'weiß' to be equal to 'weiss', then you should use utf8_unicode_ci. That will fix the problem on the import side (unless you have 'weiss' in the database, but then you really do have a duplicate).

At a guess, the original table has utf8_unicode_ci set, and you didn't notice the difference. If that's not true, I have no idea how your table got into the state it's in -- but switching to the right collation should solve your problem.
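
To see which collation the original column really has, and to switch the new table over, something along these lines could work. It is only a sketch against the `word` table from the question; editing `COLLATE=utf8_general_ci` to `COLLATE=utf8_unicode_ci` in the dump file is an alternative to the `ALTER TABLE`:

```sql
-- On the old server: the Collation column shows what `title` actually uses.
SHOW FULL COLUMNS FROM `word`;

-- On the new server: convert the still-empty table before loading the rows.
ALTER TABLE `word` CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
```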

Dave Orr
+1  A: 

From a buddy on LiveJournal, I found out it's a bug "fix" between 5.0 and 5.1: they changed how the utf8_general_ci collation compares 'ß'. If you read the bug report, they actually broke it ('weis' and 'weiß' should not be equivalent). But they're not going to unbreak it. So I will either have to change the collation (as Dave Orr suggests), or manually edit my data.

http://bugs.mysql.com/bug.php?id=27877
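
A quick way to see the difference, and to check the connection collations asked about in the question, is to run the same statements on both servers. This is a sketch; if the bug report describes what is happening here, the comparison should come back 0 on the 5.0.27 server and 1 on the 5.1.41 server:

```sql
-- Collations in effect for this connection, plus the server and database defaults.
SHOW VARIABLES LIKE 'collation%';

-- Compare 'weis' with 'weiß' (given as UTF-8 bytes) under utf8_general_ci.
SELECT _utf8'weis' = CONVERT(UNHEX('776569C39F') USING utf8) COLLATE utf8_general_ci
       AS general_ci_equal;
```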

Eric H.