tags:

views:

689

answers:

4

I have a table in MySQL that has 3 fields and I want to enforce uniqueness among two of the fields. Here is the table DDL:

CREATE TABLE `CLIENT_NAMES` (
`ID` int(11) NOT NULL auto_increment,
`CLIENT_NAME` varchar(500) NOT NULL,
`OWNER_ID` int(11) NOT NULL,
PRIMARY KEY  (`ID`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

The ID field is a surrogate key (this table is being loaded with ETL). The CLIENT_NAME is a field that contains names of clients The OWNER_ID is an id indicates a clients owner.

I thought I could enforce this with a unique index on CLIENT_NAME and OWNER_ID,

ALTER TABLE `DW`.`CLIENT_NAMES` 
ADD UNIQUE INDEX enforce_unique_idx(`CLIENT_NAME`, `OWNER_ID`);

but MySQL gives me an error: Error executing SQL commands to update table. Specified key was too long; max key length is 765 bytes (error 1071)

Anyone else have any ideas?

A: 

Have you looked at CONSTRAINT ... UNIQUE?

J D OConal
+9  A: 

MySQL cannot enforce uniqueness on keys that are longer than 765 bytes (and apparently 500 UTF8 characters can surpass this limit).

  1. Does CLIENT_NAME really need to be 500 characters long? Seems a bit excessive.
  2. Add a new (shorter) column that is hash(CLIENT_NAME). Get MySQL to enforce uniqueness on that hash instead.
Gili
Unicode is a variable length encoding. Characters can be one to four bytes in size and only the original 128 ASCII characters are encoded in one byte.
Joe Mahoney
Joe, I assume you meant UTF8, not Unicode...?
Gili
A: 

Here. For the UTF8 charset, MySQL may use up to 3 bytes per character. CLIENT_NAME is 3 x 500 = 1500 bytes. Shorten CLIENT_NAME to 250.

later: +1 to creating a hash of the name and using that as the key.

Terry Lorber
Saying that UTF-8 uses 3 bytes/char is wrong. As the page yourself pointed out said, it can take UP TO that value, and what makes it impossible to use on MySQL is the pessimistic approach.
André Neves
Thanks for the clarification.
Terry Lorber
A: 

Something seems a bit odd about this table; I would actually think about refactoring it. What do ID and OWNER_ID refer to, and what is the relationship between them?

Would it make sense to have

CREATE TABLE `CLIENTS` (
`ID` int(11) NOT NULL auto_increment,
`CLIENT_NAME` varchar(500) NOT NULL,
# other client fields - address, phone, whatever
PRIMARY KEY  (`ID`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `CLIENTS_OWNERS` (
`CLIENT_ID` int(11) NOT NULL,
`OWNER_ID` int(11) NOT NULL,
PRIMARY KEY  (`CLIENT_ID`,`OWNER_ID`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

I would really avoid adding a unique key like that on a 500 character string. It's much more efficient to enforce uniqueness on two ints, plus an id in a table should really refer to something that needs an id; in your version, the ID field seems to identify just the client/owner relationship, which really doesn't need a separate id, since it's just a mapping.

Aeon