views:

322

answers:

5

I have a simple table set up with two columns, each column is a key value. the values stored in each field are varchar(45) representing an email address and a keyword. It is possible that the information collected may duplicate itself as it is related to site browsing data collection. To avoid duplicate entries, I used tried to use INSERT IGNORE into, REPLACE into, and finally I'm trying the following:

insert into <table name> (user_email, key_token) values ('<email>@<this>.com', 'discountsupplies') on duplicate key update user_email='<email>@<this>.com',key_token='discountsupplies';

but I am still seeing duplicate records being inserted into the table. The SQL that generated the table:

DROP TABLE IF EXISTS `<database name>`.`<table name>` ;

CREATE  TABLE IF NOT EXISTS `<database name>`.`<table name>` (
  `user_email` VARCHAR(45) NOT NULL ,
  `key_token` VARCHAR(45) NOT NULL,
  PRIMARY KEY (`user_email`, `key_token`) )
ENGINE = InnoDB;

While I saw several questions that were close to this one, I did not see any that addressed why this might be happening, and I'd like to figure out what I'm not understanding about this behavior. Any help is appreciated.

+2  A: 

You have a composite primary key on both columns.

This means that it's the combination of the fields is UNIQUE, not each field as is.

Thes data are possible in the table:

[email protected]  1
[email protected]  1
[email protected]  2

, since no combination of (user_email, key_token) repeats in the table, while user_email and key_token as themselves can repeat.

If you want each separate column to be UNIQUE, define the UNIQUE constraints on the fields:

CREATE  TABLE IF NOT EXISTS `<database name>`.`<table name>` (
  `user_email` VARCHAR(45) NOT NULL ,
  `key_token` VARCHAR(45) NOT NULL,
  PRIMARY KEY (`user_email`, `key_token`),
  UNIQUE KEY (user_email),
  UNIQUE KEY (key_token)
)
ENGINE = InnoDB;

Update

Having duplicates in a column marked as UNIQUE would be a level 1 bug in MySQL.

Could you please run the following queries:

SELECT  user_email
FROM    mytable
GROUP BY
        user_email
HAVING  COUNT(*) > 1

SELECT  key_token
FROM    mytable
GROUP BY
        key_token
HAVING  COUNT(*) > 1

and see if they return something?

Quassnoi
Thank you for the assistance. I just tried adding the UNIQUE KEY () statements and resetting the test database to try this. I am afraid I am still getting duplicates. Your example above is right on as far as what I am trying to achieve in the table with the data I have.
jonny
Tried both queries - I did get values back for both queries, but I did not get all the email addresses in the table nor did I get all the keywords - just some of each.
jonny
You didn't create two separate `UNIQUE KEYs`, you creates a single composite one. Issue this: `CREATE UNIQUE INDEX ix_mytable_useremail ON mytable (user_email)`
Quassnoi
A: 

PRIMARY KEY (user_email,key_token) means a combination of both will be unique but if you also want individual email and key_tokens to be unique you have to use UNIQUE seperately for each column..

PRIMARY KEY ('user_email', 'key_token'),
  UNIQUE KEY (user_email),
  UNIQUE KEY (key_token)
halocursed
A: 

As an addendum, After adding the UNIQUE KEY statements, I went back and tried both REPLACE and INSERT IGNORE to achieve my goal, and none of these options is excluding duplicate entries.

Also adding: UNIQUE INDEX (user_email, key_token) doesn't seem to help either.

I'm going to do this check via a manual look-up routine until I can figure this out. If I find an answer I'll be happy to update the post.

jonny
Could you post some data you condider duplicate?
Quassnoi
This is the first appearing snippet 'cut and pasted' from the table (email has been modified from original address for privacy) that demonstrates the duplication.The third record gets added even though the first record is clearly in the table. [email protected] [email protected] [email protected] tempurpedic-beds
jonny
You need to create *two* *separate* `UNIQUE KEY`s: `CREATE UNIQUE INDEX ix_mytable_useremail ON mytable (user_email)`, `CREATE UNIQUE INDEX ix_mytable_keytoken ON mytable (key_token)`
Quassnoi
A: 

Added Unique Index lines below the original create table statement -

-- -----------------------------------------------------
-- Table `<db name>`.`<table name>`
-- -----------------------------------------------------
DROP TABLE IF EXISTS `<db name>`.`<table name>` ;

CREATE  TABLE IF NOT EXISTS `<db name>`.`<table name>` (
  `user_email` VARCHAR(45) NOT NULL ,
  `key_token` VARCHAR(45) NOT NULL,
  PRIMARY KEY (`user_email`, `key_token`),
  UNIQUE KEY (user_email),
  UNIQUE KEY (key_token)
  )

ENGINE = InnoDB;

CREATE UNIQUE INDEX ix_<table name>_useremail on `<db name>`.`<table name>`(user_email);
CREATE UNIQUE INDEX ix_<table name>_keytoken on `<db name>`.`<table name>`(key_token);

it seems to be ok (no errors when creating tables during the source step), but I'm still getting duplicates when running the on duplicate query.

jonny
A: 

final solution for now: query table to get list of key_tokens by user_email, test current key_token against list entries, if found don't insert. Not optimal or pretty, but it works....

jonny
Quassnoi provided lots of help - had I gotten the situation with my table to work as I'd envisioned I'd have chosen one of his answers as final solution. This is what I ended up using so i'm choosing this as the non-optimal, but working solution.
jonny