While executing an INSERT statement with many rows, I want to skip duplicate entries that would otherwise cause failure. After some research, my options appear to be the use of either:

  • "ON DUPLICATE KEY UPDATE" which implies an unnecessary update at some cost, or
  • "INSERT IGNORE" which implies an invitation for other kinds of failure to slip in unannounced.

Am I right in these assumptions? What's the best way to simply skip the rows that might cause duplicates and just continue on to the other rows?

Thanks!

A: 

REPLACE INTO seems like an option; it will either insert the row, or delete the existing row and then insert. Or you can check with IF NOT EXISTS (query) THEN INSERT. I tend to go for an IF NOT EXISTS check first.

Replace
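
In plain MySQL (outside a stored procedure), the check-first idea could be sketched roughly like this. The table `users` and its columns are made up for illustration. Note that a check followed by an insert does not by itself protect against another transaction inserting the same key in between.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE users (
  id    INT PRIMARY KEY,
  email VARCHAR(255) NOT NULL
);

-- Insert the row only if no row with this id exists yet.
-- If id 1 is already present, the SELECT returns no rows and nothing is inserted.
INSERT INTO users (id, email)
SELECT 1, 'ann@example.com'
FROM DUAL
WHERE NOT EXISTS (SELECT 1 FROM users WHERE id = 1);
```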

NTulip
Thanks for the quick reply. I'm assuming all over the place, but I assume this would be similar to ON DUPLICATE KEY UPDATE in that it would perform unnecessary updating. It appears wasteful, but I'm not sure. Any of these should work. I'm wondering if anyone knows which is best.
ThomasGHenry
NTulip - that solution is still open to race conditions from inserts by concurrent transactions.
Chris KL
+27  A: 

I would recommend using INSERT...ON DUPLICATE KEY UPDATE.
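
A minimal sketch of how this can be used to skip duplicates, assuming a hypothetical table `users` keyed on `id` (the names are illustrative, not from the question). Assigning a column to itself leaves the existing row as it is, so the duplicate is effectively skipped while unrelated errors still fail the statement.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE users (
  id    INT PRIMARY KEY,
  email VARCHAR(255) NOT NULL
);

INSERT INTO users (id, email) VALUES (1, 'ann@example.com');

-- Multi-row insert: id 1 already exists, ids 2 and 3 are new.
-- The self-assignment keeps the existing row for id 1 unchanged;
-- the rows for ids 2 and 3 are inserted normally.
INSERT INTO users (id, email)
VALUES (1, 'other@example.com'),
       (2, 'bob@example.com'),
       (3, 'cy@example.com')
ON DUPLICATE KEY UPDATE id = id;
```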

If you use INSERT IGNORE, then the row won't actually be inserted if it results in a duplicate key. But the statement won't generate an error; it generates a warning instead. The catch is that IGNORE downgrades errors to warnings in other cases too. These cases include:

  • Inserting a duplicate key in columns with PRIMARY KEY or UNIQUE constraints.
  • Inserting a NULL into a column with a NOT NULL constraint.
  • Inserting a row to a partitioned table, but the values you insert don't map to a partition.
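
As a rough illustration of the duplicate-key case, assuming a hypothetical table `tags` with a unique `name` column. Whether a warning is reported for the skipped row may depend on the server version, so treat the SHOW WARNINGS step as something to try rather than a guaranteed result.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE tags (
  name VARCHAR(50) NOT NULL,
  UNIQUE KEY uq_name (name)
);

INSERT INTO tags (name) VALUES ('mysql');

-- 'mysql' collides with the existing row and is skipped;
-- 'sql' is new and is inserted. The statement itself does not fail.
INSERT IGNORE INTO tags (name) VALUES ('mysql'), ('sql');

-- Inspect what was downgraded from an error to a warning.
SHOW WARNINGS;
```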

If you use REPLACE, MySQL actually does a DELETE followed by an INSERT internally, which has some unexpected side effects:

  • A new auto-increment ID is allocated.
  • Dependent rows with foreign keys may be deleted (if you use cascading foreign keys) or else prevent the REPLACE.
  • Triggers that fire on DELETE are executed unnecessarily.
  • Side effects are propagated to replication slaves too.
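
The auto-increment side effect can be seen with a small sketch like the following. The table `accounts` and its columns are assumptions made up for the example.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE accounts (
  id    INT AUTO_INCREMENT PRIMARY KEY,
  email VARCHAR(255) NOT NULL,
  name  VARCHAR(100) NOT NULL,
  UNIQUE KEY uq_email (email)
);

INSERT INTO accounts (email, name) VALUES ('ann@example.com', 'Ann');
SELECT id, email, name FROM accounts;  -- note the id assigned here

-- Same unique email: the old row is deleted and a new row is inserted,
-- so the row should come back with a newly allocated id.
REPLACE INTO accounts (email, name) VALUES ('ann@example.com', 'Anna');
SELECT id, email, name FROM accounts;  -- compare the id with the one above
```

Anything that referenced the old id (child rows, cached keys in the application) no longer points at this row.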

correction: both REPLACE and INSERT...ON DUPLICATE KEY UPDATE are non-standard, proprietary inventions specific to MySQL. ANSI SQL 2003 defines a MERGE statement that can solve the same need (and more), but MySQL does not support the MERGE statement.
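
For comparison, an insert-if-absent written with the standard MERGE statement might look something like the sketch below. This will not run on MySQL, the exact syntax accepted varies between database products, and the table and column names are hypothetical.

```sql
-- Standard-SQL style sketch; NOT supported by MySQL.
-- Assumes a hypothetical table users(id, email).
MERGE INTO users AS t
USING (VALUES (1, 'ann@example.com')) AS s (id, email)
  ON t.id = s.id
WHEN NOT MATCHED THEN
  INSERT (id, email) VALUES (s.id, s.email);
```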

Bill Karwin
Nice heads up on the side effects! Thanks. I don't mind that the duplicate rows won't get inserted. That's actually what I want. What else is ignored by INSERT IGNORE though?
ThomasGHenry
INSERT IGNORE, I believe, ignores errors.
NTulip
It turns errors into warnings. I added some details above, for cases of errors that are ignored when you use INSERT IGNORE. There might be other cases too.
Bill Karwin
How do I add this as my favorite?
Jayapal Chandran
@Jayapal: You can vote for my answer by clicking the gray up arrow above my current score. You can choose this whole page as a favorite by clicking the five-pointed star below the question's score.
Bill Karwin
OK. I have already given it a score. I thought I could add your answer as a favorite. As you mentioned, I have added the question as a favorite.
Jayapal Chandran
Right, SO doesn't have a feature for choosing an answer as a favorite, only a question. The only other thing I can think of is that you can select the `link` permalink below my answer, and bookmark it in your browser.
Bill Karwin
+2  A: 

I routinely use INSERT IGNORE, and it sounds like exactly the kind of behavior you're looking for as well. As long as you know that rows which would cause index conflicts will not be inserted and you plan your program accordingly, it shouldn't cause any trouble.

David Zaslavsky
I'm concerned that I'll ignore errors other than duplication. Is this correct, or does INSERT IGNORE only ignore the duplication failure? Thanks!
ThomasGHenry
It turns any error into a warning. See a list of such cases in my answer.
Bill Karwin
+1  A: 

ON DUPLICATE KEY UPDATE is not really in the standard. It's about as standard as REPLACE is. See SQL MERGE.

Essentially both commands are alternative-syntax versions of standard commands.

Chris KL
A: 

I know this is old, but I'll add this note in case anyone else (like me) arrives at this page while trying to find information on INSERT..IGNORE.

As mentioned above, if you use INSERT..IGNORE, errors that occur while executing the INSERT statement are treated as warnings instead.

One thing which is not explicitly mentioned is that INSERT..IGNORE will cause invalid values to be adjusted to the closest valid values when inserted (whereas invalid values would cause the query to abort if the IGNORE keyword was not used).
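
A small sketch of that adjustment, assuming the server runs in strict SQL mode and using a made-up table `readings`. A signed TINYINT holds values from -128 to 127, so 300 is out of range.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE readings (
  id    INT PRIMARY KEY,
  level TINYINT NOT NULL
);

-- Without IGNORE (strict mode assumed): the statement aborts with an
-- out-of-range error and no row is inserted.
INSERT INTO readings (id, level) VALUES (1, 300);

-- With IGNORE: the value is clipped to the nearest valid value (127),
-- the row is inserted, and a warning is raised instead of an error.
INSERT IGNORE INTO readings (id, level) VALUES (1, 300);

SELECT id, level FROM readings;
SHOW WARNINGS;
```

So with IGNORE you can end up with stored data that differs from what you sent, which is worth knowing before relying on it just to skip duplicates.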

Chris