views:

7180

answers:

5

I started by googling, and found this article which talks about mutex tables.

I have a table with ~14 million records. If I want to add more data in the same format, is there a way to ensure the record I want to insert does not already exist without using a pair of queries (ie, one query to check and one to insert is the result set is empty)?

Does a unique constraint on a field guarantee the insert will fail if it's already there?

It seems that with merely a constraint, when I issue the insert via php, the script croaks.

+3  A: 

Any simple constraint should do the job, if an exception is acceptable. Examples :

  • primary key if not surrogate
  • unique constraint on a column
  • multi-column unique constraint

Sorry is this seems deceptively simple. I know it looks bad confronted to the link you share with us. ;-(

But I neverleless give this answer, because it seem to fill your need. (If not, it may trigger your updating your requirements, which would be "a Good Thing"(TM) also).

Edited: If an insert would break the database unique constraint, an exception is throw at the database level, relayed by the driver. It will certainly stop your script, with a failure. It must be possible in PHP to adress that case ...

KLE
i added a clarification to the question - does your answer still apply?
warren
I believe it does. A unique constraint will cause the failure of incorrect inserts. Note : you have to deal with this failure in your code, but this is quite standard.
KLE
for now I'm going to stick with the solution I accepted - but will further look into handling INSERT failures etc as the app grows
warren
+11  A: 

use INSERT IGNORE INTO table

see http://bogdan.org.ua/2007/10/18/mysql-insert-if-not-exists-syntax.html

there's also INSERT … ON DUPLICATE KEY UPDATE syntax, you can find explanations on dev.mysql.com

knittl
that looks *exactly* like what i wanted to find
warren
and can I combine that with "delayed" to speed the script up?
warren
yes, insert delayed might speed up things for you. try it out
knittl
update - I did an `insert ignore delay` and the php script finishes *much* faster by offloading all of the work onto mysql; for my current purposes, this is perfect; thanks
warren
great, glad to help
knittl
Exactly what I was looking for. +1. Landed here from Google.
Amarghosh
+1  A: 

Replace might work for you.

nikc
thanks for the idea - i'd prefer to not replace 14 million records (or how many it grows to) when adding new data sources :)
warren
+2  A: 

on duplicate key update, or insert ignore can be viable solutions with MySQL.

Zed
+1  A: 

Also i'm using

INSERT INTO table (value1,value2) SELECT 'stuff for value1','stuff for value2' WHERE NOT EXISTS (SELECT * FROM table WHERE value1='stuff for value1' AND value2='stuff for value2') LIMIT 1

Server