NO MySQL answers please!

The basic query is as follows (assume A is the key):

INSERT INTO destination (A,B,C)
SELECT a1,b1,c1 
FROM source
WHERE (selectconditions) ;

Source contains many records that may or may not already be in destination, which means that the insert will fail as soon as a duplicate record is encountered.

Desired Behaviour: INSERT or IGNORE

This is the desired scenario for the given problem. Insert if you can, otherwise continue.

Pseudo c#/java:

foreach (record in selectQuery)
{
   try { destination.insert(record); }
   catch (InsertionException) { /* squelch */ }
}
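As a rough, runnable version of the pseudocode above, here is a sketch in Python using SQLite purely as a stand-in database. The function name insert_ignoring_duplicates and the sample rows are made up for illustration; the table layout is assumed from the query in the question.

```python
import sqlite3

def insert_ignoring_duplicates(conn, rows):
    """Insert each (a, b, c) row one at a time, skipping rows that violate a constraint."""
    inserted = 0
    for row in rows:
        try:
            conn.execute("INSERT INTO destination (A, B, C) VALUES (?, ?, ?)", row)
            inserted += 1
        except sqlite3.IntegrityError:
            # Duplicate key (or any other constraint violation): squelch and move on.
            pass
    return inserted

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE destination (A INTEGER PRIMARY KEY, B TEXT, C TEXT)")
conn.execute("INSERT INTO destination VALUES (1, 'old', 'old')")

# Key 1 already exists, so only keys 2 and 3 should be inserted.
count = insert_ignoring_duplicates(conn, [(1, "x", "y"), (2, "p", "q"), (3, "r", "s")])
existing = conn.execute("SELECT B FROM destination WHERE A = 1").fetchone()[0]
```

This works, but it is one round trip per row and it swallows every integrity error, not just duplicate keys, which is part of why a set-based statement is usually preferable.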

This can be handled in SQL by adding

AND NOT EXISTS (SELECT 1 FROM destination WHERE destination.A = source.a1)

to the end of the query -- In other words, check before you insert.
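A minimal sketch of the check-before-insert approach, again using SQLite from Python only as a convenient test bed (the sample data is invented). The subquery has to be correlated with the outer source row; a subquery that joins destination to source on its own would return rows as soon as any one match exists and would block every insert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE destination (A INTEGER PRIMARY KEY, B TEXT, C TEXT);
    CREATE TABLE source (a1 INTEGER, b1 TEXT, c1 TEXT);
    INSERT INTO destination VALUES (1, 'old', 'old');
    INSERT INTO source VALUES (1, 'new', 'new'), (2, 'p', 'q'), (3, 'r', 's');
""")

# Insert only the source rows whose key is not already in destination.
conn.execute("""
    INSERT INTO destination (A, B, C)
    SELECT a1, b1, c1
    FROM source
    WHERE NOT EXISTS (
        SELECT 1 FROM destination WHERE destination.A = source.a1
    )
""")

rows = conn.execute("SELECT A, B, C FROM destination ORDER BY A").fetchall()
```

The existing row with key 1 is left untouched and the two new keys are added in a single statement.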

What are some other alternatives to handling this common situation? What are the pros and cons of these techniques?

+1  A: 

If you share a common Primary Key:

INSERT INTO destination 
( A, B, C)
SELECT a1, b1, c1 FROM source
WHERE source.pk not in ( SELECT pk FROM destination );

If you don't:

INSERT INTO destination 
( A, B, C)
SELECT a1, b1, c1 FROM source
WHERE a1 + b1 + c1 not in ( SELECT a+b+c FROM destination );
Nissan Fan
Comparing a1+b1+c1 to a+b+c is almost certainly a bad idea. What if the values were: a1=1, b1=2, c1=3, a=2, b=3, c=1? Both sums are 6, so a genuinely new row would be skipped. Not to mention the performance cost: adding or concatenating these columns will likely prevent the use of indexes.
Tom H.
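To make the collision concrete, here is a small sketch using the exact values from the comment above, run against SQLite from Python as a stand-in database. The column-by-column comparison is shown next to it as an assumed alternative, not something taken from the answer being commented on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE destination (A INTEGER, B INTEGER, C INTEGER);
    CREATE TABLE source (a1 INTEGER, b1 INTEGER, c1 INTEGER);
    INSERT INTO destination VALUES (2, 3, 1);
    INSERT INTO source VALUES (1, 2, 3);
""")

# Sum-based check: 1+2+3 equals 2+3+1, so the new row is wrongly treated as a duplicate.
by_sum = conn.execute("""
    SELECT a1, b1, c1 FROM source
    WHERE a1 + b1 + c1 NOT IN (SELECT A + B + C FROM destination)
""").fetchall()

# Column-by-column check: the row is correctly recognised as new.
by_columns = conn.execute("""
    SELECT a1, b1, c1 FROM source s
    WHERE NOT EXISTS (
        SELECT 1 FROM destination d
        WHERE d.A = s.a1 AND d.B = s.b1 AND d.C = s.c1
    )
""").fetchall()
```

Here by_sum comes back empty while by_columns contains the row (1, 2, 3).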
+1  A: 

Some databases provide an explicit syntax for operations that involve a conditional insert/update/ignore.

Oracle and SQL Server, for example, have the MERGE statement, which can insert/update/delete/ignore a record based on a set of predicates.

Ignoring database-specific syntax, you can perform the insert using a predicate that excludes records that already exist:

INSERT INTO target (A, B, C)
SELECT SA, SB, SC FROM source
WHERE NOT EXISTS (SELECT 1 FROM target WHERE A = SA AND B = SB AND C = SC);
LBushkin
A: 

I would probably do the following:

INSERT INTO Target (A, B, C)
SELECT
     S.A, S.B, S.C
FROM
     Source S
LEFT OUTER JOIN Target T ON
     T.A = S.A AND
     T.B = S.B AND
     T.C = S.C
WHERE
     T.A IS NULL
Tom H.
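For anyone who wants to try the LEFT OUTER JOIN form, here is a quick sketch run against SQLite from Python. SQLite is just a stand-in; the sample rows and the UNIQUE constraint on Target are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Target (A INTEGER, B TEXT, C TEXT, UNIQUE (A, B, C));
    CREATE TABLE Source (A INTEGER, B TEXT, C TEXT);
    INSERT INTO Target VALUES (1, 'x', 'y');
    INSERT INTO Source VALUES (1, 'x', 'y'), (2, 'p', 'q');
""")

# Anti-join: keep only Source rows that found no matching Target row.
conn.execute("""
    INSERT INTO Target (A, B, C)
    SELECT S.A, S.B, S.C
    FROM Source S
    LEFT OUTER JOIN Target T
        ON T.A = S.A AND T.B = S.B AND T.C = S.C
    WHERE T.A IS NULL
""")

rows = conn.execute("SELECT A, B, C FROM Target ORDER BY A").fetchall()
```

Note that this only checks Source against Target. If Source itself contains the same row twice, both copies pass the check, so a DISTINCT (or GROUP BY) on the SELECT would still be needed.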
A: 

If you're using MySQL and have the luxury of enforcing non-duplicate keys with a UNIQUE index, you can use INSERT ... ON DUPLICATE KEY UPDATE with a no-op update for duplicates.

INSERT INTO Target (A, B, C) (SELECT a1, b1, c1 FROM Source) ON DUPLICATE KEY UPDATE A=A;

This has the advantage of being very fast and not requiring an extra SELECT.

Duncan Beevers
MySQL has some nice features in its non-standard SQL. It would be nice if other vendors copied them.
chris