views:

57

answers:

2

I know of two ways to insert without duplication. The first is using a WHERE NOT EXISTS clause:

INSERT INTO table_name (col1, col2, col3)
SELECT %s, %s, %s
WHERE NOT EXISTS (
    SELECT * FROM table_name AS T
    WHERE T.col1 = %s
      AND T.col2 = %s)

the other is doing a LEFT JOIN:

INSERT INTO table_name (col1, col2, col3)
SELECT %s, %s, %s
FROM ( SELECT %s, %s, %s ) A
LEFT JOIN table_name B
ON  B.COL1 = %s
AND B.COL2 = %s
WHERE B.id IS NULL
LIMIT 1

Is there a general rule as to one being faster than the other, or does it depend on the tables? Is there a different way which is better than both?

+3  A: 

I would recommend defining a UNIQUE constraint on the column(s) you need to be unique (col1 & col2 in this case), and then just do the INSERT. Handle exceptions as needed.


Re your comment about the exception demanding a rollback, the solution for PostgreSQL is to set a transaction savepoint before you try the insert that may cause an exception. If you get the exception, rollback to the savepoint.

See:

Bill Karwin
i have the unique constraints in place already. however, when they're violated, i have to either commit or rollback the entire transaction so far. each transaction of mine requires multiple inserts, though, and i don't want to partially commit some of them. is there any way to handle exceptions in a better way, that doesn't mess up the cursor/transaction? (doing this from Python, with sqlobject or psycopg2)
Claudiu
Yes: *catch the exception*. If it's a duplicate key violation, then you can ignore it because it means your row already exists. If it's another type of exception, you would have had that problem anyway (e.g. SQL syntax error, disk full, network connection broken).
Bill Karwin
if i catch the exception, then attempt to execute another query with the same cursor, i get a `psycopg2.InternalError: current transaction aborted; ignoring actions until end of transaction block`, or something of the sort
Claudiu
Here is the exact error message: `psycopg2.InternalError: current transaction is aborted, commands ignored until end of transaction block`. what can i do to prevent this state after failing to insert a table?
Claudiu
i've tried to ask the question more eloquently here: http://stackoverflow.com/questions/3120688/sql-continue-executing-queries-after-duplicate-key-violation
Claudiu
A: 

I think using EXISTS is more efficient!You could do like this:

if exists(select 1 from table_name where col1 = %s and col2 = %s) then
  insert into table_name (col1, col2, col3)
  select %s, %s, %s;
end if;

under test,using EXISTS is about 50 times faster then using NOT EXISTS.

another method is using EXCEPT .

INSERT INTO table_name (col1, col2, col3)
SELECT %s, %s, %s
except
select col1, col2, col3 from table_name

under test,using EXCEPT is about 3 times faster then using NOT EXISTS.

tinychen