I have a situation where I want to insert a row if it doesn't exist, and not insert it if it already does. I tried writing SQL queries that prevented this from happening (see here), but I was told a solution is to create constraints and catch the exception when they're violated.

I have constraints in place already. My question is - how can I catch the exception and continue executing more queries? If my code looks like this:

cur = transaction.cursor()

#execute some queries that succeed

try:
    cur.execute(fooquery, bardata)  #this query might fail, but that's OK
except psycopg2.IntegrityError:
    pass

cur.execute(fooquery2, bardata2)

Then I get an error on the second execute:

psycopg2.InternalError: current transaction is aborted, commands ignored until end of transaction block

How can I tell the computer that I want it to keep executing queries? I don't want to transaction.commit(), because I might want to roll back the entire transaction (the queries that succeeded before).

+3  A: 

I think what you could do is use a SAVEPOINT before trying to execute the statement which could cause the violation. If the violation happens, then you could roll back to the SAVEPOINT, but keep your original transaction.

Here's another thread which may be helpful: http://stackoverflow.com/questions/2370328/continuing-a-transaction-after-primary-key-violation-error
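To make that concrete, here's a minimal sketch of the savepoint pattern. The helper name and the `error_cls` parameter are my own invention for illustration; in real use `cur` would be a psycopg2 cursor and `error_cls` would be `psycopg2.IntegrityError`:

```python
def execute_with_savepoint(cur, query, params, error_cls=Exception,
                           name="before_insert"):
    """Run `query`; on `error_cls`, roll back only the failed statement.

    Sketch only: `cur` is assumed to be a psycopg2 cursor, and in
    practice you would pass error_cls=psycopg2.IntegrityError.
    """
    cur.execute("SAVEPOINT " + name)
    try:
        cur.execute(query, params)
    except error_cls:
        # Undo just the failed INSERT; the outer transaction survives,
        # so later queries keep working and commit/rollback stays yours.
        cur.execute("ROLLBACK TO SAVEPOINT " + name)
    else:
        # Success: discard the savepoint but keep its effects.
        cur.execute("RELEASE SAVEPOINT " + name)
```

With this in place, your original code becomes `execute_with_savepoint(cur, fooquery, bardata, psycopg2.IntegrityError)` followed by `cur.execute(fooquery2, bardata2)`, and the second execute no longer sees an aborted transaction.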

dcp
+1  A: 

I gave an up-vote to the SAVEPOINT answer--especially since it links to a question where my answer was accepted. ;)

However, given your statement in the comments section that you expect errors "more often than not," may I suggest another alternative?

This solution actually harkens back to your other question. The difference here is that the data is bulk-loaded very quickly into the right place and format, so it can be moved with a single INSERT ... SELECT, and the approach is generic for any table you want to populate (so the same code could be used for multiple different tables). Here's a rough layout of how I would do it in pure PostgreSQL, assuming I had a CSV file in the same format as the table to be inserted into:

CREATE TEMP TABLE input_file (LIKE target_table);

COPY input_file FROM '/path/to/file.csv' WITH CSV;

INSERT INTO target_table
SELECT * FROM input_file
WHERE (<unique key field list>) NOT IN (
    SELECT <unique key field list>
    FROM target_table
);

Okay, this is an idealized example and I'm also glossing over several things (like reporting back the duplicates, pushing the data into the table from Python in-memory data, COPY FROM STDIN rather than from a file, etc.), but hopefully the basic idea is there, and it will avoid much of the overhead if you expect more records to be rejected than accepted.
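For the "from Python, via STDIN" variant, a rough sketch might look like the following. The function names are hypothetical; `conn` is assumed to be a psycopg2 connection, and `copy_expert` is psycopg2's way of streaming a file-like object through COPY FROM STDIN:

```python
def dedup_insert_sql(target, key_fields, source="input_file"):
    """Build the INSERT ... WHERE keys NOT IN (...) statement shown above.

    Note: this interpolates identifiers directly, which is only safe
    when table/column names come from your own code, never from users.
    """
    keys = ", ".join(key_fields)
    return ("INSERT INTO %s SELECT * FROM %s WHERE (%s) NOT IN "
            "(SELECT %s FROM %s)" % (target, source, keys, keys, target))

def bulk_load(conn, target, key_fields, csv_buffer):
    """Sketch: temp table + COPY FROM STDIN + de-duplicating insert."""
    cur = conn.cursor()
    cur.execute("CREATE TEMP TABLE input_file (LIKE %s)" % target)
    # csv_buffer is any file-like object, e.g. io.StringIO of CSV rows.
    cur.copy_expert("COPY input_file FROM STDIN WITH CSV", csv_buffer)
    cur.execute(dedup_insert_sql(target, key_fields))
```

Everything still happens inside your open transaction, so you keep the option to commit or roll back the whole batch at the end.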

Matthew Wood
hmm interesting.. COPY might be what I'm looking for
Claudiu