Hi, I am importing a CSV file into Postgres.

copy product from '/tmp/a.csv' DELIMITERS ',' CSV;
ERROR:  duplicate key value violates unique constraint "product_pkey"
CONTEXT:  COPY product, line 13: "1,abcd,100 pack"

What is the best way to avoid this error? Would I have to write a Python script to handle it?

+1  A: 

Well, the best way would be to filter the data so that it contains no duplicates before loading it. It's usually pretty easy, and doesn't require a lot of programming.

For example:

Assuming the first column of your data holds the primary key and the file is not very large (let's say less than 60% of your RAM, since awk keeps every distinct key in memory), you could run:

awk -F, '(!X[$1]) {X[$1]=1; print $0}' /tmp/a.csv > /tmp/b.csv

and load /tmp/b.csv instead.
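
To see what the filter does, here is a minimal sketch that runs the same awk command on a small made-up file. The sample rows and the temporary directory are assumptions for illustration only, not the real contents of /tmp/a.csv.

```shell
# Hypothetical sample data; the first column plays the role of the primary key.
tmpdir=$(mktemp -d)
printf '%s\n' '1,abcd,100 pack' '2,efgh,50 pack' '1,abcd,100 pack' '3,ijkl,10 pack' > "$tmpdir/a.csv"

# X remembers every key already printed, so only the first row for each key
# gets through; later rows with the same key are silently dropped.
awk -F, '(!X[$1]) {X[$1]=1; print $0}' "$tmpdir/a.csv" > "$tmpdir/b.csv"

cat "$tmpdir/b.csv"
```

The second "1,abcd,100 pack" row is dropped and the remaining rows keep their original order. Note that splitting on a plain comma assumes the key column itself contains no quoted commas.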

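If the file is larger, then I would suggest sorting it first, so that rows with the same key end up next to each other, and then printing a row only when its key differs from the previous one. Sorting on the first field only (-t, -k1,1) keeps the grouping independent of the locale's collation rules:

sort -t, -k1,1 /tmp/a.csv | awk -F, 'BEGIN {P="\n"} ($1 != P) {print $0; P=$1}' > /tmp/b.csv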
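
As a rough sketch of the sort-based variant, the snippet below runs it on a small made-up file and then checks that no key is left more than once. The sample rows, the temporary directory, and the final uniq check are assumptions added for illustration. The sort here is restricted to the first field with -t, -k1,1, so that rows sharing a key are adjacent whatever the locale.

```shell
# Hypothetical sample data, deliberately unsorted and with key 1 repeated.
tmpdir2=$(mktemp -d)
printf '%s\n' '3,ijkl,10 pack' '1,abcd,100 pack' '2,efgh,50 pack' '1,abcd,100 pack' > "$tmpdir2/a.csv"

# Sort on the first comma-separated field so rows sharing a key become adjacent,
# then print a row only when its key differs from the previous row's key.
# P starts as "\n", a value no key can have, so the first row is always printed.
sort -t, -k1,1 "$tmpdir2/a.csv" | awk -F, 'BEGIN {P="\n"} ($1 != P) {print $0; P=$1}' > "$tmpdir2/b.csv"

# List any key that still occurs more than once; no output means none are left.
dups=$(cut -d, -f1 "$tmpdir2/b.csv" | sort | uniq -d)
echo "duplicate keys left: ${dups:-none}"
```

Unlike the in-memory version, this one changes the row order, and which of the duplicate rows survives depends on how sort orders them.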
depesz