ansaurus

Question

Enforcing uniqueness on PostgreSQL table column after non-unique values already inserted

Answer 1

+4 A:

The query you're looking for is:

select distinct on (my_unique_1, my_unique_2) * from my_table;

This returns one row for each distinct combination of the columns listed in distinct on, namely the first row of each group. It's rarely used without order by, because without one there is no reliable order in which rows are returned, and so no way to tell which row counts as the first.

Combined with order by, you choose which row comes first in each group. The order by must start with the distinct on columns; the remaining sort keys decide the winner (here, the row with the greatest last_update_date is kept):

 select distinct on (my_unique_1, my_unique_2) * 
 from my_table order by my_unique_1, my_unique_2, last_update_date desc;
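To see which rows survive, here is a minimal sketch on throwaway data. The temporary table demo_rows and its values are made up for illustration; only the column names come from the queries above.

```sql
-- Hypothetical sample data; demo_rows is not part of the original question.
create temp table demo_rows (
    row_id           integer primary key,
    my_unique_1      integer,
    my_unique_2      integer,
    last_update_date date
);

insert into demo_rows values
    (1, 10, 20, '2010-07-01'),
    (2, 10, 20, '2010-07-15'),  -- newest row for the pair (10, 20)
    (3, 10, 30, '2010-07-10'),  -- only row for the pair (10, 30)
    (4, 11, 20, '2010-07-05'),  -- newest row for the pair (11, 20)
    (5, 11, 20, '2010-07-02');

-- One row per (my_unique_1, my_unique_2); the descending date picks the newest.
select distinct on (my_unique_1, my_unique_2) *
from demo_rows
order by my_unique_1, my_unique_2, last_update_date desc;
-- Returns the rows with row_id 2, 3 and 4, in that order.
```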

Now you can select this into a new table:

 create table my_new_table as
 select distinct on (my_unique_1, my_unique_2) * 
 from my_table order by my_unique_1, my_unique_2, last_update_date desc;

Or you can use it to delete the duplicates in place, assuming row_id is the primary key:

 delete from my_table where row_id not in (
     select distinct on (my_unique_1, my_unique_2) row_id 
     from my_table order by my_unique_1, my_unique_2, last_update_date desc);
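
Once the duplicates are gone, what remains is to confirm that none are left and to have PostgreSQL enforce uniqueness from then on. A minimal sketch, assuming the same table and column names as above; the constraint name my_table_unique_pair is made up for illustration:

```sql
-- Should return no rows once the duplicates have been removed.
select my_unique_1, my_unique_2, count(*)
from my_table
group by my_unique_1, my_unique_2
having count(*) > 1;

-- Enforce joint uniqueness going forward; this fails if duplicates still exist.
-- The constraint name is hypothetical.
alter table my_table
    add constraint my_table_unique_pair unique (my_unique_1, my_unique_2);
```

If the columns can contain nulls, keep in mind that a unique constraint by default treats nulls as distinct from each other, while distinct on and group by treat them as equal.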

Konrad Garus 2010-07-21 06:29:21

+1 DISTINCT ON is a very handy PostgreSQL feature

leonbloy 2010-07-21 14:04:18

About "the first row": Without an ORDER BY, there is no way to tell which row will come back first, so the "first row" is a misleading term as you may not always get the same result. A DISTINCT ON is pretty much useless without an ORDER BY clause.

Matthew Wood 2010-07-21 14:34:20

Thanks, updated to make this more explicit.

Konrad Garus 2010-07-21 14:48:16

I read about distinct, and I tried to use it with `limit 1000` as well, just to check the output. It was taking forever, but I assume that's because I had to remove indexes temporarily to insert more data quickly. Thanks for the clear example, but I'm confused about the `my_unique` columns after `distinct on`. The docs say that those should be expressions, so does including the columns as expressions just make sure they are present in the record? I ask because I actually need to make sure those columns are not just present, but jointly unique.

ehsanul 2010-07-21 22:16:04
