ansaurus

Question

Answer 1

+1 A:

Tony Andrews 2008-10-28 14:52:45

This will only work if the OID column is enable by default in the table creation, which is not the case anymore with the newer versions of PostgreSQL.

gizmo 2008-10-28 14:57:32

I do not have oid field. I just try to select it and it doesn'T work.

Daok 2008-10-28 14:57:54

Answer 2

A:

One possible answer is:

CREATE <temporary table> (<correct structure for table being cleaned>);
BEGIN WORK;   -- if needed
INSERT INTO <temporary table> SELECT DISTINCT * FROM <source table>;
DELETE FROM <source table>
INSERT INTO <source table> SELECT * FROM <temporary table>;
COMMIT WORK;  -- needed
DROP <temporary table>;

I'm not sure whether the 'work' is needed on transaction statements, nor whether the explicit BEGIN is necessary in PostgreSQL. But the concept applies to any DBMS.

The only thing to beware of is referential constraints and in particular triggered delete operations. If those exist, this may prove less satisfactory.

Jonathan Leffler 2008-10-28 14:58:14

Answer 3

+4 A:

This is one of many reasons that all tables should have a primary key (not necessarily an ID number or IDENTITY, but a combination of one or more columns that uniquely identifies a row and which has its uniqueness enforced in the database).

Your best bet is something like this:

SELECT field1, field2, field3, count(*) 
INTO temp_table1
FROM table1
GROUP BY field1, field2, field3 having count(*) > 1

DELETE T1
FROM table1 T1
INNER JOIN (SELECT field1, field2, field3
      FROM table1
      GROUP BY field1, field2, field3 having count(*) > 1) SQ ON
            SQ.field1 = T1.field1 AND
            SQ.field2 = T1.field2 AND
            SQ.field3 = T1.field3

INSERT INTO table1 (field1, field2, field3)
SELECT field1, field2, field3
FROM temp_table1

DROP TABLE temp_table1

Tom H. 2008-10-28 14:59:14

I am currently trying your suggestion. +1

Daok 2008-10-28 15:25:09

It works thx, I have modified few thing like DELETE T1 doesnt work, I had to put the T1 after FROM table1 as T1. Few things like that. But the solution was fast and successful here.

Daok 2008-10-28 16:33:14

I'm glad that it worked for you. Thanks for pointing out my overlooking the T1 alias. I've corrected it in the script.

Tom H. 2008-10-28 19:46:27

Answer 4

A:

This will use the OID Object ID (if the table was created with it):

DELETE FROM table1
WHERE OID NOT IN (SELECT   MIN (OID)
                              FROM table1
                          GROUP BY field1, field2, field3)

Brian Schmitt 2008-10-28 15:00:57

Answer 5

A:

Well I should misunderstand something but I'll say :

SELECT DISTINCT field1, field2, field3 FROM table1

Too easy to be good? ^^

2008-10-28 15:04:28

I think Daok wants to delete all but one of the duplicate entries.

Vijay Dev 2008-10-28 15:08:37

Answer 6

A:

Using TSQL, no idea if Postgre supports temp tables but you could select into a temp table, and then loop through and delete and insert your results back into the original

-- **Disclaimer** using TSQL
-- You could select your records into a temp table with a pk
Create Table #dupes
([id] int not null identity(1,1), f1 int, f2 int, f3 int)

Insert Into #dupes (f1,f2,f3) values (1,2,3)
Insert Into #dupes (f1,f2,f3) values (1,2,3)
Insert Into #dupes (f1,f2,f3) values (1,2,3)
Insert Into #dupes (f1,f2,f3) values (2,3,4)
Insert Into #dupes (f1,f2,f3) values (4,5,6)
Insert Into #dupes (f1,f2,f3) values (4,5,6)
Insert Into #dupes (f1,f2,f3) values (4,5,6)
Insert Into #dupes (f1,f2,f3) values (7,8,9)

Select f1,f2,f3 From #dupes

Declare @rowCount int
Declare @counter int
Set @counter = 1
Set @rowCount = (Select Count([id]) from #dupes)

while (@counter < @rowCount + 1)
    Begin
       Delete From #dupes
       Where [Id] <> 
      (Select [id] From #dupes where [id]=@counter)
       and
      (
       [f1] = (Select [f1] from #dupes where [id]=@counter)
       and
       [f2] = (Select [f2] from #dupes where [id]=@counter)
       and
       [f3] = (Select [f3] from #dupes where [id]=@counter)
      )
       Set @counter = @counter + 1
    End

Select f1,f2,f3 From #dupes -- You could take these results and pump them back into --your original table

Drop Table #dupes

Tested this on MS SQL Server 2000. Not familiar with Postgre's options but maybe this will lead you in a right direction.

RandomNoob 2008-10-28 15:05:45

Answer 7

A:

This is the simplest method I've found:

Postgre SQL syntax:

CREATE TABLE tmp AS SELECT distinct * FROM table1
truncate table table1
insert into table1 select * from tmp
drop table tmp

T-SQL syntax:

select distinct * into #tmp from table1
truncate table table1
insert into table1 select * from #tmp
drop table #tmp

Aaron Palmer 2008-10-28 15:09:07

Truncate will not work if there are foreign key constraints

RandomNoob 2008-10-28 15:38:08

right, drop and re-add constraints after the truncate

Aaron Palmer 2008-10-28 15:46:28

Answer 8

A:

A good Answer for this problem, but for SQL Server. It uses the ROWCOUNT that SQL Server offers, to good effect. I have never used Postgre SQL and hence don't know the equivalent of ROWCOUNT in Postgre SQL.

Vijay Dev 2008-10-28 15:23:19

ansaurus

tags:

views:

answers:

Remove duplicate from a table

related questions