ansaurus

Question

SQL - Only select row that is not duplicated

Answer 1

+1 A:

What is your database? In Oracle you could say

SELECT * FROM your_table
WHERE rowid in
(SELECT MIN(rowid)
 FROM your_table
 GROUP BY MyField1, MyField2, MyField3);

Note that it is somewhat uncertain which of the rows with the same PK will be considered "first". If you need to impose a specific order, you need to additionally sort on the other columns.
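As a rough illustration of the MIN(rowid) idea, here is a minimal sketch run against SQLite through Python's sqlite3 module. SQLite is a stand-in chosen only because it also exposes an implicit rowid; the sample rows and the extra MyField4 column are made up for the demo, and Oracle's rowid behaves differently in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table "
    "(MyField1 INT, MyField2 TEXT, MyField3 TEXT, MyField4 TEXT)"
)
# Hypothetical sample data: the first two rows share the same key columns.
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        (1, "a", "x", "first copy"),
        (1, "a", "x", "second copy"),
        (2, "b", "y", "only copy"),
    ],
)

# Keep the row with the smallest rowid in each
# (MyField1, MyField2, MyField3) group.
kept = conn.execute(
    """
    SELECT MyField1, MyField2, MyField3, MyField4
    FROM your_table
    WHERE rowid IN (
        SELECT MIN(rowid)
        FROM your_table
        GROUP BY MyField1, MyField2, MyField3
    )
    ORDER BY MyField1
    """
).fetchall()
print(kept)
```

In this sketch the surviving duplicate is simply the one that was inserted first, which is the kind of incidental ordering the note above warns about.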

Thilo 2009-02-10 00:46:11

Would this run the nested select statement once for each row in your_table? If so, you'd get pretty bad performance. Hopefully the nested statement would be cached. I'm not that familiar with the query planning part of it.

Bassam 2009-02-10 01:20:46

I use MS SQL 2005, but I think this syntax will work. I'll try it tomorrow and let you know. Thx!

Melursus 2009-02-10 01:22:20

It would not be run for each row, just once.

Thilo 2009-02-10 01:30:59

There is no rowid field in MSSQL.

Jk 2009-02-10 09:07:14
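Since rowid is not available there, one possible alternative is ROW_NUMBER() with PARTITION BY, which SQL Server supports from version 2005 onward. The sketch below is not from the thread: it runs the query on SQLite (3.25 or newer, for window functions) via Python only so that it can be executed as is, and the sample data and the choice of MyField4 as the tie-breaker are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table "
    "(MyField1 INT, MyField2 TEXT, MyField3 TEXT, MyField4 INT)"
)
# Hypothetical sample data: two rows share the key (1, 'a', 'x').
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [(1, "a", "x", 20), (1, "a", "x", 10), (2, "b", "y", 30)],
)

# Number the rows inside each key group, lowest MyField4 first,
# then keep only row number 1 of every group.
dedup_sql = """
SELECT MyField1, MyField2, MyField3, MyField4
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY MyField1, MyField2, MyField3
               ORDER BY MyField4
           ) AS rn
    FROM your_table AS t
) AS numbered
WHERE rn = 1
ORDER BY MyField1
"""
rows = conn.execute(dedup_sql).fetchall()
print(rows)
```

The ORDER BY inside OVER is what makes the choice of "first" explicit, so swap in whichever columns define the row you want to keep.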

Answer 2

+1 A:

Not sure how you know which of row 2 and row 3 you want in the new table, but in MySQL you can simply run:

insert ignore into new_table (select * from old_table);

And the PK won't allow duplicate entries to be inserted.
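To see the effect without a MySQL server, here is a minimal sketch using SQLite through Python. Note the assumption: SQLite spells the statement INSERT OR IGNORE rather than MySQL's INSERT IGNORE, and the two-column tables and row labels are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE old_table (MyField1 INT, MyField2 TEXT)")
# The destination has a primary key, so a repeated MyField1 is rejected.
conn.execute(
    "CREATE TABLE new_table (MyField1 INT PRIMARY KEY, MyField2 TEXT)"
)
conn.executemany(
    "INSERT INTO old_table VALUES (?, ?)",
    [(1, "row 1"), (2, "row 2"), (2, "row 3"), (3, "row 4")],
)

# SQLite's counterpart of MySQL's INSERT IGNORE: rows that would violate
# the primary key are skipped instead of aborting the statement.
conn.execute(
    "INSERT OR IGNORE INTO new_table "
    "SELECT MyField1, MyField2 FROM old_table ORDER BY rowid"
)

copied = conn.execute(
    "SELECT MyField1, MyField2 FROM new_table ORDER BY MyField1"
).fetchall()
print(copied)
```

Whichever duplicate reaches the destination first is the one that stays, so the ordering of the SELECT decides between row 2 and row 3.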

Chris J 2009-02-10 00:52:35

Answer 3

+1 A:

First, you need to define what makes a row "first". I'll make up an arbitrary definition and you can change the SQL as you need to for what you want. For this example, I assume "first" to be the lowest value for MyField4 and if they are equal then the lowest value for MyField5. It also accounts for the possibility of all 5 columns being identical.

SELECT DISTINCT
     T1.MyField1,
     T1.MyField2,
     T1.MyField3,
     T1.MyField4,
     T1.MyField5
FROM
     MyTable T1
LEFT OUTER JOIN MyTable T2 ON
     T2.MyField1 = T1.MyField1 AND
     T2.MyField2 = T1.MyField2 AND
     T2.MyField3 = T1.MyField3 AND
     (
          T2.MyField4 < T1.MyField4 OR
          (
               T2.MyField4 = T1.MyField4 AND
               T2.MyField5 < T1.MyField5
          )
     )
WHERE
     T2.MyField1 IS NULL

If you also want to handle PKs that are not duplicated in the source table but already exist in your destination table, you'll need to filter those out too.
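A minimal runnable sketch of this self-join, taking "first" as the lowest MyField4 and then the lowest MyField5, and also skipping keys already present in the destination. It runs on SQLite via Python purely for convenience; DestTable, the sample rows, and the assumption that the PK is (MyField1, MyField2, MyField3) are illustrative and not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE MyTable (
        MyField1 INT, MyField2 TEXT, MyField3 TEXT,
        MyField4 INT, MyField5 INT
    );
    -- Hypothetical destination that already holds the key (2, 'b', 'y').
    CREATE TABLE DestTable (MyField1 INT, MyField2 TEXT, MyField3 TEXT);
    INSERT INTO DestTable VALUES (2, 'b', 'y');
    """
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)",
    [
        (1, "a", "x", 5, 1),
        (1, "a", "x", 3, 9),
        (1, "a", "x", 3, 2),
        (1, "a", "x", 3, 2),  # exact duplicate, collapsed by DISTINCT
        (2, "b", "y", 7, 7),  # key already exists in DestTable
        (3, "c", "z", 1, 1),
    ],
)

# A row survives when no other row with the same key sorts before it
# (lower MyField4, or equal MyField4 and lower MyField5) and its key
# is not already in the destination.
first_rows = conn.execute(
    """
    SELECT DISTINCT
         T1.MyField1, T1.MyField2, T1.MyField3, T1.MyField4, T1.MyField5
    FROM MyTable T1
    LEFT OUTER JOIN MyTable T2 ON
         T2.MyField1 = T1.MyField1 AND
         T2.MyField2 = T1.MyField2 AND
         T2.MyField3 = T1.MyField3 AND
         (
              T2.MyField4 < T1.MyField4 OR
              (T2.MyField4 = T1.MyField4 AND T2.MyField5 < T1.MyField5)
         )
    WHERE T2.MyField1 IS NULL
      AND NOT EXISTS (
           SELECT 1 FROM DestTable D
           WHERE D.MyField1 = T1.MyField1
             AND D.MyField2 = T1.MyField2
             AND D.MyField3 = T1.MyField3
      )
    ORDER BY T1.MyField1
    """
).fetchall()
print(first_rows)
```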

Tom H. 2009-02-10 01:22:15

Answer 4

A:

CREATE TABLE #A(
ID INTEGER IDENTITY,
[MyField1] [int] NULL,
[MyField2] [varchar](10) NULL,
[MyField3] [varchar](10) NULL,
[MyField4] [varchar](10) NULL,
[MyField5] [varchar](10) NULL
) 

INSERT INTO #A (MyField1,MyField2,MyField3,MyField4,MyField5) SELECT * FROM A

insert into B 
   select MyField1,MyField2,MyField3,MyField4,MyField5 from #A a1 
    where not exists (select id from #A a2 where a2.MyField1 = a1.MyField1 and a2.ID < a1.ID)

DROP TABLE #A

OR

insert into b
  select distinct * from a a1 
    where not exists (
  select a2.MyField1 from a a2 where a1.MyField1 = a2.MyField1 and 
       (a1.MyField2 < a2.MyField2 or a1.MyField3 < a2.MyField3 
        or a1.MyField4 < a2.MyField4 or a1.MyField5 < a2.MyField5))
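The first variant (copy into a staging table with a generated ID, then keep the lowest ID per key) can be tried out with the sketch below. It is an approximation on SQLite via Python, not T-SQL: AUTOINCREMENT stands in for IDENTITY, a TEMP table named staging stands in for #A, and the tables are cut down to two columns with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE A (MyField1 INT, MyField2 TEXT);
    CREATE TABLE B (MyField1 INT PRIMARY KEY, MyField2 TEXT);
    INSERT INTO A VALUES (1, 'row 1'), (2, 'row 2'), (2, 'row 3');

    -- Staging copy with a surrogate ID (plays the role of #A).
    CREATE TEMP TABLE staging (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        MyField1 INT,
        MyField2 TEXT
    );
    INSERT INTO staging (MyField1, MyField2)
        SELECT MyField1, MyField2 FROM A ORDER BY rowid;

    -- Copy a row only if no earlier staged row has the same MyField1.
    INSERT INTO B
        SELECT MyField1, MyField2 FROM staging a1
        WHERE NOT EXISTS (
            SELECT 1 FROM staging a2
            WHERE a2.MyField1 = a1.MyField1 AND a2.ID < a1.ID
        );

    DROP TABLE staging;
    """
)

result = conn.execute(
    "SELECT MyField1, MyField2 FROM B ORDER BY MyField1"
).fetchall()
print(result)
```

The generated ID gives every row a strict position, so exactly one row per MyField1 value passes the NOT EXISTS test.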

Jk 2009-02-10 09:04:37
