ansaurus

Question

Answer 1

+4 A:

You have 33 twice in Table B.

Use either SELECT DISTINCT or GROUP BY col_a, ... to collapse the duplicate rows:

SELECT DISTINCT * 
FROM    A 
JOIN    B ON ( A.col_a = B.col_a )
;

or

SELECT    A.col_a, col_b, col_c
FROM      A 
JOIN      B ON ( A.col_a = B.col_a )
GROUP BY  A.col_a, col_b, col_c
;

You should clean up that table, though. Depending on how many times a row is repeated, it might be faster to deduplicate B in a subquery before joining:

SELECT  * 
FROM    A 
JOIN    (select distinct * from B) AS C
        ON ( A.col_a = C.col_a )
;
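The three variants above can be tried end to end with a small sketch. It uses Python's built-in `sqlite3` module purely as a harness; the table contents and the column layout (`col_b` in A, `col_c` in B) are made up for illustration and are not the asker's real schema.

```python
import sqlite3

# Hypothetical sample data: the value 33 appears twice in B, as in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (col_a INTEGER, col_b TEXT);
    CREATE TABLE B (col_a INTEGER, col_c TEXT);
    INSERT INTO A VALUES (11, 'a1'), (33, 'a3');
    INSERT INTO B VALUES (11, 'b1'), (33, 'b3'), (33, 'b3');
""")

# Plain join: the duplicated row in B produces a duplicated result row.
plain = conn.execute(
    "SELECT * FROM A JOIN B ON (A.col_a = B.col_a)"
).fetchall()

# DISTINCT collapses identical result rows after the join.
distinct = conn.execute(
    "SELECT DISTINCT * FROM A JOIN B ON (A.col_a = B.col_a)"
).fetchall()

# GROUP BY over every selected column has the same effect.
grouped = conn.execute(
    "SELECT A.col_a, col_b, col_c FROM A JOIN B ON (A.col_a = B.col_a) "
    "GROUP BY A.col_a, col_b, col_c"
).fetchall()

# Deduplicating B in a subquery removes the duplicates before the join.
subquery = conn.execute(
    "SELECT * FROM A JOIN (SELECT DISTINCT * FROM B) AS C "
    "ON (A.col_a = C.col_a)"
).fetchall()

print(len(plain), len(distinct), len(grouped), len(subquery))
```

Which variant is fastest depends on the database and its query planner, so it is worth checking the plan on real data rather than relying on this toy example.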

vol7ron 2010-10-07 23:34:24

Notice the lack of `INNER`: `JOIN` by itself is an `INNER JOIN` by default.

vol7ron 2010-10-07 23:40:39
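A quick way to check the comment above is to run both spellings against the same data. This is a sketch using Python's `sqlite3` with made-up single-column tables; other databases treat a bare `JOIN` the same way, but only SQLite is exercised here.

```python
import sqlite3

# Hypothetical tables with partially overlapping keys.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (col_a INTEGER);
    CREATE TABLE B (col_a INTEGER);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (2), (3), (4);
""")

bare = conn.execute(
    "SELECT A.col_a FROM A JOIN B ON A.col_a = B.col_a ORDER BY A.col_a"
).fetchall()

explicit = conn.execute(
    "SELECT A.col_a FROM A INNER JOIN B ON A.col_a = B.col_a ORDER BY A.col_a"
).fetchall()

# Both queries return only the keys present in both tables.
print(bare == explicit)
```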

The subquery is not generally advised, and whether it helps depends on the database's query plan. But if you have **many** duplicate entries from both tables (for whatever awful reason), then the subquery would reduce the work involved.

vol7ron 2010-10-07 23:49:27

GROUP BY seems to do the trick (although there is some performance penalty)!

Theo Zographos 2010-10-08 00:01:28

`DISTINCT` and `GROUP BY` both come at a performance penalty compared with a plain query, due to the extra work needed to detect duplicates. That's one reason why it's important to store unique records only, something that can be enforced by adding a `unique constraint` or defining a `primary key`. How the two compare depends on the engine: some implement `DISTINCT` or `GROUP BY` with a sort, which adds a sorting penalty, while others use hashing, so check the query plan.

vol7ron 2010-10-08 04:50:56
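The unique-constraint suggestion in the comment above could look roughly like this. The sketch uses Python's `sqlite3`; the table layout and the choice of `(col_a, col_c)` as the unique key are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical definition of B: the pair (col_a, col_c) must be unique.
conn.execute(
    "CREATE TABLE B (col_a INTEGER, col_c TEXT, UNIQUE (col_a, col_c))"
)
conn.execute("INSERT INTO B VALUES (33, 'b3')")

# A second identical row violates the constraint and is rejected.
try:
    conn.execute("INSERT INTO B VALUES (33, 'b3')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM B").fetchone()[0]
print(rejected, row_count)
```

With the constraint in place the duplicate never reaches the table, so later joins need neither `DISTINCT` nor `GROUP BY`.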

Answer 2

+2 A:

The quick & dirty answer is:

select DISTINCT * from A inner join B on A.col_a = B.col_a

But the real question is, why do you have two identical entries in Table B?

Usually when you have to use DISTINCT, it indicates a problem in your data model.

Drew Hall 2010-10-07 23:34:59

Not all records are identical column to column, because there's an auto-increment field in tables A and B. It's basically a logging application and the actual problem is more complex, but "duplicate" records are unavoidable :( Anyway, using DISTINCT * doesn't help.

Theo Zographos 2010-10-07 23:50:27
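The situation described in this comment can be reproduced with a small sketch: once each table has an auto-increment column, `DISTINCT *` sees every row as different, while listing only the meaningful columns still collapses the duplicates. The `id` column name and the sample rows are hypothetical, and Python's `sqlite3` is used only as a harness.

```python
import sqlite3

# Hypothetical schema: both tables carry an auto-increment id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY AUTOINCREMENT, col_a INTEGER);
    CREATE TABLE B (id INTEGER PRIMARY KEY AUTOINCREMENT,
                    col_a INTEGER, col_c TEXT);
    INSERT INTO A (col_a) VALUES (33);
    INSERT INTO B (col_a, col_c) VALUES (33, 'b3'), (33, 'b3');
""")

# The differing B.id values make each joined row unique, so DISTINCT * keeps both.
star = conn.execute(
    "SELECT DISTINCT * FROM A JOIN B ON A.col_a = B.col_a"
).fetchall()

# Leaving the id columns out of the select list lets DISTINCT do its job.
named = conn.execute(
    "SELECT DISTINCT A.col_a, B.col_c FROM A JOIN B ON A.col_a = B.col_a"
).fetchall()

print(len(star), len(named))
```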

@Theo Zographos: "Auto increment artificial primary key causes real life duplicates" is not a new story :(

onedaywhen 2010-10-08 07:15:20


SQL JOIN question (yet another one)
