ansaurus

Question

Best approach to SQL Server query using distinct

Answer 1

+1 A:

And the problem using

SELECT DISTINCT * FROM TableA A INNER JOIN TableB B ON A.idA = B.IdA 
AND B.Date = '2009-01-10' ORDER BY A.Id;

is?

If it's that it does return duplicate idA values, that's because you are selecting too many columns, if you can't reduce what you select, you need the subquery.

Vinko Vrsalovic 2009-01-09 10:00:45

'Distinct *' is actually '*', if there are no exact duplicates.

Adeel Ansari 2009-01-09 10:27:27

Exactly, that's what I'm saying.

Vinko Vrsalovic 2009-01-09 10:54:00

Especially if these are large tables, the derived table approach where the records from tableb are limited before the join may save time. That's one reason to try it. ALso since you aren't limiting the column list to table a columns, it still won't give the correct answer.

HLGEM 2009-01-09 14:18:07

The subquery will return only the id's, hence why it is "DISTINCT IdA". Of course a Distinct * will return probably the same as * unless all the records are the same (which is not the case). Thanks.

Martín Marconcini 2009-01-14 11:07:40

Answer 2

+2 A:

use a derived table

SELECT * FROM TableA 
JOIN
(SELECT DISTINCT IdA FROM TableB WHERE Date = '20090110') a
ON a.IDA = TAbleA.IDA
ORDER BY TableA.IdA

HLGEM 2009-01-09 14:15:39

Answer 3

A:

How about this :

select * from TableA a 
  where a.TableA_Id in
(select TableA_Id from TableB where [Date] = '2009-01-10')
order by a.TableA_Id

You could add distinct to the subquery if you want, I'm not sure whether this will improve or reduce preformance offhand.

inferis 2009-01-09 14:24:05

Answer 4

+1 A:

I think a normal exists statement is what you need:

SELECT * 
FROM TableA A 
WHERE Exists( select B.IdA from TableB B where A.IdA = B.IdA and B.Date = '2009-01-10' )
ORDER BY A.Id;

Performance-wise it should be the best approach.

If you need values from the other table, and to avoid using distinct, you could join to a sub query that is grouped like so:

Select TableA.*, groupedTableB.OtherData 
From TableA
Inner join 
(
    select TableB.IdA, Sum(TableB.Data) SummaryData
    from TableB where TableB.Date = '2009-01-10'
    group by TableB.IdA

) groupedTableB on groupedTableB.IdA = TableA.IdA

Dan 2009-01-09 14:38:12

Just so you know, that second on is not a subquery, it is a derived table.

HLGEM 2009-01-09 15:23:23

Thanks for the idea. Will try this too

Martín Marconcini 2009-01-14 11:05:30

ansaurus

tags:

views:

answers:

Best approach to SQL Server query using distinct

related questions