Which is more efficient?

SELECT theField
FROM theTable
GROUP BY theField

or

SELECT DISTINCT theField
FROM theTable
+4  A: 

You can check the execution plan to compare the total cost of these statements. The answer may vary between scenarios.

Ken Yao
+5  A: 

In most cases, DISTINCT and GROUP BY generate the same plans, and their performance is usually identical.

Greg Dean
+7  A: 

It doesn't matter; both result in the same execution plan (at least for these queries). These kinds of questions are easy to answer by having Query Analyzer or SSMS show the execution plan, and perhaps the server trace statistics, after running the query.

Frans Bouma
+2  A: 

Hmmm...so far as I can see in the Execution Plan for running similar queries, they are identical.

Dana
A: 

You do have to know the difference between the two and not use them interchangeably.

Mladen Prajdic
+11  A: 

In your example, both queries will generate the same execution plan so their performance will be the same.

However, each has its own purpose. To make your code easier to understand, use DISTINCT to eliminate duplicate rows and GROUP BY to apply aggregate operators (SUM, COUNT, MAX, ...).
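
A minimal sketch of that distinction, for illustration only. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in database so it can run anywhere; theTable and theField come from the question, and the sample rows are made up. It only shows which rows come back, not how SQL Server plans the queries.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE theTable (theField TEXT)")
conn.executemany(
    "INSERT INTO theTable (theField) VALUES (?)",
    [("a",), ("b",), ("a",), ("c",), ("b",), ("a",)],  # made-up sample data
)

# DISTINCT: the stated intent is "remove duplicate rows".
distinct_rows = conn.execute(
    "SELECT DISTINCT theField FROM theTable ORDER BY theField"
).fetchall()

# GROUP BY without an aggregate returns the same rows, but hides that intent.
grouped_rows = conn.execute(
    "SELECT theField FROM theTable GROUP BY theField ORDER BY theField"
).fetchall()

# GROUP BY with an aggregate: the stated intent is "one summary row per group".
counts = conn.execute(
    "SELECT theField, COUNT(*) FROM theTable GROUP BY theField ORDER BY theField"
).fetchall()

print(distinct_rows)
print(grouped_rows)
print(counts)
```

The first two queries return the same three rows here; only the third actually needs GROUP BY.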

dub
I would add that adding DISTINCT willy-nilly to remove duplicate rows can be a bad idea, especially if its use disguises a cross join or missing conditions in a WHERE clause. DISTINCT would be my last choice for removing rows.
HLGEM
@HLGEM, when would you use Distinct?
wcm
@wcm, imagine an Order table with a Product field that contains the type of product ordered. You want to know all the different products that have been ordered. A simple SELECT will return duplicates if the same product has been ordered multiple times. Use SELECT DISTINCT instead.
dub
@dub, I was asking @HLGEM under what conditions he would use DISTINCT, since he says it would be his last choice for removing duplicate rows, whereas that is the only time I even consider it.
wcm
@wcm, I understand HLGEM's point. Consider a query to return all products purchased by a specific set of customers. An INNER JOIN from products to sales to customers gives duplicate rows, which DISTINCT could eliminate. Using WHERE EXISTS instead avoids producing the duplicate rows in the first place, which should reduce server workload.
Meff
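
The JOIN-plus-DISTINCT versus WHERE EXISTS comparison in the last comment can be sketched roughly as follows. Everything here is an assumption made for illustration: the customers, products, and sales tables, their columns, and the rows are hypothetical, and SQLite (via Python's sqlite3) stands in for the real server. The sketch only shows that both forms return the same products; it does not measure workload.

```python
import sqlite3

# Assumption: hypothetical schema and data, in-memory SQLite as a stand-in.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales     (customer_id INTEGER, product_id INTEGER);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO products  VALUES (10, 'Widget'), (20, 'Gadget'), (30, 'Gizmo');
INSERT INTO sales     VALUES (1, 10), (1, 10), (2, 10), (2, 20), (3, 30);
""")

join_sql = """
SELECT {distinct} p.name
FROM products p
INNER JOIN sales s     ON s.product_id = p.id
INNER JOIN customers c ON c.id = s.customer_id
WHERE c.id IN (1, 2)
ORDER BY p.name
"""

# Plain join: one row per matching sale, so products repeat.
joined_raw = conn.execute(join_sql.format(distinct="")).fetchall()

# DISTINCT removes the duplicates after the join has produced them.
joined_distinct = conn.execute(join_sql.format(distinct="DISTINCT")).fetchall()

# EXISTS asks "is there at least one matching sale?" per product,
# so each product appears at most once without DISTINCT.
exists_rows = conn.execute("""
SELECT p.name
FROM products p
WHERE EXISTS (
    SELECT 1 FROM sales s
    WHERE s.product_id = p.id AND s.customer_id IN (1, 2)
)
ORDER BY p.name
""").fetchall()

print(joined_raw)
print(joined_distinct)
print(exists_rows)
```

Whether the EXISTS form is actually cheaper depends on the database and its indexes, so the execution plan is still the thing to check.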