ansaurus

Question

What's faster, SELECT DISTINCT or GROUP BY in MySQL?

Answer 1

A:

I think MySQL is smart enough to do the same for both of them.

Mehrdad Afshari 2009-02-24 12:08:31

Answer 2

+15 A:

They are essentially equivalent to each other (in fact this is how some databases implement DISTINCT under the hood).

If one of them is faster, it's going to be DISTINCT (especially in case you have an index on profession). This is because, although the two are the same, an optimizer would have to catch the fact that your group-by is not taking advantage of any group members, just their keys. DISTINCT makes this explicit, so you can get away with a slightly dumber optimizer.

When in doubt, test.

SquareCog 2009-02-24 12:09:22

DISTINCT will be faster only if you DON'T have an index (as it doesn't sort). When you do have an index and it's used, they're synonyms.

Quassnoi 2009-02-27 15:11:17

Answer 3

+1 A:

If you don't have to do any group functions (sum, average etc in case you want to add numeric data to the table), use SELECT DISTINCT. I suspect it's faster, but i have nothing to show for it.

In any case, if you're worried about speed, create an index on the column.

tehvan 2009-02-24 12:09:28

Answer 4

+1 A:

A DISTINCT and GROUP BY usually generate the same query plan, so performance should be the same across both query constructs.

GROUP BY should be used to apply aggregate operators to each group. If all you need is to remove duplicates then use DISTINCT.

If you are using sub-queries execution plan for that query varies so in that case you need to check the execution plan before making decision of which is faster.

Learning 2009-02-24 12:11:54

Answer 5

+1 A:

It seems that the queries are not exactly the same. At least for MySQL.

Compare:

describe select distinct productname from northwind.products
describe select productname from northwind.products group by productname

The second query gives additionally "Using filesort" in Extra.

amartynov 2009-02-24 12:20:18

They are the same in terms of what they get, not in terms of how they get it. An ideal optimizer would execute them the same way, but MySQL optimizer is not ideal. Based on your evidence, it would seem that DISTINCT would go faster -- O(n) vs O(n*log n).

SquareCog 2009-02-24 15:07:28

So, "using filesort" is essentially bad thing?

vava 2009-02-25 00:17:13

In this case it is, because you don't need to sort (you would if you needed the groups). MySQL sorts in order to place the same entries together, and then get groups by scanning the sorted file. You just need distincts, so you just have to hash your keys while doing a single table scan.

SquareCog 2009-02-25 15:36:41

Answer 6

+3 A:

I Am a bit of a "Stack Overflow" Newbie so I don't yet have the reputation or points to vote up or comment, so I will just say this - Real world examples... go for the simplest and shortest if you can -- DISTINCT seems to be more what you are looking for only because it will give you EXACTLY the answer you need and only that!

];o)

Tim 2009-02-24 13:37:28

Answer 7

+1 A:

If you have an index on profession, these two are synonyms.

If you don't, then use DISTINCT.

GROUP BY in MySQL sorts results. You can even do:

SELECT u.profession FROM users u GROUP BY u.profession DESC

and get your professions sorted in DESC order.

DISTINCT creates a temporary table and uses it for storing duplicates. GROUP BY does the same, but sortes the distinct results afterwards.

So

SELECT DISTINCT u.profession FROM users u

is faster, if you don't have an index on profession.

Quassnoi 2009-02-27 14:50:08

Answer 8

A:

SELECT DISTINCT will always be the same, or faster, than a GROUP BY. On some systems (i.e. Oracle), it might be optimized to be the same as DISTINCT for most queries. On others (such as SQL Server), it can be considerably faster.

Jess 2009-02-27 14:55:01

ansaurus

tags:

views:

answers:

What's faster, SELECT DISTINCT or GROUP BY in MySQL?

related questions