views:

131

answers:

3

Here is my trial:

mysql> select a.id from iask a
    ->                                  join ianswer b on a.id=b.iaskid
    ->                                  join users c on c.id=a.uid
    ->                          where (c.last_check is null or a.created>c.last_check) and c.id=1
    ->                          group by a.id;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.01 sec)

mysql> select distinct a.id from iask a
    ->                                  join ianswer b on a.id=b.iaskid
    ->                                  join users c on c.id=a.uid
    ->                          where (c.last_check is null or a.created>c.last_check) and c.id=1;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.00 sec)

mysql> explain extended select distinct a.id from iask a
    ->                          join ianswer b on a.id=b.iaskid
    ->                          join users c on c.id=a.uid
    ->                  where (c.last_check is null or a.created>c.last_check) and c.id=1;
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+-----------------------+
| id | select_type | table | type  | possible_keys             | key              | key_len | ref      | rows | filtered | Extra                 |
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+-----------------------+
|  1 | SIMPLE      | c     | const | PRIMARY,i_users_lastcheck | PRIMARY          | 4       | const    |    1 |   100.00 | Using temporary       |
|  1 | SIMPLE      | a     | ref   | PRIMARY,i_iask_uid        | i_iask_uid       | 4       | const    |    1 |   100.00 | Using where           |
|  1 | SIMPLE      | b     | ref   | i_ianswer_iaskid          | i_ianswer_iaskid | 4       | bbs.a.id |    7 |   100.00 | Using index; Distinct |
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+-----------------------+
3 rows in set, 1 warning (0.00 sec)

mysql> explain extended select a.id from iask a
    ->                          join ianswer b on a.id=b.iaskid
    ->                          join users c on c.id=a.uid
    ->                  where (c.last_check is null or a.created>c.last_check) and c.id=1
    ->                  group by a.id;
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+---------------------------------+
| id | select_type | table | type  | possible_keys             | key              | key_len | ref      | rows | filtered | Extra                           |
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+---------------------------------+
|  1 | SIMPLE      | c     | const | PRIMARY,i_users_lastcheck | PRIMARY          | 4       | const    |    1 |   100.00 | Using temporary; Using filesort |
|  1 | SIMPLE      | a     | ref   | PRIMARY,i_iask_uid        | i_iask_uid       | 4       | const    |    1 |   100.00 | Using where                     |
|  1 | SIMPLE      | b     | ref   | i_ianswer_iaskid          | i_ianswer_iaskid | 4       | bbs.a.id |    7 |   100.00 | Using index                     |
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+---------------------------------+
3 rows in set, 1 warning (0.00 sec)
+1  A: 

They are two close but not exactly identical concepts. On most queries (including the queries you provided) they are identical, I don't believe there will be a difference and will be optimized identically (though you could do explain extended if you want to see if there was any differences) (see here on optimizing distinct and group by).

So in short, most likely the same, unless there is a difference in result set, and then you should be using the one to give you the proper result set anyway, so it doesn't matter.

Todd Gardner
Hello,Todd,I've updated with "explain extended",but seems there is no difference from "explain".
Shore
Sorry, should have been more clear, you need to do a "SHOW WARNINGS" after running the extended. See here: http://www.mysqlperformanceblog.com/2006/07/24/extended-explain/
Todd Gardner
A: 

You're posting the EXPLAIN SELECT DISTINCT, but not the EXPLAIN SELECT ... GROUP BY (and without seeing exactly how your table and indices are we can't do it for you;-). I'd expect the two to be identical, meaning no difference in performance; if there are different (and DB optimizers are quirky, especially MySQL's, so it's always best to check than to guess;-) they should indicate pretty clearly which approach is going to provide better performance.

PS: you should do your EXPLAIN runs on (possibly fake) tables that are reasonably representative of real life -- the optimizer may decide for a full table scan rather than an index because it sees too little diversity along that index on toy data, but it would use the index on real data with more diversity, for example.

Alex Martelli
+1  A: 

As Alex has mentioned, without knowing your table structure it's difficult to comment.

However, DISTINCT and GROUP BY are used for very different purposes. (A good SQL reference should explain the difference between the two.) Attempting to determine which one is more efficient is meaningless.

If I am interpreting your query correctly, you're trying to return the list of new "question" IDs since user 1 did something to change the value of last_check.

If I had created your tables, a.id would be the primary key of the iask table -- hence, every value of a.id would be unique. Therefore, neither DISTINCT nor GROUP BY have an effect on your queries because

  • There is only ever one row with the value of a.id (DISTINCT not needed); and

  • There is nothing to GROUP BY since no two rows can have the same value of a.id (GROUP BY not needed).

The only meaningful query in this circumstance is just

select a.id
from   iask a
       join ianswer b on a.id = b.iaskid
       join users c on c.id = a.uid
where  (c.last_check is null or a.created > c.last_check)
       and c.id = 1;
Convict