views:

28

answers:

2

I have the following two queries (*), which only differ in the field being restricted in the WHERE clause (name1 vs name2):

SELECT A.third_id, COUNT(DISTINCT B.fourth_id) AS num
FROM first A
JOIN second B ON A.third_id = B.third_id
WHERE A.name1 LIKE 'term%'

SELECT A.third_id, COUNT(DISTINCT B.fourth_id) AS num
FROM first A
JOIN second B ON A.third_id = B.third_id
WHERE A.name2 LIKE 'term%'

Both of the name fields have a single-column index on them. There is also an index on both third_id columns as well as fourth_id (which are all foreign keys into other tables, but it is not relevant here).

According to EXPLAIN, the first one behaves like this - which is what I want:

+----+-------------+-------+-------+---------------+----------+---------+---------------+------+-------------+
| id | select_type | table | type  | possible_keys | key      | key_len | ref           | rows | Extra       |
+----+-------------+-------+-------+---------------+----------+---------+---------------+------+-------------+
|  1 | SIMPLE      | A     | range | third_id,name | name     | 767     | NULL          | 3491 | Using where | 
|  1 | SIMPLE      | B     | ref   | third_id      | third_id | 4       | db.A.third_id |   16 |             | 
+----+-------------+-------+-------+---------------+----------+---------+---------------+------+-------------+

The second one does this, which I definitely do not want:

+----+-------------+-------+------+----------------+----------+---------+---------------+--------+-------------+
| id | select_type | table | type | possible_keys  | key      | key_len | ref           | rows   | Extra       |
+----+-------------+-------+------+----------------+----------+---------+---------------+--------+-------------+
|  1 | SIMPLE      | B     | ALL  | third_id       | NULL     | NULL    | NULL          | 507539 |             | 
|  1 | SIMPLE      | A     | ref  | third_id,name2 | third_id | 4       | db.B.third_id |      1 | Using where | 
+----+-------------+-------+------+----------------+----------+---------+---------------+--------+-------------+

What the heck is happening here? How do I make the second one behave properly (i.e. like the first one)?

(*) Actually, I don't. I have a bit more complex queries; I have eliminated extras for this post, and distilled them to the minimal queries that still exhibit the problematic behaviour. Also, names were changed to protect the guilty.

+2  A: 

Add CREATE TABLE statements to your post. A real SELECT statement would be helpful too.

1 possible reason is that name2 has a much higher percentage of values starting with "term%".

Try enforcing order of tables in query by using STRAIGHT_JOIN.

SELECT A.third_id, COUNT(DISTINCT B.fourth_id) AS num
FROM first A
STRAIGHT_JOIN second B ON A.third_id = B.third_id
WHERE A.name2 LIKE 'term%'
Naktibalda
+1  A: 

How many records is in those tables ? Check cardinality/slectivity in name2 column.

If selectivity is low try Naktibalda "STRAIGHT_JOIN" or hints http://dev.mysql.com/doc/refman/5.0/en/index-hints.html

iddqd