views:

439

answers:

3
EXPLAIN SELECT
*
FROM
content_link link
STRAIGHT_JOIN
content
ON
link.content_id = content.id
WHERE
link.content_id = 1
LIMIT 10;

+----+-------------+---------+-------+---------------+------------+---------+-------+------+-------+
| id | select_type | table   | type  | possible_keys | key        | key_len | ref   | rows | Extra |
+----+-------------+---------+-------+---------------+------------+---------+-------+------+-------+
|  1 | SIMPLE      | link    | ref   | content_id    | content_id | 4       | const |    1 |       |
|  1 | SIMPLE      | content | const | PRIMARY       | PRIMARY    | 4       | const |    1 |       |
+----+-------------+---------+-------+---------------+------------+---------+-------+------+-------+

However, when I remove the WHERE, the query stops using the key (even when i explicitly force it to)

EXPLAIN SELECT
*
FROM
content_link link FORCE KEY (content_id)
STRAIGHT_JOIN
content
ON
link.content_id = content.id
LIMIT 10;

+----+-------------+---------+--------+---------------+---------+---------+------------------------+---------+-------------+
| id | select_type | table   | type   | possible_keys | key     | key_len | ref                    | rows    | Extra       |
+----+-------------+---------+--------+---------------+---------+---------+------------------------+---------+-------------+
|  1 | SIMPLE      | link    | index  | content_id    | PRIMARY | 7       | NULL                   | 4555299 | Using index |
|  1 | SIMPLE      | content | eq_ref | PRIMARY       | PRIMARY | 4       | ft_dir.link.content_id |       1 |             |
+----+-------------+---------+--------+---------------+---------+---------+------------------------+---------+-------------+

Are there any work-arounds to this?

I realize I'm selecting the entire table in the second example, but why does mysql suddenly decide that it's going to ignore my FORCE anyway and not use the key? Without the key the query takes like 10 minutes.. ugh.

+2  A: 

An index helps search quickly inside a table, but it just slows things down if you select the entire table. So MySQL is correct in ignoring the index.

In your case, maybe the index has a hidden side effect that's not known to MySQL. For example, if the inner join holds only for a few rows, an index would speed things up. But MySQL can't know that without an explicit hint.

There is an exception: when every column you select is inside the index, the index is still useful if you select every row. For example, if you have an index on LastName, the following query still benefits from the index:

select LastName from orders

But this one won't:

select * from Orders
Andomar
I'm not actually selecting the entire table though. I have a LIMIT 10 applied to the real query.
Ian
Hmmm, then maybe MySQL doesn't look at the LIMIT clause when deciding how to execute the query. I know Sql Server does change its execution plan if you use TOP.
Andomar
LIMIT 10 is just going to limit the *result set* to 10. It still has to execute the query until it hits the limit. Since the query as given doesn't ask for data limitations (no WHERE clause), it's going to run the query using the table scan method. Which, if MySQL is anything like SQL Server, is going to be hugely expensive - a read of the entire table into temp storage, then row-by-row examination of the temp data. And the LIMIT statement may not be honored even then until it's done with the temp table scan.
DaveE
The LIMIT does apply as the query is being executed. The first table is being table scanned, however it will abort part of the way through the table scan once it has found enough matching rows in the other table.
Harrison Fisk
>> The LIMIT does apply as the query is being executedGood to know. Thanks.
DaveE
+2  A: 

FORCE is a bit of a misnomer. Here's what the MySQL docs say (emphasis mine):

*You can also use FORCE INDEX, which acts like USE INDEX (index_list) but with the addition that a table scan is assumed to be very expensive. In other words, a table scan is used only if there is no way to use one of the given indexes to find rows in the table.*

Since you aren't actually "finding" any rows (you are selecting them all), a table scan is always going to be fastest, and the optimizer is smart enough to know that in spite of what you are telling them.

ETA:

Try adding an ORDER BY on the primary key once and I bet it'll use the index.

Eric Petroelje
+1 Yes, an analogy is reading a book. If I ask you to read every page in the book, there's no need to look at the table of contents or the index. You just read the book from cover to cover.
Bill Karwin
yeah, it's been too long since I worked with big tables in mysql. It's coming back to me now. :P
Ian
Seems that MySQL optimizer doesn't use the index not because it's so smart, but because the column accepts NULL values. It needs to be explicitly told that CONTENT_ID IS NOT NULL to use the index (even despite the column being used in a JOIN).
Quassnoi
A: 

Your content_id seems to accept NULL values.

MySQL optimizer thinks there is no guarantee that your query will return all values only by using the index (though actually there is guarantee, since you use the column in a JOIN)

That's why it reverts to full table scan.

Either add a NOT NULL condition:

SELECT  *
FROM    content_link link FORCE KEY (content_id)
STRAIGHT_JOIN
        content
ON      content.id = link.content_id
WHERE   link.content_id IS NOT NULL
LIMIT 10;

or mark your column as NOT NULL:

ALTER TABLE content_link MODIFY content_id NOT NULL

Update:

This is verified bug 45314 in MySQL.

Quassnoi
Doesn't the INNER JOIN already filter out NULL?
Andomar
@Andomar: Ummm... right
Quassnoi
@Andomar: no, wrong. Though is does filter the NULLs out, MySQL optimizer still need to be told that.
Quassnoi
MySQL does index NULL values. This answer is pretty much completely wrong.
Harrison Fisk
@Harrison: could you please just post the code into Query Browser and check (as I did)?
Quassnoi