views:

208

answers:

2

Is there any way I can prohibit MySQL from performing a full table scan when the result was not found using indexes?

For example this query:

SELECT *
FROM a
WHERE (X BETWEEN a.B AND a.C) 
ORDER BY a.B DESC 
LIMIT 1;

Is only efficient if X satisfies the condition and there is at least 1 row returned, but if the condition cannot be satisfied by any data in the table, full scan will be performed, which can be very costly.

I don't want to optimize this particular query, it is just an example.

EXPLAIN on this query with X in or outside of range:

id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE a range long_ip  long_ip 8 \N 116183 100.00 Using where

STATUS VARIABLE show much better information. For X outside of range:

Handler_read_prev 84181
Key_read_requests 11047

In range:

Handler_read_key 1
Key_read_requests 12

If only there was a way to prevent Handler_read_prev from ever growing past 1.

UPDATE. I can't accept my own answer, because it doesn't really answer the question (HANDLER is a great feature, though). It seems to me that there is no general way to prevent MySQL from doing a full scan. Although, simple conditions like key='X' will be considered as "impossible where", more complex things like BETWEEN will not.

+1  A: 

You could write a "fully covered" subquery that only uses data that is available in indexes. Based on the returned primary key, you can look up the rows in the master table.

The following query is fully covered by indexes on (id), (B,id), and (C,id):

select *
from a
where id in (
    select id
    from a 
    where x <= C
    and id in (
        select id
        from a
        where B <= X 
    )
)
limit 1

Each SELECT uses one index: the innermost the index on (B,id); the middle SELECT uses the index on (C,id), and the outer SELECT uses the primary key.

Andomar
LIMIT and IN are not supported, besides this query will be terribly slow and I don't see how my query is not covered by indexes. In fact, it is and the question was, how to stop MySQL when it hasn't found the answer in indexes (which it should be doing in the first place).
HeavyWave
You're right about limit in a subquery, moved it to the outer query. My proposed query will not do a table scan when no results are found, because the middle select will be empty. It should be pretty fast, have you tried it?
Andomar
Yes, it runs for more than a second on 10 mb table. The 2 inner queries return thousands of rows.
HeavyWave
If you created the indexes, it should be as fast as it gets. If there are additional constraints you can do far better. For example, if B and C never overlap between rows, you can search for the first B below X. That would be truly fast.
Andomar
Have you tried it? It is very slow with any indexes on MySQL 5.
HeavyWave
A: 

Here is what I came up with in the end:

HANDLER a OPEN;
HANDLER a READ BC <= (X);
HANDLER a CLOSE;

BC is the name of key (B,C). If we order the table by B DESC, then the result is guranteed to be equal to

SELECT *
FROM a
WHERE (X BETWEEN a.B AND a.C) 
ORDER BY a.B DESC 
LIMIT 1;

Now if X is not in the range of the table a, we just have to check that a.C is greater than X, if it's not, than X is definitely outside of the range and we don't need to look any further.

This is not very elegant though, and you will have to resort the table on each insert or update.

HeavyWave