ansaurus

Question

How does MySQL behave in JOIN cases where ORDER BY and LIMIT are specified and only a small number of rows need actually be JOINed?

Answer 1

+2 A:

Since the fields you are filtering and ordering on are in different tables, they cannot be covered by a single index.

If you add the STRAIGHT_JOIN modifier, you force MySQL to take every record from HighScore (using the index on Score), find the matching record in Game, check whether it's 42, and return or discard it.

Since MySQL cannot tell in advance how many records will match, it will assume the worst and just show the total number of HighScore records in the plan.

In reality, the query will stop as soon as 50 matching records have been returned.
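As a rough illustration of the kind of query being described, here is a minimal sketch. The original query is not shown in this thread, so the column names (`GameID`, `ID`, `CategoryID`) and the sort direction are assumptions made for the example; only the table names, the index on Score, the value 42, and the limit of 50 come from the discussion above.

```sql
-- Hypothetical schema: HighScore(GameID, Score) with an index on Score,
-- and Game(ID, CategoryID). Column names are assumptions for illustration.
SELECT STRAIGHT_JOIN
    hs.Score,
    g.ID
FROM HighScore AS hs           -- forced to be read first, ideally via the Score index
JOIN Game AS g
    ON g.ID = hs.GameID        -- one lookup in Game per HighScore row
WHERE g.CategoryID = 42        -- the filter lives on the second table
ORDER BY hs.Score DESC         -- sort direction is an assumption
LIMIT 50;                      -- reading can stop once 50 rows pass the filter
```

Running `EXPLAIN` on such a query would be expected to show a row estimate close to the full size of HighScore, even though execution can stop early.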

Quassnoi 2010-08-18 15:00:47

That's what I was hoping would be the case. Can you point me to any documentation that confirms that's what MySQL will do?

Hammerite 2010-08-18 15:04:19

@Hammerite: this is not documented, but you can easily check it by comparing `key_read_requests` for the queries with and without `LIMIT` clauses (in `MyISAM`, of course).

Quassnoi 2010-08-18 15:14:50
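The comparison suggested above could be sketched roughly as follows for MyISAM tables. This assumes an otherwise idle server (the counter is global) and an account with the RELOAD privilege, which `FLUSH STATUS` requires; the placeholders stand for the query from the question.

```sql
-- Reset the status counters, including the key cache counters.
FLUSH STATUS;
-- ... run the query WITHOUT the LIMIT clause here ...
SHOW GLOBAL STATUS LIKE 'Key_read_requests';

-- Reset again so the second reading starts from zero.
FLUSH STATUS;
-- ... run the query WITH the LIMIT clause here ...
SHOW GLOBAL STATUS LIKE 'Key_read_requests';
```

If the second reading is much smaller than the first, that suggests far fewer index blocks were requested when the LIMIT was in place.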

See my answer below.

Hammerite 2010-08-18 16:23:01

Answer 2

A:

This answer expands upon the information given by Quassnoi. I use an answer rather than a comment in order to have more space.

I tested running the query with and without the LIMIT clause, as suggested by Quassnoi. Since I am using InnoDB rather than MyISAM, I used the following query to get the number of read requests:

select
    variable_value
from
    information_schema.GLOBAL_STATUS
where
    variable_name = 'innodb_buffer_pool_read_requests';

Before running any queries, this gave 87131. After running the query without the LIMIT clause, it gave 170381. After running the query with the LIMIT clause, it gave 175315.

So the number of read requests involved in the query without LIMIT seems to have been 170381 - 87131 = 83250, while the number of read requests involved in the query with LIMIT seems to have been 175315 - 170381 = 4934. Approximately the same numbers showed up on repeating the experiment. These numbers don't seem to correspond to rows; indeed, I'm not sure what they do correspond to in terms of the data fetched*. Note also that this counter tracks logical read requests against the buffer pool rather than physical disk reads. What the numbers do seem to show is that verifiably less data was read when the LIMIT clause was added. As such I'm inclined to think that Quassnoi is correct and that MySQL does indeed use a sensible strategy for fetching the limited number of rows.

* The number of read requests involved in the no-LIMIT query is roughly 17 times the number from the other query, but there are many more than 17 * 50 results returned, so it doesn't seem to correspond directly to the number of results.
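The before-and-after subtraction can also be done inside the session with user variables. This is a sketch only: the variable names `@before` and `@after` are made up for the example, the counter is global (so other activity on the server will inflate the result), and `information_schema.GLOBAL_STATUS` is the table used in this answer, which later MySQL versions moved to `performance_schema`.

```sql
-- Capture the counter before the query under test.
SELECT variable_value INTO @before
FROM information_schema.GLOBAL_STATUS
WHERE variable_name = 'innodb_buffer_pool_read_requests';

-- ... run the query under test here (with or without LIMIT) ...

-- Capture the counter again afterwards.
SELECT variable_value INTO @after
FROM information_schema.GLOBAL_STATUS
WHERE variable_name = 'innodb_buffer_pool_read_requests';

-- The difference approximates the logical reads caused by the query.
SELECT @after - @before AS read_requests_used;
```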

Hammerite 2010-08-18 16:22:04
