views:

585

answers:

6
mysql> desc users;
+-------------+------------------+------+-----+---------+----------------+
| Field       | Type             | Null | Key | Default | Extra          |
+-------------+------------------+------+-----+---------+----------------+
| id          | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| email       | varchar(128)     | NO   | UNI |         |                |
| password    | varchar(32)      | NO   |     |         |                |
| screen_name | varchar(64)      | YES  | UNI | NULL    |                |
| reputation  | int(10) unsigned | NO   |     | 0       |                |
| imtype      | varchar(1)       | YES  | MUL | 0       |                |
| last_check  | datetime         | YES  | MUL | NULL    |                |
| robotno     | int(10) unsigned | YES  |     | NULL    |                |
+-------------+------------------+------+-----+---------+----------------+
8 rows in set (0.00 sec)


mysql> create index i_users_imtype_robotno on users(imtype,robotno);
Query OK, 24 rows affected (0.25 sec)
Records: 24  Duplicates: 0  Warnings: 0
mysql> explain select * from users where imtype!='0' and robotno is null;
+----+-------------+-------+------+------------------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys          | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+------+------------------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | users | ALL  | i_users_imtype_robotno | NULL | NULL    | NULL |   24 | Using where |
+----+-------------+-------+------+------------------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)

But this way,it's used:

mysql> explain select * from users where imtype in ('1','2') and robotno is null;
+----+-------------+-------+-------+------------------------+------------------------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys          | key                    | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+------------------------+------------------------+---------+------+------+-------------+
|  1 | SIMPLE      | users | range | i_users_imtype_robotno | i_users_imtype_robotno | 11      | NULL |    3 | Using where |
+----+-------------+-------+-------+------------------------+------------------------+---------+------+------+-------------+
1 row in set (0.01 sec)

Besides,this one also did not use index:

mysql> explain select id,email,imtype from users where robotno=1;
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | users | ALL  | NULL          | NULL | NULL    | NULL |   24 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)
A: 

It's because 'robotno' is potentially a primary key, and it uses that instead of the index.

Phil Carter
No,it's not primary key,see my update:)
Shore
in that case the MySQL Query Optimizer has determined that using an int(10) will be faster than using a combined index of varchar(1) and int(10) which if you thought about doing that in your head is the method you'd take also.
Phil Carter
+1  A: 
SELECT  *
FROM   users
WHERE  imtype != '0' and robotno is null

This condition is not satisified by a single contiguous range of (imtype, robotno).

If you have records like this:

imtype  robotno

$       NULL
$       1    
0       NULL
0       1
1       NULL
1       1
2       NULL
2       1

, ordered by (imtype, robotno), then the records 1, 5 and 7 would be returned, while other records wouldn't.

You'll need create this index to satisfy the condition:

CREATE INDEX ix_users_ri ON users (robotno, imptype)

and rewrite your query a little:

SELECT  *
FROM   users
WHERE  (
       robotno IS NULL
       AND imtype < '0'
       )
       OR
       (
       robotno IS NULL
       AND imtype > '0'
       )

, which will result in two contiguous blocks:

robotno imtype  

--- first block start
NULL    $
--- first block end
NULL    0
--- second block start
NULL    1
NULL    2
--- second block end
1       $
1       0
1       1
1       2

This index will also serve this query:

SELECT id, email, imtype
FROM   users
WHERE  robotno = 1

, which is not served now by any index for the same reason.

Actually, the index for this query:

SELECT  *
FROM    users
WHERE   imtype in ('1', '2')
        AND robotno is null

is used only for coarse filtering on imtype (note using where in the extra field), it doesn't range robotno's

Quassnoi
+1  A: 

BTW, if you think you know better than the optimizer, which is often the case, you can force MySQL to use a specific index by appending

FORCE INDEX (index_name) after FROM users.

Artem Russakovskii
Be sure to measure the query time before and after your manual optimization though, using a full-sized production data set. You don't want to 'pessimize' your SQL. Don't force a query plan on a database system unless you really know what you're doing.
Jim Ferrans
A: 

A database systems query planner determines whether to do an index scan or not by analyzing the selectivity of the query's where clause relative to the index. (Indexes are also used to join tables together, but you only have users here.)

The first query has where imtype != '0'. This would select nearly all of the rows in users, assuming you have a large number of distinct values of imtype. The inequality operator is inherently unselective. So the MySQL query planner is betting here that reading through the index won't help and that it may as well just do a sequential scan through the whole table, since it probably would have to do that anyway.

On the other hand, had you said where imtype ='0', equality is a highly selective operator, and MySQL would bet that by reading just a few index blocks it could avoid reading nearly all of the blocks of the users table itself. So it would pick the index.

In your second example, where imtype in ('1','2'), MySQL knows that the index will be highly selective (though only half as selective as where imtype = '0'), and it will again bet that using the index will lead to a big payoff, as you discovered.

In your third example, where robotno=1, MySQL probably can't effectively use the index on users(imtype,robotno) since it would need to read in all the index blocks to find the robotno=1 record numbers: the index is sorted by imtype first, then robotno. If you had another index on users(robotno), MySQL would eagerly use it though.

As a footnote, if you had two indexes, one on users(imtype), and the other on users(imtype,robotno), and your query was on where imtype = '0', either index would make your query fast, but MySQL would probably select users(imtype) simply because it's more compact and fewer blocks would need to be read from it.

I'm being very simplistic here. Early database systems would just look at imtype's datatype and make a very rough guess at the selectivity of your query, but people very quickly realized that giving the query planner interesting facts like the total size of the table, the number of ditinct values in each column, etc. would enable it to make much smarter decisions. For instance if you had a users table where imtype was only every '0' or '1', the query planner might choose the index, since in that case the where imtype != '0' is more selective.

Take a look at the MySQL UPDATE STATISTICS statement and you'll see that its query planner must be sophisticated. For that reason I'd hesitate a great deal before using the FORCE statement to dictate a query plan to it. Instead, use UPDATE STATISTICS to give the query planner improved information to base its decisions on.

Jim Ferrans
+2  A: 

You need an index that has robotno as the first column. Your existing index is (imtype,robotno). Since imtype is not in the where clause, it can't use that index.

An index on (robotno,imtype) could be used for queries with just robotno in the where clause, and also for queries with both imtype and robotno in the where clause (but not imtype by itself).

Check out the docs on how MySQL uses indexes, and look for the parts that talk about multi-column indexes and "leftmost prefix".

nathan
A: 

Your index is over users(imtype,robotno). In order to use this index, either imtype or imtype and robotno must be used to qualify the rows. You are just using robotno in your query, thus it can't use this index.

Marcus Griep