views:

111

answers:

7

Suposse I have the next MySQL database with 500.000 rows:

users
{ 
    id       - int, 
    name     - varchar(32), 
    verified - tinyint(1)
}

primary { id }
index   { verified }

And I need to get last 20 not verified users, so I use the next query:

SELECT * FROM users WHERE verified != 1 ORDER BY id DESC LIMIT 20

But it takes 1.2 seconds to complete.

How can I optimize it? Or get the same result with other way in php.

[EDIT]

ID is the primary index, VERIFIED is a index too

[EDIT 2]

CREATE TABLE `users` (
  `id` int(10) unsigned NOT NULL auto_increment COMMENT 'Identificador del usuario',
  `login` varchar(32) NOT NULL COMMENT 'Login para entrar',
  `password` varchar(32) NOT NULL COMMENT 'Password para entrar',
  `email` varchar(384) NOT NULL COMMENT 'Email del usuario',
  `group_id` int(10) unsigned default NULL,
  `display_name` varchar(64) NOT NULL COMMENT 'Nombre para mostrar',
  `email_verified` tinyint(3) unsigned default '0' COMMENT 'Email verificado?',
  `banned` tinyint(3) unsigned default '0' COMMENT 'Baneado?',
  `admin` tinyint(3) unsigned default '0' COMMENT 'Es un super administrador del sitio?',
  `registered` int(10) unsigned NOT NULL COMMENT 'Fecha del registro',
  PRIMARY KEY  (`id`),
  KEY `login` (`login`),
  KEY `password` (`password`),
  KEY `email` (`email`(333)),
  KEY `group_id` (`group_id`),
  KEY `email_verified` (`email_verified`),
  KEY `banned` (`banned`),
  KEY `admin` (`admin`),
  KEY `registered` (`registered`)
) ENGINE=MyISAM AUTO_INCREMENT=500002 DEFAULT CHARSET=utf8;

[EDIT 3]

EXPLAIN(SELECT id FROM users WHERE email_verified != 1 ORDER BY id DESC LIMIT 20)

is

id: 1   
select_type: SIMPLE 
table: users    
type: range 
possible_keys: email_verified   
key: email_verified 
key_len: 2      
ref:
rows: 345195    
Extra: Using where; Using filesort

And a profile of the query:

Status  Duration
(initialization)    0.0000307
Opening tables  0.000003
System lock 0.0000017
Table lock  0.0000042
init    0.000017
optimizing  0.0000077
statistics  0.000097
preparing   0.000054
executing   0.0000007
Sorting result  1.2321507
Sending data    0.000272
end 0.000004
query end   0.0000025
freeing items   0.0000099
closing tables  0.0000025
logging slow query  0.0000005
A: 

Add an index on verified, and use the following query instead:

SELECT * FROM users WHERE verified = 0 ORDER BY id DESC LIMIT 20

(assuming verified can be 0 or 1).

If you use <> instead of = MySQL will ignore indexes and perform a full table scan, slowing down your query dramatically.

quantumSoup
same result with = 0, 1.2 seconds
Wiliam
+1  A: 

Add a new index: {verified, id}. otherwise you will have to do a full table scan.

Edward
It will do a full table scan regardless if he keeps the `<>` operator.
quantumSoup
This was a valid solution, but I haven't understood this at first time. Thanks.
Wiliam
@quantumSoup: It works with <>
Wiliam
A: 

Edit: Your query is slow when you sort the result - this can be caused by the index if the given order is incorrect. MySQL defaults to ASC sort order.

See mysql manual: http://dev.mysql.com/doc/refman/5.0/en/create-index.html

An index_col_name specification can end with ASC or DESC. These keywords are permitted for future extensions for specifying ascending or descending index value storage. Currently, they are parsed but ignored; index values are always stored in ascending order. 

Please check this links:

http://www.ozzu.com/programming-forum/mysql-index-cardinality-t71876.html

http://stackoverflow.com/questions/755569/why-does-the-cardinality-of-an-index-in-mysql-remain-unchanged-when-i-add-a-new-i

You might need to optimize your index:

OPTIMIZE TABLE users;

It might make sense to have an Index with id and verified combined.

You have a lot of Indexes - this might confuse mysql.

Andreas Rehm
OPTIMIZE didn't worked, therefore cardinality of verified index is 2
Wiliam
Ah, there are lots of index because I use all them.
Wiliam
A: 

Make another table with no verified users. This is the last solution I want use.

Wiliam
+3  A: 

You need an index that encompasses both id and email_verfied

KEY `foo` (`id`,`email_verified`)

Then EXPLAIN(SELECT id FROM users WHERE email_verified != 1 ORDER BY id DESC LIMIT 20) prints

+----+-------------+-------+-------+----------------+---------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys  | key     | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+----------------+---------+---------+------+------+--------------------------+
|  1 | SIMPLE      | users | index | email_verified | Index 4 | 6       | NULL |    5 | Using where; Using index |
+----+-------------+-------+-------+----------------+---------+---------+------+------+--------------------------+

Especially the slow Using filesort is gone.

VolkerK
Now is a sub second query :) Why this worked? How works this index with 2 columns?
Wiliam
AH, this only works if you delete the previous "email_verified" index.
Wiliam
this is fascinating. but completely contradicts my intuition about how indices work. Volker, can you maybe explain why two separate indices didn't make it?
Nicolas78
nicolas78: No, I gave up trying to understand when and why "index merge" does or doesn't works in MySQL. But take a look at: http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html and http://dev.mysql.com/doc/refman/5.1/en/index-hints.html
VolkerK
@VolkerK: `index_intersect` can combine non-`PK` fields with `=` predicate and a `PK` field with a ranged (`<`, `=`, `>`) predicate. It does not work with `ORDER BY` (always results in a `filesort`).
Quassnoi
Quassnoi: And that is always true? Usually a restriction to PK (and not just any unique key) means that it applies to InnoDB but not to engine x,y and z.
VolkerK
A: 

Like quantumSoup's answer, but instead have an index on (verified, id) (Assuming you're assuming id is forever increasing)

SELECT * FROM users WHERE verified = 0 ORDER BY id DESC LIMIT 20

I am also assuming that verified is just 0,1

Don't use the <> as that defeats the index. Put "verified" first in the index so that it can do a simple range scan.

MarkR
Why VERIFIED before ID?
Wiliam
Because you want to make the first part of the index the constant expression ( = 0) so that it can do a range scan.
MarkR
A: 

If only possible values for verified are 0 and 1, create an index on

(verified, id)

and rewrite your query:

SELECT  *
FROM    users
WHERE   verified = 0
ORDER BY
        id DESC
LIMIT 20

This solution uses a single index both for filtering and for ordering (and works well regardless of your data distribution).

Quassnoi