(Using MySQL 4.1.22)

I can't get this query to use an index on a large table (200k+ rows); it does a full table scan instead. The query currently takes about 1.2 seconds, and I want to get it under 0.2 seconds if possible.

Here is my query:

SELECT st_issues.issue_id,
       st_issues.cat_id, st_categories.name AS cat_name,
       st_issues.status_id, st_statuses.name AS status_name,
       st_issues.priority_id, st_priorities.name AS priority_name, st_priorities.color AS color,
       st_issues.assigned_cid, assigned_u.firstname, assigned_u.lastname, assigned_u.screenname,
       message, rating,
       created_by_email, created_by_cid, created_by_uid,
       by_user.firstname AS by_firstname, by_user.lastname AS by_lastname, by_user.screenname AS by_screenname,
       st_issues.browser, from_url, created_by_store, created, st_issues.stamp
FROM st_issues
 JOIN st_categories ON (st_issues.cat_id=st_categories.cat_id)
 JOIN st_statuses ON (st_issues.status_id=st_statuses.status_id)
 JOIN st_priorities ON (st_issues.priority_id=st_priorities.priority_id)
 LEFT JOIN users AS assigned_u ON (assigned_u.cid=st_issues.assigned_cid)
 LEFT JOIN users AS by_user ON (by_user.uid=st_issues.created_by_uid)
 LEFT JOIN st_issue_changes ON (st_issues.issue_id=st_issue_changes.issue_id AND change_id=0)
WHERE st_issues.assigned_cid=0

The results of explain:

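+----+-------------+------------------+--------+---------------+---------+---------+-----------------------------+--------+-------------+
| id | select_type | table            | type   | possible_keys | key     | key_len | ref                         | rows   | Extra       |
+----+-------------+------------------+--------+---------------+---------+---------+-----------------------------+--------+-------------+
|  1 | SIMPLE      | st_issues        | ALL    |               |         |         |                             |      4 | Using where |
|  1 | SIMPLE      | st_categories    | eq_ref | PRIMARY       | PRIMARY | 1       | sg.st_issues.cat_id         |      1 |             |
|  1 | SIMPLE      | st_priorities    | eq_ref | PRIMARY       | PRIMARY | 1       | sg.st_issues.priority_id    |      1 |             |
|  1 | SIMPLE      | assigned_u       | ref    | cid           | cid     | 8       | sg.st_issues.assigned_cid   |      1 |             |
|  1 | SIMPLE      | st_statuses      | ALL    | PRIMARY       |         |         |                             |      4 | Using where |
|  1 | SIMPLE      | by_user          | ALL    |               |         |         |                             | 221623 |             |
|  1 | SIMPLE      | st_issue_changes | eq_ref | PRIMARY       | PRIMARY | 6       | sg.st_issues.issue_id,const |      1 |             |
+----+-------------+------------------+--------+---------------+---------+---------+-----------------------------+--------+-------------+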

Obviously the problem is with the join on 'by_user' since it isn't using an index.

Here is some of the definition of the 'users' table:

CREATE TABLE  `users` (
  `cid` double unsigned NOT NULL auto_increment,
  `uid` varchar(20) NOT NULL default '',
...
  `firstname` varchar(20) default NULL,
  `lastname` varchar(20) default NULL,
...
  PRIMARY KEY  (`uid`),
...
) ENGINE=InnoDB

Does anyone have any idea why it is not using the primary key in the join?
Any other ideas or hints on how to speed up this query further?

(I can add the table definitions of the other tables if needed/wanted)

Edit:

Here is the table definition for st_issues:

CREATE TABLE  `st_issues` (
  `issue_id` int(10) unsigned NOT NULL auto_increment,
  `cat_id` tinyint(3) unsigned NOT NULL default '0',
  `status_id` tinyint(3) unsigned NOT NULL default '0',
  `priority_id` tinyint(3) unsigned NOT NULL default '0',
  `assigned_cid` int(10) unsigned NOT NULL default '0',
  `rating` tinyint(4) default NULL,
  `created_by_email` varchar(255) NOT NULL default '',
  `created_by_cid` int(10) unsigned NOT NULL default '0',
  `created_by_uid` varchar(20) NOT NULL default '',
  `created_by_store` tinyint(3) unsigned NOT NULL default '0',
  `browser` varchar(255) NOT NULL default '',
  `from_url` varchar(255) NOT NULL default '',
  `created` datetime NOT NULL default '0000-00-00 00:00:00',
  `stamp` datetime NOT NULL default '0000-00-00 00:00:00',
  PRIMARY KEY  (`issue_id`),
  KEY `idx_create_by_cid` (`created_by_cid`),
  KEY `idx_create_by_uid` (`created_by_uid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

A:

Is that the whole of the definition of the users table?

Because it says:

) ENGINE=InnoDB

whereas st_issues says:

) ENGINE=InnoDB DEFAULT CHARSET=utf8;

If your two tables use different character sets or collations, then uid and created_by_uid are effectively different string types. MySQL must convert one side before it can compare them, and that conversion prevents it from using the index on users.uid for the join.

It's always best to ensure you use the same character set/collation for all text in your database.
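
One way to check this, as a rough sketch: the statements below first show which character set and collation each table and column really uses, then convert users to match st_issues. The conversion step assumes that users turns out not to be utf8, which the question does not confirm.

```sql
-- Inspect the character set and collation actually in effect.
SHOW CREATE TABLE users;
SHOW FULL COLUMNS FROM users;       -- the Collation column lists each column's collation
SHOW FULL COLUMNS FROM st_issues;

-- Assumption: users is not utf8. Convert it to match st_issues.
-- This rebuilds the table, so try it on a copy of the data first.
ALTER TABLE users CONVERT TO CHARACTER SET utf8;
```

After converting, run EXPLAIN on the query again and check whether by_user still shows type ALL.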

bobince
Ah, I bet that is what the problem is, thanks.
Echo
A: 

I did some testing and found the following changes helped:

  • Add index on st_issues.assigned_cid.

  • Change primary key of users table to cid instead of uid.

  • Change join condition for by_user to use cid instead of uid:

    LEFT JOIN users AS by_user ON (by_user.cid=st_issues.created_by_cid)
    

Then I got the following EXPLAIN report (though with zero rows of data):

+----+-------------+------------------+--------+---------------+--------------+---------+-------------------------------+------+-------------+
| id | select_type | table            | type   | possible_keys | key          | key_len | ref                           | rows | Extra       |
+----+-------------+------------------+--------+---------------+--------------+---------+-------------------------------+------+-------------+
|  1 | SIMPLE      | st_issues        | ref    | assigned_cid  | assigned_cid | 4       | const                         |    1 |             | 
|  1 | SIMPLE      | st_categories    | eq_ref | PRIMARY       | PRIMARY      | 1       | test.st_issues.cat_id         |    1 |             | 
|  1 | SIMPLE      | st_statuses      | eq_ref | PRIMARY       | PRIMARY      | 1       | test.st_issues.status_id      |    1 |             | 
|  1 | SIMPLE      | st_priorities    | eq_ref | PRIMARY       | PRIMARY      | 1       | test.st_issues.priority_id    |    1 |             | 
|  1 | SIMPLE      | assigned_u       | eq_ref | PRIMARY       | PRIMARY      | 8       | test.st_issues.assigned_cid   |    1 |             | 
|  1 | SIMPLE      | by_user          | eq_ref | PRIMARY       | PRIMARY      | 8       | test.st_issues.created_by_cid |    1 |             | 
|  1 | SIMPLE      | st_issue_changes | eq_ref | PRIMARY       | PRIMARY      | 8       | test.st_issues.issue_id,const |    1 | Using index | 
+----+-------------+------------------+--------+---------------+--------------+---------+-------------------------------+------+-------------+

This shows that the optimizer has selected an index for each table, which it didn't in your version of the query. I had to guess at the definitions of your lookup tables.
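
The schema changes listed above could be applied with statements along these lines. This is a sketch rather than the exact statements used in the test: the index name assigned_cid is taken from the EXPLAIN output, while the unique key on uid and its name uq_users_uid are assumptions added so that uid stays unique once it is no longer the primary key.

```sql
-- Index the column used in the WHERE clause.
ALTER TABLE st_issues ADD INDEX assigned_cid (assigned_cid);

-- Make cid the primary key of users instead of uid.
-- Assumption: uid must remain unique, so it gets a UNIQUE key
-- (hypothetical name uq_users_uid, not part of the original answer).
ALTER TABLE users
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (cid),
  ADD UNIQUE KEY uq_users_uid (uid);
```

Both statements rebuild the table, so on 200k+ rows they are best run on a copy or during a quiet period.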

Another thing I would suggest is to define your lookup tables st_categories and st_statuses with a natural key, the name of the category or status. Then reference that natural key from the st_issues table, instead of using a tinyint pseudokey. The advantage is that you don't have to perform those joins to get the name of the category or status; it's already in the st_issues table.
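
A minimal sketch of the natural-key idea for statuses follows. The table name st_statuses_nk, the table st_issues_nk, the column status, and the varchar(20) length are all hypothetical, since the question does not show the lookup table definitions.

```sql
-- Lookup table keyed by the status name itself (hypothetical definition).
CREATE TABLE st_statuses_nk (
  name varchar(20) NOT NULL,
  PRIMARY KEY (name)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- The issues table stores the name directly, so no join is needed to display it.
CREATE TABLE st_issues_nk (
  issue_id int(10) unsigned NOT NULL auto_increment,
  status varchar(20) NOT NULL,
  PRIMARY KEY (issue_id),
  KEY idx_status (status),
  FOREIGN KEY (status) REFERENCES st_statuses_nk (name) ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- The status name comes straight from the issues table.
SELECT issue_id, status FROM st_issues_nk;
```

The trade-off is a wider column in the large table in exchange for one fewer join per lookup, and ON UPDATE CASCADE keeps renamed statuses in sync.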

Bill Karwin