I'm getting an index scan on a join against a unique column. EXPLAIN claims to examine a large number of rows even when the query looks up just one row.

This is the query:

    select t.id, 
           t.twitter_id, 
           t.screen_name,  
           t.text     
      from tweets t 
inner join twitter_handle th on th.handle = t.screen_name 
  order by t.created_at desc 
     limit 1;

Adding or removing the limit clause doesn't change the query plan. I would expect it to scan the created_at index of tweets for as many rows as the limit clause specifies, then do an eq_ref lookup against twitter_handle.

The query plan according to explain, however, is:

    +----+-------------+-------+-------+---------------+-------------+---------+------+--------+----------------------------------------------+
    | id | select_type | table | type  | possible_keys | key         | key_len | ref  | rows   | Extra                                        |
    +----+-------------+-------+-------+---------------+-------------+---------+------+--------+----------------------------------------------+
    |  1 | SIMPLE      | th    | index | NULL          | handle      | 32      | NULL | 100126 | Using index; Using temporary; Using filesort |
    |  1 | SIMPLE      | t     | ref   | screen_name   | screen_name | 17      | func |      2 | Using where                                  |
    +----+-------------+-------+-------+---------------+-------------+---------+------+--------+----------------------------------------------+

Note the 100126 rows examined for the index scan on th, and ref=func for the second table in the join order.

This query is showing up in my slow query log, and I'm fairly baffled as to why MySQL is choosing to execute it this way.

The schema for these two tables:

    CREATE TABLE `twitter_handle` (
      `handle_id` int(11) NOT NULL AUTO_INCREMENT,
      `handle` varchar(30) CHARACTER SET ascii NOT NULL,
      `twitter_token_id` int(11) DEFAULT NULL,
      `name` varchar(255) CHARACTER SET utf8 DEFAULT NULL,
      `twitter_user_id` int(11) unsigned DEFAULT NULL,
      `location` varchar(100) CHARACTER SET utf8 DEFAULT NULL,
      `profile_image_url` varchar(255) CHARACTER SET utf8 DEFAULT NULL,
      `followers_count` int(11) DEFAULT NULL,
      `twitter_list_id` int(4) DEFAULT NULL,
      `last_update` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
      `bio` varchar(160) CHARACTER SET utf8 DEFAULT NULL,
      PRIMARY KEY (`handle_id`),
      UNIQUE KEY `handle` (`handle`),
      KEY `twitter_token_id` (`twitter_token_id`),
      KEY `twitter_user_id` (`twitter_user_id`)
    ) ENGINE=InnoDB;

    CREATE TABLE `tweets` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `twitter_id` char(15) DEFAULT NULL,
      `screen_name` varchar(15) NOT NULL,
      `logged_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
      `text` char(200) NOT NULL,
      `created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
      `status` enum('pending','processed','ignored','pending_delete','deleted','pending_tweet','preview') NOT NULL DEFAULT 'pending',
      `interaction_id` int(11) DEFAULT NULL,
      PRIMARY KEY (`id`),
      UNIQUE KEY `twitter_id_UNIQUE` (`twitter_id`),
      UNIQUE KEY `interaction_id_idx` (`interaction_id`),
      UNIQUE KEY `interaction_id` (`interaction_id`,`status`),
      KEY `screen_name` (`screen_name`,`created_at`),
      KEY `status_2` (`status`,`created_at`),
      KEY `created_at_2` (`created_at`)
    ) ENGINE=InnoDB;
A: 

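The reason is a character set mismatch: handle in twitter_handle is declared CHARACTER SET ascii, but screen_name in tweets declares no character set of its own and ends up as latin1. Because the two join columns differ, MySQL has to convert one side before comparing them, which is what ref=func in the plan is pointing at, and that appears to be what rules out the eq_ref lookup on the unique handle index.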
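
One way to confirm and fix this, as a rough sketch rather than a tested migration: check what character set each join column actually has, then align them. The ALTER below assumes every existing screen_name value is plain ASCII; converting the column rebuilds the table and its screen_name index, so try it on a copy first. Changing handle to latin1 instead would be the other direction to consider.

    -- Show the character set and collation of the two join columns.
    SELECT table_name, column_name, character_set_name, collation_name
      FROM information_schema.columns
     WHERE table_schema = DATABASE()
       AND ((table_name = 'twitter_handle' AND column_name = 'handle')
         OR (table_name = 'tweets' AND column_name = 'screen_name'));

    -- Assumption: all screen_name values are pure ASCII.
    -- Make tweets.screen_name match twitter_handle.handle.
    ALTER TABLE tweets
      MODIFY screen_name varchar(15) CHARACTER SET ascii NOT NULL;

    -- Re-check the plan; if the mismatch was the cause, ref=func should be gone.
    EXPLAIN
    select t.id, t.twitter_id, t.screen_name, t.text
      from tweets t
    inner join twitter_handle th on th.handle = t.screen_name
      order by t.created_at desc
         limit 1;

If the columns cannot be changed, comparing like with like is still the point: whichever column MySQL has to convert cannot be looked up through its index.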

ʞɔıu