I've been profiling some queries in an application I'm working on, and I came across a query that was retrieving more rows than necessary, the result set being trimmed down in the application code.
Changing a LEFT JOIN to an INNER JOIN trimmed the result set to just what was needed, and presumably would also be more performant (since less rows are selected). In reality, the LEFT JOIN'ed query was outperforming the INNER JOIN'ed, taking half the time to complete.
LEFT JOIN: (127 total rows, Query took 0.0011 sec)
INNER JOIN: (10 total rows, Query took 0.0024 sec)
(I ran the queries multiple times and those are averages).
Running EXPLAIN on both reveals nothing that explains the performance differences:
For the INNER JOIN:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE contacts index NULL name 302 NULL 235 Using where
1 SIMPLE lists eq_ref PRIMARY PRIMARY 4 contacts.list_id 1
1 SIMPLE lists_to_users eq_ref PRIMARY PRIMARY 8 lists.id,const 1
1 SIMPLE tags eq_ref PRIMARY PRIMARY 4 lists_to_users.tag_id 1
1 SIMPLE users eq_ref email_2 email_2 302 contacts.email 1 Using where
For the LEFT JOIN:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE contacts index NULL name 302 NULL 235 Using where
1 SIMPLE lists eq_ref PRIMARY PRIMARY 4 contacts.list_id 1
1 SIMPLE lists_to_users eq_ref PRIMARY PRIMARY 8 lists.id,const 1
1 SIMPLE tags eq_ref PRIMARY PRIMARY 4 lists_to_users.tag_id 1
1 SIMPLE users eq_ref email_2 email_2 302 contacts.email 1
And the query itself:
SELECT `contacts`.*, `lists`.`name` AS `group`, `lists`.`id` AS `group_id`, `lists`.`shared_yn`, `tags`.`name` AS `context`, `tags`.`id` AS `context_id`, `tags`.`color` AS `context_color`, `users`.`id` AS `user_id`, `users`.`avatar`
FROM `contacts`
LEFT JOIN `lists` ON lists.id=contacts.list_id
LEFT JOIN `lists_to_users` ON lists_to_users.list_id=lists.id AND lists_to_users.user_id='1' AND lists_to_users.creator='1'
LEFT JOIN `tags` ON tags.id=lists_to_users.tag_id
INNER JOIN `users` ON users.email=contacts.email
WHERE (contacts.user_id='1')
ORDER BY `contacts`.`name` ASC
(The clause that I'm talking about is the last INNER JOIN on the 'users' table)
The query runs on a MySQL 5.1 database, if it makes a difference.
Does anyone has a clue on why the LEFT JOIN'ed query outperforms the INNER JOIN'ed on in this case?
UPDATE: Due to Tomalak's suggestion that the small tables I'm using were making the INNER JOIN more complex, I'd created a test database with some mock data. The 'users' table is 5000 rows, and the contacts table is ~500,000 rows. The results are the same (also the timings haven't changed which is surprising when you consider that the tables are much bigger now).
I also ran ANALYZE and OPTIMIZE on the contacts table. Didn't make any discernible difference.