The only difference in these sql queries is the record_id param (it's how I page through entire result set). The tables are myisam. The first query performs well and the second is terribly slow. Any idea why this would be?
This query works fine
explain select r.record_id, r.oai_datestamp, r.format_id, r.status, x.xml, max(u.date_updated) as date_updated
from marcnormalization.records r,
marcnormalization.records_xml x,
marcnormalization.record_updates u
where r.record_id = x.record_id
and (r.record_id > 1802000 or 1802000 is null)
and r.record_id = u.record_id
and (u.date_updated > '1960-10-19 10:18:52.0' or '1960-10-19 10:18:52.0' is null)
and u.date_updated <= '2010-10-07 10:18:52.0'
group by u.record_id
order by u.record_id
limit 1000;
this query is super slow (creates a temporary table)
explain select r.record_id, r.oai_datestamp, r.format_id, r.status, x.xml, max(u.date_updated) as date_updated
from marcnormalization.records r,
marcnormalization.records_xml x,
marcnormalization.record_updates u
where r.record_id = x.record_id
and (r.record_id > 2202000 or 2202000 is null)
and r.record_id = u.record_id
and (u.date_updated > '1960-10-19 10:18:52.0' or '1960-10-19 10:18:52.0' is null)
and u.date_updated <= '2010-10-07 10:18:52.0'
group by u.record_id
order by u.record_id
limit 1000;
update: I've gotten through my issue by changing from
group by u.record_id
order by u.record_id
to
group by r.record_id
order by r.record_id
So, it's kind of a moot point now, but I'm still curious as to the initial question.