+3  Q: 

Optimize SQL query

I'm trying to optimize this slow query (>2s):

SELECT COUNT(*)
FROM crmentity c, mdcalls_trans_activity_update mtu, mdcalls_trans mt
WHERE (mtu.dept = 'GUN' OR  mtu.dept = 'gun') AND
      mtu.trans_code = mt.trans_code AND
      mt.activityid = c.crmid AND
      MONTH(mtu.ts) = 2 AND
      YEAR(mtu.ts) = YEAR(NOW()) AND
      c.deleted = 0 AND
      c.smownerid = 28

This is the output when I use EXPLAIN:

id: 1
select_type: SIMPLE
table: c
type: index_merge
possible_keys: PRIMARY,crmentity_smownerid_idx,crmentity_deleted_smownerid_idx,crmentity_smownerid_deleted_idx
key: crmentity_smownerid_idx,crmentity_deleted_smownerid_idx
key_len: 4,8
ref: NULL
rows: 91
Extra: Using intersect(crmentity_smownerid_idx,crmentity_deleted_smownerid_idx); Using where; Using index

id: 1
select_type: SIMPLE
table: mt
type: ref
possible_keys: activityid
key: activityid
key_len: 4
ref: pharex.c.crmid
rows: 60
Extra:

id: 1
select_type: SIMPLE
table: mtu
type: ref
possible_keys: dept_idx
key: dept_idx
key_len: 5
ref: const
rows: 1530
Extra: Using where

It's using the index I created (dept_idx) but it still takes more than 2 seconds to run the query against a dataset of 1,380,384 records. Is there another way of expressing this query in an optimal fashion?

UPDATE: Using David's suggestions, the query now runs in a few milliseconds instead of more than 2 seconds (actually, 51 seconds on version 5.0 of MySQL).

+2  A: 
  1. I would rewrite the query using explicit joins. It is clearer and gives the optimizer a better chance.
  2. MONTH(mtu.ts) = 2 AND YEAR(mtu.ts) = YEAR(NOW()): better to use mtu.ts BETWEEN .. AND ..
How would you rewrite this? Thanks again.
Francis
select count(*)
from crmentity c
inner join mdcalls_trans mt on mt.activityid = c.crmid
inner join mdcalls_trans_activity_update mtu on mtu.trans_code = mt.trans_code
where mtu.ts between '20100201' and '20100228'
  and mtu.dept in ('GUN', 'gun')
  and c.deleted = 0
  and c.smownerid = 28
Thanks for this example. I created a function in PHP to get the starting date of the month and the end date of the month and used it in the 'BETWEEN' statement.
Francis
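
One way to express the month filter as a range, sketched under the assumption that mtu.ts is a DATETIME or TIMESTAMP column: a half-open interval avoids having to know the last day of the month, and it also covers rows stamped after midnight on the final day, which an upper bound of '20100228' would leave out. The literal dates are just the February 2010 example from the comment above; in practice they would come from the application.

```sql
-- Sketch only: assumes mtu.ts is DATETIME/TIMESTAMP; the dates are the
-- February 2010 example and would normally be supplied by the application.
SELECT COUNT(*)
FROM crmentity c
INNER JOIN mdcalls_trans mt
        ON mt.activityid = c.crmid
INNER JOIN mdcalls_trans_activity_update mtu
        ON mtu.trans_code = mt.trans_code
WHERE mtu.ts >= '2010-02-01'
  AND mtu.ts <  '2010-03-01'   -- first day of the following month, exclusive
  AND mtu.dept IN ('GUN', 'gun')
  AND c.deleted = 0
  AND c.smownerid = 28;
```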
+6  A: 

What is the most selective part of the WHERE clause? That is, which condition removes the most potential items from the result set?

I'd guess it's the mtu.ts filter. If that's true, you should also index the mtu.ts column and try to constrain on this in a way that the index can be used; for example by using the BETWEEN operator.
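
As a rough sketch of that suggestion, an index covering the timestamp column might be created like this. The index names are made up for illustration, and whether the single-column or the composite variant helps more depends on the data, so it is worth comparing the EXPLAIN output for each.

```sql
-- Hypothetical index names; pick one variant and verify with EXPLAIN.
CREATE INDEX mtu_ts_idx ON mdcalls_trans_activity_update (ts);

-- Alternative: a composite index, so the filter on dept and the
-- range on ts can be served by the same index.
CREATE INDEX mtu_dept_ts_idx ON mdcalls_trans_activity_update (dept, ts);
```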

Other tips:

  • Attach join clauses directly to the join with JOIN ... ON (); this makes the query much easier to read, both for humans and for the optimizer.
  • Avoid calculating constants in the query, like YEAR(NOW()).
  • Avoid functions of selected columns in the WHERE clause, like MONTH(mtu.ts). This massively reduces the chances of an index being used.
  • Normalize your data to avoid casing problems like mtu.dept = 'GUN' OR mtu.dept = 'gun'; a single UPDATE mtu SET dept = lower(dept) and an appropriate CHECK dept = lower(dept) on the table will help avoid such madness.
David Schmitt
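
The normalization tip could look roughly like this. The UPDATE in the answer uses the alias mtu as shorthand; the sketch below assumes the real table is mdcalls_trans_activity_update, as in the question. Note that MySQL only enforces CHECK constraints from version 8.0.16 onward (older versions parse and ignore them), so on the 5.0 server mentioned in the question the application would have to keep the casing consistent.

```sql
-- One-off cleanup: store every department code in lower case.
UPDATE mdcalls_trans_activity_update
SET dept = LOWER(dept);

-- The filter then needs only a single comparison.
SELECT COUNT(*)
FROM mdcalls_trans_activity_update mtu
WHERE mtu.dept = 'gun';
```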
A: 

Could you change the text string to a number?

graham.reeds
A: 

The most obvious solution I can see would be to change COUNT(*) to cover just a single field name, otherwise your index might be next to useless!

daz-fuller
A: 

As a general principle, a good approach to analysing problems like this is to understand the data you're matching on and to appreciate its cardinality.

That is to say, order your query so that the most selective things happen first. What's more likely in your data: that dept = 'GUN', or that the userId would be 28?
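
To find out, each condition can be counted on its own; the one matching the fewest rows is the most selective. A quick sketch against the tables from the question:

```sql
-- Rows matching the department filter alone.
SELECT COUNT(*)
FROM mdcalls_trans_activity_update
WHERE dept IN ('GUN', 'gun');

-- Rows matching the owner filter alone.
SELECT COUNT(*)
FROM crmentity
WHERE smownerid = 28 AND deleted = 0;
```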

Lastly, have you considered joining to MT and MTU instead of filtering? It might make your query a lot faster, as you'll be limiting the amount of data that needs the date comparisons.

Russ C
Posted too fast, basically what David Schmitt and Burnall are saying!
Russ C