OK, here's a query that I am running right now on a table that has 45,000 records and is 65 MB in size. The table is only going to get bigger, so I have to think about future performance as well:

SELECT count(payment_id) as signup_count, sum(amount) as signup_amount
FROM payments p
WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
AND completed > 0
AND tm_completed IS NOT NULL
AND member_id NOT IN (
    SELECT p2.member_id
    FROM payments p2
    WHERE p2.completed=1
    AND p2.tm_completed < '2009-05-01'
    AND p2.tm_completed IS NOT NULL
    GROUP BY p2.member_id
)

As you might imagine, it chokes the MySQL server to a standstill.

What it does is simply pull the number of new users who signed up in that period: users who have at least one "completed" payment, whose tm_completed is not empty (it is only populated for completed payments), and who (per the embedded SELECT) have never had a "completed" payment before, meaning they are new members. The system also does rebills, so this is the only way to differentiate between an existing member who just got rebilled and a new member who got billed for the first time.

Is there any way to optimize this query so that it uses fewer resources and stops bringing my MySQL server to its knees?

Am I missing any info to clarify this any further? Let me know...

EDIT:

Here are the indexes already on that table:

Keyname       Type     Cardinality  Field
PRIMARY       PRIMARY  46757        payment_id
member_id     INDEX    23378        member_id
payer_id      INDEX    11689        payer_id
coupon_id     INDEX    1            coupon_id
tm_added      INDEX    46757        tm_added, product_id
tm_completed  INDEX    46757        tm_completed, product_id

+5  A: 

Those kinds of IN subqueries are a bit slow in MySQL. I would rephrase it like this:

SELECT COUNT(1) AS signup_count, SUM(amount) AS signup_amount
FROM payments p
WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
AND completed > 0
AND NOT EXISTS
(SELECT member_id
FROM payments
WHERE member_id = p.member_id
AND completed = 1
AND tm_completed < '2009-05-01')

The check 'tm_completed IS NOT NULL' is not necessary as that is implied by your BETWEEN condition.
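A quick way to convince yourself of this, as a minimal sketch you can paste into any MySQL client: a BETWEEN comparison against NULL evaluates to NULL rather than true, and WHERE only keeps rows for which the condition is true.

```sql
-- BETWEEN against NULL yields NULL (not 1), so a row with a NULL
-- tm_completed can never pass the date-range filter.
SELECT NULL BETWEEN '2009-05-01' AND '2009-05-30' AS null_in_range;

-- For comparison, a value inside the range yields 1.
SELECT '2009-05-15' BETWEEN '2009-05-01' AND '2009-05-30' AS date_in_range;
```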

Also make sure you have an index on:

(tm_completed, completed)
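
A minimal sketch of how that index might be added. The index names are hypothetical, and the second index is an assumption of this example rather than part of the recommendation above: it is intended to let the correlated subquery look up earlier payments by member_id without touching the table rows.

```sql
-- Index suggested above for the outer query's date-range filter.
-- idx_tm_completed_completed is a made-up name; use your own convention.
ALTER TABLE payments
  ADD INDEX idx_tm_completed_completed (tm_completed, completed);

-- Assumption, not from the answer: an index starting with member_id
-- may allow the NOT EXISTS subquery to be resolved from the index alone.
ALTER TABLE payments
  ADD INDEX idx_member_completed_tm (member_id, completed, tm_completed);
```

Extra indexes slow down writes a little, so it is worth checking with EXPLAIN that each one is actually used before keeping it.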
cletus
Beat me to the punch; +1 for speed
Todd Gardner
Wow... I didn't know it was just a slight change from what I already had, just replacing "IN" with "EXISTS"... thank you!
Crazy Serb
+1  A: 

Avoid using IN with a subquery; MySQL does not optimize these well (though there are pending optimizations in 5.4 and 6.0 regarding this; see here). Rewriting this as a join will probably get you a performance boost:

SELECT count(payment_id) as signup_count, sum(amount) as signup_amount
FROM payments p
LEFT JOIN (SELECT p2.member_id
          FROM payments p2
          WHERE p2.completed=1
          AND p2.tm_completed < '2009-05-01'
          AND p2.tm_completed IS NOT NULL
          GROUP BY p2.member_id) foo
ON p.member_id = foo.member_id
WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
AND completed > 0
AND tm_completed IS NOT NULL
-- the IS NULL test belongs in WHERE: inside ON it can never match, so nothing is filtered out
AND foo.member_id IS NULL

Second, I would have to see your table schema; are you using indexes?

Todd Gardner
+5  A: 

I really like both answers given so far.
Much respect to you two for your excellent responses in such a short time.

I had fun putting together this solution which does not require a subquery:

SELECT count(p1.payment_id) as signup_count
       , sum(p1.amount) as signup_amount  
  FROM payments p1
       LEFT OUTER JOIN payments p2 
       ON p1.member_id = p2.member_id
   AND
       p2.completed = 1
   AND 
       p2.tm_completed < '2009-05-01'
 WHERE p1.completed > 0
   AND
       p1.tm_completed between '2009-05-01' and '2009-05-30'
   AND
       p2.member_id is null
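
To make the anti-join logic concrete, here is a small worked sketch. The payments_demo table and its four rows are made up for illustration and contain only the columns the query touches; they are not the real schema.

```sql
-- Hypothetical, cut-down copy of the payments table.
CREATE TABLE payments_demo (
  payment_id   INT PRIMARY KEY,
  member_id    INT NOT NULL,
  amount       DECIMAL(10,2) NOT NULL,
  completed    TINYINT NOT NULL,
  tm_completed DATETIME NULL
);

INSERT INTO payments_demo VALUES
  (1, 10, 20.00, 1, '2009-04-10 00:00:00'),  -- member 10 first paid in April
  (2, 10, 20.00, 1, '2009-05-10 00:00:00'),  -- member 10 rebilled in May
  (3, 20, 30.00, 1, '2009-05-12 00:00:00'),  -- member 20 paid for the first time in May
  (4, 30, 15.00, 0, NULL);                   -- member 30 never completed a payment

-- Same shape as the query above: join each May payment to any earlier
-- completed payment by the same member, then keep rows with no match.
SELECT COUNT(p1.payment_id) AS signup_count,
       SUM(p1.amount)       AS signup_amount
  FROM payments_demo p1
       LEFT OUTER JOIN payments_demo p2
       ON  p1.member_id = p2.member_id
       AND p2.completed = 1
       AND p2.tm_completed < '2009-05-01'
 WHERE p1.completed > 0
   AND p1.tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
   AND p2.member_id IS NULL;
```

With these rows, payment 2 is dropped because it matches the April payment of the same member, payment 4 fails the completed and date filters, and only payment 3 remains, so the result is signup_count = 1 and signup_amount = 30.00.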
Adam Bernier
This technique is reliably effective, especially in MySQL (which has historically had trouble with subqueries).
le dorfier
I liked this answer too... apparently, when running EXPLAIN on both of the answers I picked here, I get the same performance/resource usage (about 12,000 times faster computation than when using the "IN" subquery). Awesome! Thank you...
Crazy Serb
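
For anyone who wants to repeat the comparison mentioned in the last comment, a minimal sketch: prefix each candidate query with EXPLAIN and compare the output. Note that EXPLAIN reports the execution plan (for example the `type`, `key` and `rows` columns), not timings, so actual speed still has to be measured by running the queries.

```sql
-- Plan for the NOT EXISTS version from the first answer.
EXPLAIN
SELECT COUNT(1) AS signup_count, SUM(amount) AS signup_amount
FROM payments p
WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
AND completed > 0
AND NOT EXISTS
(SELECT member_id
FROM payments
WHERE member_id = p.member_id
AND completed = 1
AND tm_completed < '2009-05-01');
```

Running the same statement with the original NOT IN query in place of this one shows which indexes each form uses and how many rows MySQL expects to examine.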