ansaurus

Question

Answer 1

A:

Have you tried moving the DATE_SUB(:startDate, INTERVAL 1 MONTH) out of the statement and into a variable? Do you have an index on UserClicks.Date?

Jose Chama 2010-01-13 18:39:14

Answer 2

A:

Why not just use one select statement instead of running a nested pair of selects? Right now you're essentially running two queries. Try this:

SELECT COUNT(DISTINCT UserClicks.User_ID) AS total
FROM UserClicks
WHERE (UserClicks.Date BETWEEN :startDate AND :endDate)
AND (UserClicks.Date BETWEEN DATE_SUB(:startDate, INTERVAL 1 MONTH) AND :startDate)

Might help if you add an index on the date column too:

ALTER TABLE  `UserClicks` ADD INDEX (  `Date` );

Parrots 2010-01-13 18:41:46

This will not return what the original query returns.

Quassnoi 2010-01-13 18:46:05

What do you mean by adding an index? Can you show that too?

Andrew 2010-01-13 18:46:11

@Quassnoi How are the queries going to differ, result-wise? I'm having a hard time seeing the difference. The nested ones are basically saying "get all the people between start and end date", then "from that, get all the people between start date minus one month and start date". How is that different from just an AND operation?

Parrots 2010-01-13 18:48:38

The asker's query returns the count of users that clicked *both* in the month before `start_date` and between `start_date` and `end_date`. Your query returns the number of users that clicked *exactly* on `start_date`.

Quassnoi 2010-01-13 18:50:14

`@Parrots` In other words, the asker's query does a join and your query does an intersect. These are different operations.

Quassnoi 2010-01-13 18:51:30

Good catch, +1 your answer.

Parrots 2010-01-13 18:53:54
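
The difference discussed in these comments can be checked on a tiny data set. What follows is only a sketch: it uses Python's built-in `sqlite3` in place of MySQL, so `date(:startDate, '-1 month')` stands in for `DATE_SUB(:startDate, INTERVAL 1 MONTH)`, and the four sample rows are invented for illustration. It runs the single-SELECT version next to an EXISTS query that checks both date windows.

```python
import sqlite3

# Assumption: SQLite replaces MySQL, and the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserClicks (User_ID INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO UserClicks VALUES (?, ?)",
    [
        (1, "2009-12-20"),  # user 1: a click in the month before the start date ...
        (1, "2010-01-10"),  # ... and another one inside the start/end range
        (2, "2010-01-01"),  # user 2: a single click exactly on the start date
        (3, "2010-01-15"),  # user 3: only inside the start/end range
    ],
)
params = {"startDate": "2010-01-01", "endDate": "2010-01-31"}

# Both conditions apply to the SAME row, so only clicks made
# exactly on the start date can satisfy them.
single_select = """
SELECT COUNT(DISTINCT User_ID) FROM UserClicks
WHERE Date BETWEEN :startDate AND :endDate
  AND Date BETWEEN date(:startDate, '-1 month') AND :startDate
"""

# Each condition may be satisfied by a DIFFERENT row of the same user.
both_windows = """
SELECT COUNT(*) FROM (
    SELECT DISTINCT User_ID FROM UserClicks
    WHERE Date BETWEEN date(:startDate, '-1 month') AND :startDate
) u1
WHERE EXISTS (
    SELECT NULL FROM UserClicks u2
    WHERE u2.User_ID = u1.User_ID
      AND u2.Date BETWEEN :startDate AND :endDate
)
"""

single_count = conn.execute(single_select, params).fetchone()[0]
both_count = conn.execute(both_windows, params).fetchone()[0]
print(single_count, both_count)
```

With this sample data the single SELECT counts only user 2, while the two-window query counts users 1 and 2. User 2 appears in both results because BETWEEN is inclusive, so a click exactly on the start date falls into both ranges.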

Answer 3

+2 A:

SELECT  COUNT(*) AS total
FROM    (
        SELECT  DISTINCT User_ID 
        FROM    UserClicks 
        WHERE   Date BETWEEN DATE_SUB(:startDate, INTERVAL 1 MONTH) AND :startDate
        ) u1
WHERE   EXISTS
        (
        SELECT  NULL
        FROM    UserClicks u2
        WHERE   u2.User_ID = u1.User_ID
                AND u2.Date BETWEEN :startDate AND :endDate
        )

Create a composite index on (User_ID, Date):

CREATE INDEX ix_userclicks_user_date ON UserClicks (User_ID, Date)

If you have few users but lots of clicks, and have a table Users, you may use the Users table instead of DISTINCT:

SELECT  COUNT(*)
FROM    Users u
WHERE   EXISTS
        (
        SELECT  NULL
        FROM    UserClicks uc1
        WHERE   uc1.User_ID = u.Id
                AND uc1.Date BETWEEN DATE_SUB(:startDate, INTERVAL 1 MONTH) AND :startDate
        )
        AND EXISTS
        (
        SELECT  NULL
        FROM    UserClicks uc2
        WHERE   uc2.User_ID = u.Id
                AND uc2.Date BETWEEN :startDate AND :endDate
        )

Quassnoi 2010-01-13 18:45:33

What do I change after creating the composite index?

Andrew 2010-01-13 19:31:51

Also, does the composite index need to be unique? (Sorry if it's a stupid question.)

Andrew 2010-01-13 19:33:25

The composite index will help the query run faster (especially the second query).

Quassnoi 2010-01-13 19:33:38

No, it does not have to be unique. However, if it is intrinsically `UNIQUE` (that is, you cannot have two clicks from one user at the same time), you can make it `UNIQUE`.

Quassnoi 2010-01-13 19:34:39

So just by creating the index, the query runs faster?

Andrew 2010-01-13 19:40:23

`@Andrew`: yes. The second query will be the fastest.

Quassnoi 2010-01-13 19:53:53
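
To illustrate the point that nothing in the query changes once the index exists, here is a small sketch. It assumes SQLite (via Python's `sqlite3`) instead of MySQL, invented sample rows, and `date(:startDate, '-1 month')` as a stand-in for `DATE_SUB`. It runs the same query text before and after `CREATE INDEX`, then lists the table's indexes and the query plan SQLite reports. The plan output is specific to SQLite; on MySQL you would look at `EXPLAIN` instead.

```python
import sqlite3

# Assumption: SQLite replaces MySQL, and the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserClicks (User_ID INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO UserClicks VALUES (?, ?)",
    [(1, "2009-12-20"), (1, "2010-01-10"), (2, "2010-01-01"), (3, "2010-01-15")],
)
params = {"startDate": "2010-01-01", "endDate": "2010-01-31"}

query = """
SELECT COUNT(*) FROM (
    SELECT DISTINCT User_ID FROM UserClicks
    WHERE Date BETWEEN date(:startDate, '-1 month') AND :startDate
) u1
WHERE EXISTS (
    SELECT NULL FROM UserClicks u2
    WHERE u2.User_ID = u1.User_ID
      AND u2.Date BETWEEN :startDate AND :endDate
)
"""

def count_users():
    """Run the unchanged query text and return its single count value."""
    return conn.execute(query, params).fetchone()[0]

before = count_users()
# A plain (non-unique) composite index; the query text stays the same.
conn.execute("CREATE INDEX ix_userclicks_user_date ON UserClicks (User_ID, Date)")
after = count_users()

# PRAGMA index_list rows are (seq, name, unique, origin, partial).
indexes = {row[1]: row[2] for row in conn.execute("PRAGMA index_list('UserClicks')")}
# The last column of each EXPLAIN QUERY PLAN row is a readable description.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, params)]

print(before, after, indexes)
print("\n".join(plan))
```

The result is identical before and after; only the access path the database may choose differs. The `unique` flag of 0 shows that the index works without being declared `UNIQUE`.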

Answer 4

A:

MySQL tends to ignore indexes when processing subqueries, so it has to process every row. How about a self-join instead? This is just off the top of my head so it may not be quite correct, but it should at least point you in the right direction.

SELECT COUNT(DISTINCT u1.User_ID) AS total
FROM   UserClicks AS u1
JOIN   UserClicks AS u2 USING (User_ID)
WHERE  u1.Date BETWEEN :startDate AND :endDate
AND    u2.Date BETWEEN DATE_SUB(:startDate, INTERVAL 1 MONTH) AND :startDate

Duncan 2010-01-14 00:23:43
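
Since the self-join was written off the top of the head, a quick sanity check against an EXISTS formulation may be useful. This is a sketch under assumptions: SQLite through Python's `sqlite3` rather than MySQL, `date(:startDate, '-1 month')` in place of `DATE_SUB`, and invented sample rows. User 1 has two clicks in the later window on purpose, to show why `COUNT(DISTINCT ...)` is needed in the join version.

```python
import sqlite3

# Assumption: SQLite replaces MySQL, and the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserClicks (User_ID INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO UserClicks VALUES (?, ?)",
    [
        (1, "2009-12-20"),
        (1, "2010-01-10"),
        (1, "2010-01-12"),  # second click in the later window for user 1
        (2, "2010-01-01"),
        (3, "2010-01-15"),
    ],
)
params = {"startDate": "2010-01-01", "endDate": "2010-01-31"}

join_condition = """
FROM   UserClicks AS u1
JOIN   UserClicks AS u2 USING (User_ID)
WHERE  u1.Date BETWEEN :startDate AND :endDate
AND    u2.Date BETWEEN date(:startDate, '-1 month') AND :startDate
"""

exists_query = """
SELECT COUNT(*) FROM (
    SELECT DISTINCT User_ID FROM UserClicks
    WHERE Date BETWEEN date(:startDate, '-1 month') AND :startDate
) p
WHERE EXISTS (
    SELECT NULL FROM UserClicks c
    WHERE c.User_ID = p.User_ID
      AND c.Date BETWEEN :startDate AND :endDate
)
"""

def scalar(sql):
    """Execute a query that returns one row with one column."""
    return conn.execute(sql, params).fetchone()[0]

# The join yields one row per matching pair of clicks, so users repeat.
joined_rows = scalar("SELECT COUNT(*) " + join_condition)
self_join_count = scalar("SELECT COUNT(DISTINCT u1.User_ID) " + join_condition)
exists_count = scalar(exists_query)
print(joined_rows, self_join_count, exists_count)
```

On this data the join produces three row pairs but two distinct users, which matches the EXISTS count. Agreement on one small sample is not a proof of equivalence, and it says nothing about which form is faster on a real table.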


How to optimize this SQL select query?
