ansaurus

Question

Answer 1

+5 A:

You are using a range condition on the first index column, which rules out using the index to filter on the other columns.

There is no single contiguous range in this index which would contain those and only those records that satisfy the condition.

MySQL is not able to do a SKIP SCAN, which would jump over the distinct values of di_date. That's why it does its best: it uses range access to filter on di_date and uses WHERE to filter on all other fields.

Either recreate the index like this (the best option):

PRIMARY KEY  (`di_sid`,`di_type`,`di_name`,`di_date`,`di_abt`)
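A minimal sketch of how the key could be rebuilt in a single statement, assuming the table is daily_info (the name used in the queries in this answer) and that dropping and re-adding the primary key is acceptable for your data volume:

```sql
-- Assumption: the table is daily_info and its existing PRIMARY KEY may be dropped.
-- Columns filtered by equality (di_sid, di_type, di_name) come first,
-- so the range condition on di_date maps to one contiguous slice of the index.
ALTER TABLE daily_info
    DROP PRIMARY KEY,
    ADD PRIMARY KEY (di_sid, di_type, di_name, di_date, di_abt);
```

Doing both steps in one ALTER TABLE avoids leaving the table without a primary key in between.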

or, if you're unable to recreate the index, you can emulate the SKIP SCAN:

SELECT  MONTH(di.di_date) as label1, DAYOFMONTH(di.di_date) as label2, sum(di.di_num) as count , di.di_abt as abt
FROM    (
        SELECT  DISTINCT di_date
        FROM    daily_info
        WHERE   di_date > '2009-10-01' AND di_date < '2009-10-16'
        ) do
JOIN    daily_info di
ON      di.di_date <= do.di_date
        AND di.di_date>= do.di_date
        AND di_sid = 6
        AND di_type = 4
        AND di_name = 'clk-1'
GROUP BY
        DAYOFMONTH(di.di_date)
ORDER BY
        TO_DAYS(di.di_date) DESC

Make sure that both "Using index for group-by" and "Range checked for each record" appear in the plan.
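One way to check this is to prefix the query with EXPLAIN and read the Extra column of the output. A sketch, reusing the query above unchanged:

```sql
-- Look at the Extra column of the two output rows:
-- the derived table should report "Using index for group-by",
-- and di should report "Range checked for each record".
EXPLAIN
SELECT  MONTH(di.di_date) as label1, DAYOFMONTH(di.di_date) as label2, sum(di.di_num) as count , di.di_abt as abt
FROM    (
        SELECT  DISTINCT di_date
        FROM    daily_info
        WHERE   di_date > '2009-10-01' AND di_date < '2009-10-16'
        ) do
JOIN    daily_info di
ON      di.di_date <= do.di_date
        AND di.di_date>= do.di_date
        AND di_sid = 6
        AND di_type = 4
        AND di_name = 'clk-1'
GROUP BY
        DAYOFMONTH(di.di_date)
ORDER BY
        TO_DAYS(di.di_date) DESC;
```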

This condition:

di.di_date <= do.di_date
AND di.di_date >= do.di_date

is used instead of the simple di.di_date = do.di_date to force the range checking.

See this article in my blog for a more detailed explanation of emulating SKIP SCAN:

Emulating SKIP SCAN

Update:

The latter query actually uses an equijoin and MySQL optimizes it without the tricks.

The trick above applies only to ranged queries, i.e. when the innermost loop should use range access, not ref access.

It would be useful if you had to do something like di_name <= 'clk-1'.

This query should work fine:

SELECT  MONTH(di.di_date) as label1, DAYOFMONTH(di.di_date) as label2, sum(di.di_num) as count , di.di_abt as abt
FROM    (
        SELECT  DISTINCT di_date
        FROM    daily_info
        WHERE   di_date > '2009-10-01' AND di_date < '2009-10-16'
        ) do
JOIN    daily_info di
ON      di.di_date = do.di_date
        AND di_sid = 6
        AND di_type = 4
        AND di_name = 'clk-1'
GROUP BY
        DAYOFMONTH(di.di_date)
ORDER BY
        TO_DAYS(di.di_date) DESC

Make sure that di uses ref access on the longest key prefix possible here, with key_len = 33.
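To verify this, the same query can be run under EXPLAIN. In the row for di, the type column should read ref and key_len should read 33 (the figure quoted above, which depends on the column types of this particular table):

```sql
-- Check the row for alias di: type = ref, key = PRIMARY, key_len = 33.
EXPLAIN
SELECT  MONTH(di.di_date) as label1, DAYOFMONTH(di.di_date) as label2, sum(di.di_num) as count , di.di_abt as abt
FROM    (
        SELECT  DISTINCT di_date
        FROM    daily_info
        WHERE   di_date > '2009-10-01' AND di_date < '2009-10-16'
        ) do
JOIN    daily_info di
ON      di.di_date = do.di_date
        AND di_sid = 6
        AND di_type = 4
        AND di_name = 'clk-1'
GROUP BY
        DAYOFMONTH(di.di_date)
ORDER BY
        TO_DAYS(di.di_date) DESC;
```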

Update 2

In your query, you are using these expressions outside the GROUP BY:

MONTH(di_date)
TO_DAYS(di_date)
di_abt

The query as it is now will sum all values for the 1st, 2nd, etc. of any month and year.

I.e. for the first group it will add up all values from Jan 1st, 2000, then Feb 1st, 2000, etc.

Then it will return an arbitrary value of MONTH, an arbitrary value of TO_DAYS and an arbitrary value of di_abt from each group.

Your condition is currently within a single month, so it's OK for now, but if your condition spans multiple months (to say nothing of years), the query will produce unexpected results.

Do you really want to group by dates?
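If grouping by dates is what you want, a sketch could look like the following. It assumes di_date is a DATE column (with DATETIME you would group by DATE(di_date) instead) and that one row per date and di_abt is the intended result. If di_abt should not split the groups, drop it from both the SELECT and the GROUP BY.

```sql
-- Sketch under the assumptions above: every selected expression is either
-- aggregated or derived from a grouped column, so nothing is picked arbitrarily.
SELECT  MONTH(di_date) AS label1,
        DAYOFMONTH(di_date) AS label2,
        SUM(di_num) AS count,
        di_abt AS abt
FROM    daily_info
WHERE   di_sid = 6
        AND di_type = 4
        AND di_name = 'clk-1'
        AND di_date > '2009-10-01' AND di_date < '2009-10-16'
GROUP BY
        di_date, di_abt
ORDER BY
        di_date DESC;
```

Grouping by the full date keeps Oct 1st and Nov 1st in separate groups when the range spans more than one month.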

Quassnoi 2009-10-15 15:48:15

Thank you Quassnoi.

Nir 2009-10-15 16:14:14

Indeed, I tried the 1st option and got a filesort. I'll try the subselect option. Should I revert the index to the original one?

Nir 2009-10-15 17:58:37

You will get a filesort anyway, there is no way to get rid of it in this exact query. And I just noticed a little flaw in your query, see the post update.

Quassnoi 2009-10-15 19:55:20

Answer 2

A:

You are range-scanning the first part of the index, so MySQL cannot use the subsequent parts of the index.

The way to improve this is to create another index with the fields in a different order which is more conducive to this particular query.

If your index were (di_sid, di_type, di_date), it might work better.
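A sketch of adding such an index alongside the existing key, assuming the table is daily_info; the index name ix_sid_type_date is made up for illustration:

```sql
-- ix_sid_type_date is a hypothetical name; pick whatever fits your naming scheme.
-- The equality columns lead and the range column di_date comes last.
CREATE INDEX ix_sid_type_date
    ON daily_info (di_sid, di_type, di_date);
```

Since di_name is not part of this index, a condition on it would still be checked row by row after the index lookup.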

MarkR 2009-10-15 15:48:42


Why doesn't this index work (Mysql)
