Here's the query:

SELECT 
  count(id) AS count 
FROM `numbers` 
GROUP BY 
  MONTH(created_at), 
  YEAR(created_at) 
ORDER BY 
  YEAR(created_at), 
  MONTH(created_at)

That query throws a 'Using temporary' and 'Using filesort' when doing EXPLAIN.

Ultimately what I'm doing is looking at a table of user-submitted tracking numbers, counting the number of submitted rows, and grouping the counts by month/year.

For example, in November 2008 there were 11,312 submitted rows.

UPDATE, here's the DESCRIBE for the numbers table.

Field               Type          Null  Key  Default  Extra
id                  int(11)       NO    PRI  NULL     auto_increment
tracking            varchar(255)  YES        NULL
service             varchar(255)  YES        NULL
notes               text          YES        NULL
user_id             int(11)       YES        NULL
active              tinyint(1)    YES        1
deleted             tinyint(1)    YES        0
feed                text          YES        NULL
status              varchar(255)  YES        NULL
created_at          datetime      YES        NULL
updated_at          datetime      YES        NULL
scheduled_delivery  date          YES        NULL
carrier_service     varchar(255)  YES        NULL
A: 
SELECT
    count(`id`) AS count, MONTH(`created_at`) as month, YEAR(`created_at`) as year
FROM `numbers`
GROUP BY year, month
ORDER BY year, month

This will be the best you can get, as far as I can tell. I created a table with an id and a datetime column and filled it with 10000 rows. The sub-select query from the other answer doesn't really behave any differently, and it adds the overhead of a sub-select. The resulting time for mine was 0.015s and his was 0.016s.

Make sure that you have an index on created_at; this will help your initial query. It is pretty rare to avoid a filesort once a GROUP BY is involved, but it may be possible in other situations. MySQL's docs have an article about this if you feel so inclined. I do not see how those methods can be applied here with the information you have provided.
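
For reference, a minimal sketch of adding that index and re-checking the plan. The index name idx_numbers_created_at is made up for illustration. Because the grouping is still on expressions, 'Using temporary' and 'Using filesort' may well remain; the index mainly gives MySQL something narrower than the full table to read.

```sql
-- Hypothetical index name; any name works.
CREATE INDEX idx_numbers_created_at ON numbers (created_at);

-- Re-run EXPLAIN on the original query to compare the plan.
EXPLAIN
SELECT COUNT(id) AS count
FROM numbers
GROUP BY YEAR(created_at), MONTH(created_at)
ORDER BY YEAR(created_at), MONTH(created_at);
```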

Kevin Peno
Kevin, I updated the post with the DESCRIBE. Thanks!
Shpigford
+1  A: 

Give this a shot:

  SELECT COUNT(x.id)
    FROM (SELECT t.id,
                 MONTH(t.created_at) AS created_month,
                 YEAR(t.created_at) AS created_year
            FROM numbers t) x
GROUP BY x.created_year, x.created_month
ORDER BY x.created_year, x.created_month

It's not a good habit to use functions in the WHERE, GROUP BY and ORDER BY clauses because indexes can't be used.
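
To illustrate the point about functions in WHERE, here is a sketch that counts one month two ways. The November 2008 dates are taken from the question; the second form assumes an index on created_at exists.

```sql
-- Wrapping the column in functions hides it from an index on created_at:
SELECT COUNT(*)
FROM numbers
WHERE YEAR(created_at) = 2008
  AND MONTH(created_at) = 11;

-- The same month as a half-open range on the bare column,
-- which lets MySQL use a range scan on that index:
SELECT COUNT(*)
FROM numbers
WHERE created_at >= '2008-11-01'
  AND created_at <  '2008-12-01';
```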

...query throws a 'Using temporary' and 'Using filesort' when doing EXPLAIN.

From what I found, that's to be expected when using DISTINCT/GROUP BY.

OMG Ponies
I assume you did the sub query simply to reduce the number of columns in the result to count?
Kevin Peno
A: 

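Make sure you have a single index covering both the year and the month (that is, both fields within the same index) so that the ORDER BY component of your query can use an index. Since YEAR(created_at) and MONTH(created_at) are computed values, this only works if they exist as indexable columns of their own. This should remove the need for a filesort, although a temporary table may still be needed to handle the grouping.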
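
One possible way to get such an index, sketched under the assumption of MySQL 5.7 or later (generated columns are not available in older versions). The column names created_year and created_month and the index name are invented for this example.

```sql
-- Materialize year and month as stored generated columns,
-- then index both in a single composite index.
ALTER TABLE numbers
  ADD COLUMN created_year  SMALLINT AS (YEAR(created_at))  STORED,
  ADD COLUMN created_month TINYINT  AS (MONTH(created_at)) STORED,
  ADD INDEX idx_numbers_year_month (created_year, created_month);

-- Group and order on the indexed columns, in index order.
SELECT created_year, created_month, COUNT(*) AS count
FROM numbers
GROUP BY created_year, created_month
ORDER BY created_year, created_month;
```

On older servers the same idea needs two ordinary columns kept up to date by the application or by triggers.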

DavidWinterbottom
A: 

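Whenever MySQL has to do work in memory, and that work exceeds the available amount, it starts having to use the disk to store temporary work. For the implicit temporary table behind a GROUP BY, that limit is the smaller of tmp_table_size and max_heap_table_size, not innodb_buffer_pool_size, which only caches InnoDB data and index pages. You could increase those variables, but setting them too high could cause performance problems in other areas.

A large enough buffer pool still helps the table scan itself. If you're running a dedicated server, set innodb_buffer_pool_size to ~50-75% of RAM.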
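
Before changing anything, it may be worth checking the current setting and whether temporary tables are actually spilling to disk. A small sketch using standard server variables and status counters:

```sql
-- Current buffer pool size, in bytes.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Implicit temporary tables created since startup,
-- and how many of those had to be written to disk.
SHOW GLOBAL STATUS LIKE 'Created_tmp_tables';
SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';
```

Running the query and comparing the counters before and after gives a rough idea of whether it goes to disk.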

Monkey Boson
A: 

The best method would be creating a helper column that would contain the numeric values of YEAR and MONTH combined into a single number:

YEAR(created_at) * 100 + MONTH(created_at)

With an index on this column, the grouping can read the groups straight from the index in order, instead of building a temporary table and sorting it.
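
A rough sketch of that helper column. The names created_ym and idx_numbers_created_ym are made up here, and the column has to be kept in sync for new rows (by the application or a trigger), which this sketch does not cover.

```sql
-- Add the helper column and backfill it, e.g. 200811 for November 2008.
ALTER TABLE numbers ADD COLUMN created_ym INT NULL;

UPDATE numbers
SET created_ym = YEAR(created_at) * 100 + MONTH(created_at);

-- Index it so the grouping can follow the index order.
CREATE INDEX idx_numbers_created_ym ON numbers (created_ym);

SELECT created_ym, COUNT(*) AS count
FROM numbers
GROUP BY created_ym
ORDER BY created_ym;
```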

Alternatively, you can create two helper tables, the first containing a reasonable number of years (say, from 1900 to 2100) and the second containing months (from 0 to 11), and use these tables to generate the sets:

SELECT  y, m + 1 AS month,
        (
        SELECT  COUNT(*)
        FROM    numbers
        -- MAKEDATE(y, 1) is January 1 of year y; m is a 0-based month offset
        WHERE   created_at >= MAKEDATE(y, 1) + INTERVAL m MONTH
                AND created_at < MAKEDATE(y, 1) + INTERVAL (m + 1) MONTH
        ) AS count
FROM    year_table
CROSS JOIN
        month_table
WHERE   y BETWEEN 2008 AND 2010
ORDER BY y, m
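
The query above assumes year_table and month_table already exist with columns y and m. A minimal sketch of creating them; the column types are a guess, and only the three years used in the query are loaded here rather than the full 1900 to 2100 range.

```sql
CREATE TABLE year_table  (y INT NOT NULL PRIMARY KEY);
CREATE TABLE month_table (m INT NOT NULL PRIMARY KEY);

-- Sample years only; extend the list as needed.
INSERT INTO year_table (y) VALUES (2008), (2009), (2010);

-- Month offsets 0 through 11 (0 = January).
INSERT INTO month_table (m) VALUES
  (0), (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11);
```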
Quassnoi
A: 

I'm sorry, but I have to disagree with the other answers.
I think what you need is to add an index to your table, preferably a covering index.

If you add an index on the column you are searching on (created_at) and also on the column you want to get a result from (id), then it will be dramatically faster than before.
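
As a sketch, such a covering index could look like the following; the index name is invented. On InnoDB the primary key is stored in every secondary index anyway, so listing id explicitly mostly matters on other storage engines.

```sql
-- Composite index holding everything the count query reads.
CREATE INDEX idx_numbers_created_at_id ON numbers (created_at, id);
```

When the index covers the query, EXPLAIN typically reports 'Using index' in the Extra column.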

The reason you get a temp table is the GROUP BY.
To speed up the GROUP BY, you can change the MySQL server settings to increase the temp table size (tmp_table_size) and the max heap table size (max_heap_table_size) so that the temp table stays in memory.
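
A small sketch of inspecting and raising those two settings for the current session. The 64 MB value is purely illustrative, not a recommendation; MySQL uses the smaller of the two as the in-memory limit, so both are raised together.

```sql
-- Current limits, in bytes.
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';

-- Raise both for this session only (illustrative value).
SET SESSION tmp_table_size      = 64 * 1024 * 1024;
SET SESSION max_heap_table_size = 64 * 1024 * 1024;
```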

Jonathan