ansaurus

Question

mysql simple aggregate of subquery results is slow

Answer 1

+1 A:

You probably are missing proper indexes.

EDIT:

Your query is slow because the subquery's result doesn't fit into memory, so a temporary table on disk is being used.

So you would benefit from an index on (account_id, created). If the optimizer uses it, it avoids the on-disk temporary table for the subquery.

ALTER TABLE sales ADD INDEX ix_acc_cre (account_id, created)
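One way to check whether the index changes anything is to run `EXPLAIN` before and after adding it. The original query is not quoted in this thread, so the statement below is only a sketch that assumes the inner aggregation looks like the one shown in Answer 3.

```sql
-- Assumed shape of the inner aggregation (reconstructed, not the asker's exact query).
EXPLAIN
SELECT account_id, COUNT(*) AS thecnt
FROM sales
WHERE created >= SUBDATE(CURRENT_DATE(), INTERVAL 60 DAY)
GROUP BY account_id;
```

In the output, the `key` column shows which index (if any) was chosen, and `Using temporary; Using filesort` in the `Extra` column indicates that the temporary table and sort are still happening.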

Imre L 2010-08-27 22:37:29

Why would I want to add an index if I plan on scanning the entire table? Figure that there are 70 days' worth of data in the table and I'm interested in the most recent 60. I should have been more specific when I said I plan on a full table scan. Unless MySQL uses different access paths compared to Oracle...

Neil Kodner 2010-08-27 23:05:41

Right now MySQL has to do lots of sorting, and since indexes are already sorted, this can be avoided. If you **insist** on not using indexes, then change your query so that `from sales` becomes `from sales use index ()`. This effectively disables the use of any indexes.
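As a rough illustration of that hint, here is the aggregation with an empty index list. The query shape is an assumption borrowed from Answer 3, since the asker's own query is not quoted here.

```sql
-- USE INDEX () with an empty list tells MySQL to use no indexes for this table.
SELECT account_id, COUNT(*) AS thecnt
FROM sales USE INDEX ()
WHERE created >= SUBDATE(CURRENT_DATE(), INTERVAL 60 DAY)
GROUP BY account_id;
```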

Imre L 2010-08-28 09:44:26

Answer 2

+1 A:

I don't see anything particularly wrong with your query. It is slow because it needs to use temporary tables and a filesort. The only way to seriously speed up this query is to modify your MySQL settings to allocate more memory, so that these operations avoid the disk. Here's a spot-on article covering the pertinent settings.

Edit: Once you do this, you can also save memory by specifying an exact column to count instead of COUNT(*), and a few other minor tweaks, as some of the others have mentioned. You want to get as small a data set as necessary to make the most of your memory. But I think the overall issue won't go away unless you allocate more memory.

wuputah 2010-08-28 01:44:56

OK. I'll change my query to count the number of orders instead of the whole row. I need to look at how MySQL optimizes queries a little more closely. With non-privileged access, is there any way I can check the memory settings via MySQL's data dictionary? I don't have a shell account on the box.

Neil Kodner 2010-08-28 12:08:50

You can use `SHOW GLOBAL VARIABLES` to view the current settings.
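For example, the output can be narrowed with `LIKE`. Which settings matter most for this particular query is an assumption on my part; the ones below are the usual suspects for temporary tables and sorting.

```sql
-- Limits for in-memory temporary tables; the smaller of the two applies.
SHOW GLOBAL VARIABLES LIKE 'tmp_table_size';
SHOW GLOBAL VARIABLES LIKE 'max_heap_table_size';

-- Buffer allocated per session for sorts.
SHOW GLOBAL VARIABLES LIKE 'sort_buffer_size';

-- Counters: compare Created_tmp_disk_tables against Created_tmp_tables.
SHOW GLOBAL STATUS LIKE 'Created_tmp%tables';
```

If `Created_tmp_disk_tables` keeps climbing when the query runs, the temporary table is probably spilling to disk.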

wuputah 2010-08-29 00:21:14

Answer 3

A:

An index can be useful in a full table scan if MySQL can extract the data out of the index instead of looking at the actual rows. You shouldn't need the subquery here:

SELECT COUNT(account_id) AS thecnt,
       IF(COUNT(account_id) < 10, COUNT(account_id), 'tenormore')
  FROM sales
 WHERE created >= SUBDATE(CURRENT_DATE(), INTERVAL 60 DAY)
 GROUP BY account_id
 ORDER BY thecnt DESC;

Hope this helps.

Joshua Martell 2010-08-28 03:24:46

If your primary key starts with account_id (account_id, created, ...), the on-disk ordering of the rows will closely match the grouping, so the filesort won't have as much to do. You might also consider using triggers to keep a summary table up to date with order counts.
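A minimal sketch of the summary-table idea follows. The table name `sales_daily_counts`, the trigger name, and the column types are all hypothetical, and it only handles inserts; updates and deletes on `sales` would need their own triggers.

```sql
-- Hypothetical summary table: one row per account per day.
CREATE TABLE sales_daily_counts (
    account_id  INT  NOT NULL,   -- assumed type
    sale_date   DATE NOT NULL,
    order_count INT  NOT NULL DEFAULT 0,
    PRIMARY KEY (account_id, sale_date)
);

-- Bump the day's counter whenever a row is inserted into sales.
CREATE TRIGGER sales_after_insert
AFTER INSERT ON sales
FOR EACH ROW
    INSERT INTO sales_daily_counts (account_id, sale_date, order_count)
    VALUES (NEW.account_id, DATE(NEW.created), 1)
    ON DUPLICATE KEY UPDATE order_count = order_count + 1;
```

The 60-day counts per account could then be read by summing `order_count` over the last 60 `sale_date` values instead of scanning `sales`.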

Joshua Martell 2010-08-28 03:35:29

His original outer query is grouping by count. It answers the business question, "How many customers ordered (1, 2, ..., 10 or more) items in the last 60 days?" This is different from "How many items did each customer order in the last 60 days?" which is what the inner query (and your query) returns.
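To make the distinction concrete, a histogram query of the kind described might look roughly like this. It is a reconstruction from the description above, not the asker's actual query, and the aliases `orders_bucket` and `num_accounts` are made up for illustration.

```sql
-- Outer query: how many accounts fall into each order-count bucket.
SELECT IF(t.thecnt < 10, t.thecnt, 'tenormore') AS orders_bucket,
       COUNT(*) AS num_accounts
FROM (
    -- Inner query: orders per account in the last 60 days.
    SELECT account_id, COUNT(*) AS thecnt
    FROM sales
    WHERE created >= SUBDATE(CURRENT_DATE(), INTERVAL 60 DAY)
    GROUP BY account_id
) AS t
GROUP BY orders_bucket
ORDER BY MIN(t.thecnt);
```

The derived table `t` is the part that MySQL has to materialize, which is where the temporary table comes from.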

wuputah 2010-08-28 06:13:32

Wuputah is exactly right: I'm trying to group by the counts in order to build the histogram. Thanks, however, for taking a shot at the question!

Neil Kodner 2010-08-28 07:59:09
