I've developed a billing summary page using MySQL and PHP.

  • There are many users (about 1M).
  • Light users: each has fewer than 10K records (0.99M users).
  • Heavy users: each has about 1M records.

The SQL is the following:

SELECT SUM(value_a) A, SUM(value_b) B, SUM(value_c) C
FROM daily_data_sep_2010
WHERE user_id = '<user_id>'
AND type IN ('type_a', 'typeb')
AND publish_date BETWEEN '<start_date>' AND '<end_date>'
GROUP BY publish_date
ORDER BY publish_date DESC

The daily_data_sep_2010 table uses the MyISAM engine.

There are several variants of this query. The SUM(value_a) A, SUM(value_b) B, SUM(value_c) C part is always the same; only the WHERE and GROUP BY conditions differ.

This screen is very slow for heavy users. Do you have any good solutions?

The EXPLAIN output is:

| table | type | possible_keys | key | key_len | ref | rows | Extra |
| daily_data_sep_2010 | ALL | PRIMARY,user_id_key,type,publish_date | | | | 1059756 | Using where; Using temporary; Using filesort |

I think the number of rows is too large to sum, so I am looking for other solutions (Hadoop?).

A: 

You may try running your query through the explain command.

However, I'd guess that adding one of these indexes would help out (depending on how frequent rows with matching types are found):

  1. user_id, type, publish_date, value_a, value_b, value_c
  2. user_id, publish_date, type, value_a, value_b, value_c

That should allow it to consult only the index for your query (reducing the number of pages read), as well as allow an index range scan (or whatever MySQL calls it) to build up the actual data.
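A minimal sketch of the two candidate indexes, assuming the column names from the query in the question. The index names `idx_user_type_date` and `idx_user_date_type` are hypothetical, and you would normally create only one of them. After adding it, re-run EXPLAIN: if MySQL can answer the query from the index alone, the Extra column should include "Using index".

```sql
-- Option 1: type before publish_date (hypothetical index name).
ALTER TABLE daily_data_sep_2010
  ADD INDEX idx_user_type_date
    (user_id, type, publish_date, value_a, value_b, value_c);

-- Option 2: publish_date before type (hypothetical index name).
ALTER TABLE daily_data_sep_2010
  ADD INDEX idx_user_date_type
    (user_id, publish_date, type, value_a, value_b, value_c);

-- Check whether the optimizer picks the new index.
EXPLAIN
SELECT SUM(value_a) A, SUM(value_b) B, SUM(value_c) C
FROM daily_data_sep_2010
WHERE user_id = '<user_id>'
  AND type IN ('type_a', 'typeb')
  AND publish_date BETWEEN '<start_date>' AND '<end_date>'
GROUP BY publish_date
ORDER BY publish_date DESC;
```

Which column order works better depends on the data, so it is worth comparing the EXPLAIN output and timings for a heavy user with each option.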

Another option is running a scheduled (maybe nightly?) process to build up the data for your 'heavy' users, and use that data when showing them reports.
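One way the scheduled job could look, as a rough sketch. The summary table name `daily_summary_sep_2010`, its column names, and all column types are assumptions, not taken from the question. It pre-aggregates per user, type, and day, so it only helps the query variants whose WHERE and GROUP BY conditions stay within those three columns.

```sql
-- Hypothetical summary table; column types are assumptions.
CREATE TABLE daily_summary_sep_2010 (
  user_id      INT NOT NULL,
  type         VARCHAR(32) NOT NULL,
  publish_date DATE NOT NULL,
  sum_a        DECIMAL(20,4) NOT NULL,
  sum_b        DECIMAL(20,4) NOT NULL,
  sum_c        DECIMAL(20,4) NOT NULL,
  PRIMARY KEY (user_id, type, publish_date)
) ENGINE=MyISAM;

-- Scheduled (for example nightly) refresh: recompute every group and
-- overwrite the existing summary row with the same primary key.
REPLACE INTO daily_summary_sep_2010
  (user_id, type, publish_date, sum_a, sum_b, sum_c)
SELECT user_id, type, publish_date,
       SUM(value_a), SUM(value_b), SUM(value_c)
FROM daily_data_sep_2010
GROUP BY user_id, type, publish_date;

-- The report then sums at most one row per type per day.
SELECT SUM(sum_a) A, SUM(sum_b) B, SUM(sum_c) C
FROM daily_summary_sep_2010
WHERE user_id = '<user_id>'
  AND type IN ('type_a', 'typeb')
  AND publish_date BETWEEN '<start_date>' AND '<end_date>'
GROUP BY publish_date
ORDER BY publish_date DESC;
```

The refresh could be restricted to the heavy users, or to recently changed dates, if rebuilding the whole month each night is too slow.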

Jonathan
Thank you for replying. I have added the EXPLAIN result. However, I think the number of rows is too large to sum, so I am looking for other solutions (Hadoop?).
ffffff
+1  A: 

Is there any difference if you create a composite index on (user_id, type) and do this:

       where user_id = ? and type in (a,b)
       and publish_date between...
       group by publish_date
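For reference, the composite index could be created roughly like this. The index name `idx_user_type` is hypothetical, and the table and column names are assumed from the question.

```sql
-- Hypothetical index name; columns as used in the question's query.
ALTER TABLE daily_data_sep_2010
  ADD INDEX idx_user_type (user_id, type);

-- Compare the "key" and "rows" columns with the earlier EXPLAIN output.
EXPLAIN
SELECT SUM(value_a) A, SUM(value_b) B, SUM(value_c) C
FROM daily_data_sep_2010
WHERE user_id = '<user_id>'
  AND type IN ('type_a', 'typeb')
  AND publish_date BETWEEN '<start_date>' AND '<end_date>'
GROUP BY publish_date;
```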
Tim
Thank you so much. My EXPLAIN output is now:
ffffff
|daily_data_sep_2010| ref| PRIMARY,user_id_key,type|user_id|4|const| 30297|Using where; Using temporary; Using filesort|
ffffff
A: 

MySQL Server 5.1.3 supports partitioning. See the MySQL partitioning documentation: http://dev.mysql.com/doc/refman/5.1/en/partitioning.html
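A rough sketch of what partitioning by user could look like. The partition count of 16 is an arbitrary assumption. MySQL also requires the partitioning column to be part of every unique key on the table, including the primary key, so this only works as written if user_id is already in those keys (the question does not show the table definition).

```sql
-- Assumes user_id is an integer column that is part of every unique key.
ALTER TABLE daily_data_sep_2010
  PARTITION BY HASH (user_id)
  PARTITIONS 16;

-- EXPLAIN PARTITIONS shows which partitions the query would read.
EXPLAIN PARTITIONS
SELECT SUM(value_a) A, SUM(value_b) B, SUM(value_c) C
FROM daily_data_sep_2010
WHERE user_id = '<user_id>'
  AND type IN ('type_a', 'typeb')
  AND publish_date BETWEEN '<start_date>' AND '<end_date>'
GROUP BY publish_date;
```

With an equality condition on user_id, the optimizer can prune the scan to a single partition, although a heavy user's rows would still all land in that one partition.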

Sam