views:

84

answers:

1

I have this query:

SELECT weekno, 
         dt, 
         SUM(CARRYOVER)             AS CARRYOVER,
         daycode,
         COALESCE('N','N') AS FLAG,
         DATE_FORMAT(STR_TO_DATE(CONCAT(YEAR(dt), weekno, ' Sunday'), 
                     '%X%V %W') 
         , 
         '%c/%d/%Y' 
         )                         AS ENDOFTHEWEEK
  FROM   bbproduct b 
         JOIN customer c 
           ON c.custno = b.custno 
         JOIN (SELECT DISTINCT `PRODNO` 
               FROM   `familycode` 
               WHERE  SLSFAMLYCD = 104) f 
           ON f.prodno = b.prodno 
  WHERE  c.`SLSRPTGRP` = 001203 
         AND DT>=DATE_SUB(NOW(), INTERVAL 13 WEEK)
  GROUP  BY WEEKNO, DAYCODE
  ORDER  BY dt DESC; 

This query runs over 30 seconds. The problem probably it's with the bbproduct table as that holds 9 million of results, and probably the sum() and group by is slow.

The EXPLAIN shows this data:

id  select_type  table       type  possible_keys                                                                                                       key                  key_len  ref                          rows  Extra                            
1   PRIMARY      <derived2>  ALL                                                                                                                                                                                  104   Using temporary; Using filesort  
1   PRIMARY      c           ref   PRIMARY,ind_slsrptgrp                                                                                               ind_slsrptgrp        3        const                        117                                    
1   PRIMARY      b           ref   PRIMARY,in_bbproduct_custnodeptnoprodno,in_bbproduct_custno,in_bbproduct_dt,ind_prodno,ind_dt_custno_deptno_prodno  in_bbproduct_custno  3        schwebel_teststats.c.CUSTNO  1232  Using where                      
2   DERIVED      familycode  ref   PRIMARY,ind_slsfamlycd                                                                                              PRIMARY              2                                     104   Using where; Using index         

The schema is

CREATE TABLE IF NOT EXISTS `bbproduct` (
  `id` bigint(20) NOT NULL,
  `CUSTNO` decimal(6,0) unsigned zerofill NOT NULL default '000000',
  `DEPTNO` decimal(1,0) unsigned zerofill NOT NULL default '0',
  `PRODNO` decimal(4,0) unsigned zerofill NOT NULL default '0000',
  `DT` date NOT NULL,
  `ARDATE8N` decimal(8,0) unsigned zerofill NOT NULL default '00000000',
  `STORENO` char(6) NOT NULL default '0',
  `DAYCODE` decimal(1,0) unsigned zerofill NOT NULL default '0',
  `WEEKNO` decimal(2,0) unsigned zerofill NOT NULL default '00',
  `WKOFMONTH` decimal(2,0) unsigned zerofill NOT NULL default '00',
  `QUANTITY` decimal(7,0) NOT NULL default '0',
  `CARRYOVER` decimal(7,0) NOT NULL default '0',
  `SELLOFF1` decimal(7,0) NOT NULL default '0',
  `ONHANDQTY` decimal(7,0) NOT NULL default '0',
  `KNOWNLOSS` decimal(7,0) NOT NULL default '0',
  `ORDER` decimal(7,0) NOT NULL default '0',
  `FAMLYCD` decimal(4,0) NOT NULL default '0',
  `ROUTENO` decimal(4,0) NOT NULL default '0',
  `AGENCYNO3N` decimal(3,0) NOT NULL default '0',
  `RETURNS` decimal(7,0) NOT NULL,
  `LASTUPDATE` datetime NOT NULL,
  PRIMARY KEY  (`CUSTNO`,`DEPTNO`,`PRODNO`,`DT`),
  KEY `in_bbproduct_custnodeptnoprodno` (`CUSTNO`,`DEPTNO`,`PRODNO`),
  KEY `in_bbproduct_deptnocustno` USING BTREE (`DEPTNO`,`CUSTNO`),
  KEY `in_bbproduct_custno` USING BTREE (`CUSTNO`),
  KEY `in_bbproduct_dt` USING BTREE (`DT`),
  KEY `in_bbproduct_routeno` (`ROUTENO`),
  KEY `in_bbproduct_selloff1` (`SELLOFF1`),
  KEY `in_bbproduct_weekno` (`WEEKNO`),
  KEY `in_bbproduct_routenoweekno` (`WEEKNO`,`ROUTENO`),
  KEY `in_bbproduct_agencyroutedt` USING BTREE (`AGENCYNO3N`,`ROUTENO`,`DT`),
  KEY `in_bbproduct_routenoagencyno3n` (`ROUTENO`,`AGENCYNO3N`),
  KEY `in_bbproduct_agencyno` (`AGENCYNO3N`),
  KEY `in_bbproduct_storeno_prodno` (`STORENO`,`PRODNO`),
  KEY `ind_deptno_prodno` (`DEPTNO`,`PRODNO`),
  KEY `ind_prodno` (`PRODNO`),
  KEY `ind_dt_custno_deptno_prodno` (`DT`,`CUSTNO`,`DEPTNO`,`PRODNO`),
  KEY `ind_weekno_daycode` (`WEEKNO`,`DAYCODE`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1;

SELECT VERSION() returns 5.0.91-community

A: 

Some improvement would come from using covering indexes.

create index cov1 on customer( SLSRPTGRP, custno )

This index would support the join from customer to bbproduct. If it works, you should see "using index" in the explain row for table c.

Doing the same for table bbproduct is a bit more involved and would eat up a lot of space:

create index cov2 on bbproduct( custno, dt, prodno, daycode, weekno )

Additionally (although involving a much bigger change to your system), you could move the date-related fields into a separate "dimension" table:

create table date_dim (
  `id` bigint(20) NOT NULL,
  `DT` date NOT NULL,
  `ARDATE8N` decimal(8,0) unsigned zerofill NOT NULL default '00000000',
  `DAYCODE` decimal(1,0) unsigned zerofill NOT NULL default '0',
  `WEEKNO` decimal(2,0) unsigned zerofill NOT NULL default '00',
  `WKOFMONTH` decimal(2,0) unsigned zerofill NOT NULL default '00',
  `ENDOFTHEWEEK` date not null,
  primary key(id),
  unique key (DT),
  key (DT, id),
  key (ARDATE8N, id),
  ...
);

bbroduct would then only contain a date_dim_id reference. The advantage there is that expensive range queries (eg. dt <= '{date}' and '{date}' <= dt ) are performed on the small date_dim table and then joined on matching date_dim_id in bbproduct.

Martin
I will create the index to see if makes sense. It will be some time, as the table is over 3Gbytes.
Pentium10
OK thanks, the first index works ok. But I don't understand the order on the 2nd one, could you please explain why you have put the fields in this order?
Pentium10
Custno definitely has to be first, but the others are in pretty much arbitrary order (although perhaps there is some advantage in dt being second given that the results will be restricted by dt). The point is to try to satisfy the data needs of the query through the index - ie. providing a covering index - so that the data records do not have to be read at all. Indexes are much faster to read than data.
Martin