Hi, This query pops up in my slow query logs:

SELECT
  COUNT(*)                 AS ordersCount,
  SUM(ItemsPrice + COALESCE(extrasPrice, 0.0)) AS totalValue,
  SUM(ItemsPrice)          AS totalValue,
  SUM(std_delivery_charge) AS totalStdDeliveryCharge,
  SUM(extra_delivery_charge) AS totalExtraDeliveryCharge,
  this_.type               AS y5_,
  this_.transmissionMethod AS y6_,
  this_.extra_delivery     AS y7_
FROM orders this_
WHERE this_.deliveryDate BETWEEN '2010-01-01 00:00:00' AND '2010-09-01 00:00:00'
    AND this_.status IN(1, 3, 2, 10, 4, 5, 11)
    AND this_.senderShop_id = 10017
GROUP BY this_.type, this_.transmissionMethod, this_.extra_delivery
ORDER BY this_.deliveryDate DESC;

The table is InnoDB with about 880k rows, and the query takes 9-12 seconds to execute. I tried adding the following index, with no practical gains: ALTER TABLE orders ADD INDEX _deliverydate_senderShopId_status (deliveryDate, senderShop_id, status, type, transmissionMethod, extra_delivery); Any help or suggestions are welcome.

Here is the query execution plan right now:

id      select_type   table type    possible_keys   key                  key_len   ref    rows    filtered  Extra
1       SIMPLE        this_ ref                     FKC3DF62E57562BA6F   8         const  139894  100.00    Using where; Using temporary; Using filesort

I removed the possible_keys value from the output because I think it listed every index on the table. The key used (FKC3DF62E57562BA6F) looks like this:

Keyname               Type   Unique  Packed  Field          Cardinality Collation   Null    Comment
FKC3DF62E57562BA6F    BTREE  No      No      senderShop_id  4671        A
A:

I'll tell you one thing that you can look at for increasing the speed.

You generally only have NULL values in the data for unknown or non-applicable values. It appears to me that, since you're treating NULL as 0 anyway, you should think about getting rid of them and making sure that all extrasPrice values are 0 where they were previously NULL, so that you can avoid the time penalty of the COALESCE.
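A minimal sketch of that cleanup, assuming extrasPrice is a DECIMAL(12,2) column (the real column type is not shown in the question, so adjust it to match the schema):

```sql
-- Replace existing NULLs with 0 so the query no longer needs COALESCE.
UPDATE orders
SET extrasPrice = 0.0
WHERE extrasPrice IS NULL;

-- Keep new NULLs out. DECIMAL(12,2) is an assumed type, not taken from the question.
ALTER TABLE orders
  MODIFY extrasPrice DECIMAL(12,2) NOT NULL DEFAULT 0.0;
```

After this, SUM(ItemsPrice + extrasPrice) would give the same result as the original COALESCE expression.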

In fact, you could go one step further and introduce another column called totalPrice, which you set with an insert/update trigger to the actual value ItemsPrice + extrasPrice (or ItemsPrice + COALESCE(extrasPrice, 0.0) if you still need extrasPrice to be nullable).

Then, you can simply use:

SELECT
    COUNT(*)          AS ordersCount,
    SUM(totalPrice)   AS totalValue,
    SUM(ItemsPrice)   AS totalValue2,
    :

(I'm not sure whether you meant to have two output columns with the same name or whether that was a typo. At worst it's an error; at best it's confusing.)

This moves the cost of the calculation to insert/update time rather than select time and amortises that cost over all the selects - most database tables are read far more often than written. The consistency of the data is maintained due to the trigger and the performance should be better, at the cost of some storage requirements.
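The trigger approach could be sketched roughly as follows. The column type and the trigger names (orders_totalPrice_bi, orders_totalPrice_bu) are assumptions for illustration, not taken from the question:

```sql
-- Precomputed total. DECIMAL(12,2) is an assumed type.
ALTER TABLE orders
  ADD COLUMN totalPrice DECIMAL(12,2);

-- Backfill rows that already exist.
UPDATE orders
SET totalPrice = ItemsPrice + COALESCE(extrasPrice, 0.0);

-- Keep the column in sync on every insert and update.
CREATE TRIGGER orders_totalPrice_bi
BEFORE INSERT ON orders
FOR EACH ROW
  SET NEW.totalPrice = NEW.ItemsPrice + COALESCE(NEW.extrasPrice, 0.0);

CREATE TRIGGER orders_totalPrice_bu
BEFORE UPDATE ON orders
FOR EACH ROW
  SET NEW.totalPrice = NEW.ItemsPrice + COALESCE(NEW.extrasPrice, 0.0);
```

Because each trigger body is a single statement, no DELIMITER change is needed in the mysql client.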

But, since the vast majority of database questions are "How can I get more speed?" rather than "How can I use less disk?", that's often a good idea.

Another suggestion is to provide a non-composite index on the column that reduces your result set the fastest (the one with high cardinality). In other words, if you store only two weeks' worth of data (14 different dates) in your table but 400 different shops, you should have an index on senderShop_id and make sure your statistics are up to date.

This should cause the DBMS execution engine to whittle down the result set using that key so that subsequent operations are faster.
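One way to check which column is more selective and to refresh the statistics, sketched with standard MySQL statements (the aliases shops and delivery_days are just illustrative names):

```sql
-- Refresh the index statistics the optimizer relies on.
ANALYZE TABLE orders;

-- Show the estimated cardinality of each existing index.
SHOW INDEX FROM orders;

-- Compare the actual number of distinct values in the two candidate columns.
SELECT
  COUNT(DISTINCT senderShop_id)      AS shops,
  COUNT(DISTINCT DATE(deliveryDate)) AS delivery_days
FROM orders;
```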

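A composite index on deliveryDate, senderShop_id, ... will not be able to use senderShop_id to whittle down the results, because the keys are ordered by deliveryDate first, with senderShop_id sorted only within each deliveryDate value. A range condition on deliveryDate therefore still has to walk through entries for every shop in that range.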
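If a composite index is still wanted, the same reasoning suggests putting the equality column first. This goes a step beyond the advice above, so treat it as an assumption to verify with EXPLAIN; the index name idx_senderShop_deliveryDate is hypothetical:

```sql
-- Equality column first, range column second.
ALTER TABLE orders
  ADD INDEX idx_senderShop_deliveryDate (senderShop_id, deliveryDate);

-- Check which key the optimizer picks and how many rows it expects to read.
EXPLAIN
SELECT COUNT(*)
FROM orders
WHERE senderShop_id = 10017
  AND deliveryDate BETWEEN '2010-01-01 00:00:00' AND '2010-09-01 00:00:00';
```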

paxdiablo