Hello all, long-time lurker, first question!
I am struggling to optimize this query, which selects the lowest-priced items that match the chosen filters:
SELECT product_info.*, MIN(product_all.sale_price) as sale_price, product_all.buy_link
FROM product_info
NATURAL JOIN (SELECT * FROM product_all WHERE product_all.date = '2010-09-30') as product_all
WHERE (product_info.category = 2
AND product_info.gender = 'W' )
GROUP BY product_all.prod_id
ORDER BY MIN(product_all.sale_price) ASC LIMIT 13
Its EXPLAIN output:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 89801 | Using temporary; Using filesort |
| 1 | PRIMARY | product_info | eq_ref | PRIMARY,category_prod_id_retail_price,category_ret... | PRIMARY | 4 | product_all.prod_id | 1 | Using where |
| 2 | DERIVED | product_all | ref | date_2 | date_2 | 3 | | 144107 | |
I've tried eliminating the subquery, which intuitively seems better but in practice takes even longer:
SELECT product_info.*, MIN(product_all.sale_price) as sale_price, product_all.buy_link
FROM product_info
NATURAL JOIN product_all
WHERE (product_all.date = '2010-09-30'
AND product_info.category = 2
AND product_info.gender = 'W' )
GROUP BY product_all.prod_id
ORDER BY MIN(product_all.sale_price) ASC LIMIT 13
And its EXPLAIN output:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 1 | SIMPLE | product_info | ref | PRIMARY,category_prod_id_retail_price,category_ret... | category_retail_price | 5 | const | 269 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | product_all | ref | PRIMARY,prod_id,date_2 | prod_id | 4 | equipster_db.product_info.prod_id | 141 | Using where |
Here are the tables:
CREATE TABLE `product_all` (
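`prod_id` INT( 10 ) NOT NULL ,
`ref_id` INT( 10 ) NOT NULL ,
`date` DATE NOT NULL ,
`buy_link` BLOB NOT NULL ,
`sale_price` FLOAT NOT NULL ,
-- a table can declare only one PRIMARY KEY; a composite key over both columns is assumed here
PRIMARY KEY ( `prod_id` , `ref_id` )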
) ENGINE = MYISAM ;
CREATE TABLE `product_info` (
`prod_id` INT( 10 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`prod_name` VARCHAR( 200 ) NOT NULL,
`brand` VARCHAR( 50 ) NOT NULL,
`retail_price` FLOAT NOT NULL,
`category` INT( 3 ) NOT NULL,
`gender` VARCHAR( 1 ) NOT NULL,
`type` VARCHAR( 10 ) NOT NULL
) ENGINE = MYISAM ;
My Questions:
- Which query structure seems optimal?
- What indices would optimize this query?
- Less importantly: how does the indexing approach change when adding or removing WHERE conditions, or when using a different ORDER BY, such as sorting by % off:
ORDER BY (1-(MIN(product_all.sale_price)/product_info.retail_price)) DESC
Edit: in both queries the NATURAL JOIN acts on prod_id. One row in product_info can have multiple matching rows in product_all, which is why the results need to be grouped.
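To make the join explicit, here is a sketch of the second query with the join column spelled out. It assumes prod_id is the only column name the two tables share, which matches the table definitions above. As far as I can tell it should behave the same as the NATURAL JOIN version; it is not meant as a fix.

```sql
-- Same as the second query, but with the join column named explicitly.
-- Assumes prod_id is the only column common to product_info and product_all.
SELECT product_info.*, MIN(product_all.sale_price) AS sale_price, product_all.buy_link
FROM product_info
JOIN product_all USING (prod_id)
WHERE product_all.date = '2010-09-30'
  AND product_info.category = 2
  AND product_info.gender = 'W'
GROUP BY product_all.prod_id
ORDER BY MIN(product_all.sale_price) ASC
LIMIT 13;
```

Note that buy_link is neither aggregated nor grouped, so when MySQL accepts the query (ONLY_FULL_GROUP_BY disabled) the link comes from an unspecified row of each group, not necessarily the row with the minimum price.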