hi guys, i'm trying to get optimize an very old query that i can't wrap my head around. the result that i want to archive is that i want to recommend the visitor on a web shop what other customers have shown interest in, i.e. what else they have bought together with the product that the visitor is looking at.
i have a subquery but it's very slow, takes ~15s on ~8 000 000 rows.
the layout is that all products that are put in a users basket are kept in a table wsBasket
and separated by a basketid
(which in another table is associated with a member).
in this example i want to list all the most popular products that users have bought together with productid 427, but not list the productid 427 itself.
SELECT productid, SUM(quantity) AS qty
FROM wsBasket
WHERE basketid IN
(SELECT basketid
FROM wsBasket
WHERE productid=427) AND productid!=427
GROUP by productid
ORDER BY qty
DESC LIMIT 0,4;
any help is much appreciated! hope this makes any sense at all to at least someone :)
UPDATE 1: thanks for your comments guys here are my answers, they didn't fit in the comments-field.
Using EXPLAIN on the above query i got the fllowing. Please note, I do not have any indexes on the table (except for primary key on the id
-field), i want to modify the query to benefit from indexes and place indexes on the right keys.
+----+--------------------+----------+------+---------------+------+---------+------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------+------+---------------+------+---------+------+------+----------------------------------------------+
| 1 | PRIMARY | wsBasket | ALL | NULL | NULL | NULL | NULL | 2821 | Using where; Using temporary; Using filesort |
| 2 | DEPENDENT SUBQUERY | wsBasket | ALL | NULL | NULL | NULL | NULL | 2821 | Using where |
+----+--------------------+----------+------+---------------+------+---------+------+------+----------------------------------------------+