views:

77

answers:

2

I am using the following query to find out to top 6 viewed pages in my Drupal site:

SELECT n.title, n.nid, c.daycount 
FROM node n 
JOIN node_counter c ON n.nid=c.nid 
WHERE n.type='page' AND n.status = 1 
ORDER BY c.daycount DESC
LIMIT 0,6;

This is very natural and works well on most sites. However, on a site with many nodes (1.7m), it comes out rather slow, and it is hardly cached, since the node table keeps changing, as users add/edit nodes in the system.

Running explain on the heavy site produces the following output:

+----+-------------+-------+--------+-----------------------------------------------+------------------+---------+------------------+-------+----------------------------------------------+
| id | select_type | table | type   | possible_keys                                 | key              | key_len | ref              | rows  | Extra                                        |
+----+-------------+-------+--------+-----------------------------------------------+------------------+---------+------------------+-------+----------------------------------------------+
|  1 | SIMPLE      | n     | ref    | PRIMARY,node_type,status,node_status_type,nid | node_status_type | 102     | const,const      | 71878 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | c     | eq_ref | PRIMARY                                       | PRIMARY          | 4       | kidstvprd2.n.nid |     1 | Using where                                  |
+----+-------------+-------+--------+-----------------------------------------------+------------------+---------+------------------+-------+----------------------------------------------+

Note the "Using where; Using temporary; Using filesort".

One solution I thought about is running this query offline (in cron perhaps), and saving the results in another table for anyone to read from, until the next cron run updates it. However, before reverting into cron, I'd like to try to optimize this query.

Does anybody have an idea on how to optimize it?

Thanks

+2  A: 

The problem is that it's starting with the n table rather than the c. What you want is for it to use an index on c.daycount (in order to avoid the sort), and then join that to n. If necessary, use straight_join to force the order.

See also http://dev.mysql.com/doc/refman/5.1/en/join.html

Amazing! that worked very well.I wasn't aware of the subtely of straight_join vs. join. I simply assumed mySQL does the best thing for me...
yhager
A: 

In SQLServer I would make sure to have following indexes

CREATE INDEX IX_NODE_NID_TYPE_STATUS_TITLE   
  ON dbo.Node (Nid, Type, Status) INCLUDE (Title)

CREATE INDEX IX_NODE_COUNTER_NID_DAYCOUNT 
  ON dbo.Node_Counter (Nid, DayCount)
Lieven