views:

113

answers:

3

I have a table of blog posts, each with a foreign key back to it's author. There are < 15,000 entries in this table. This query scans over 19,000 rows (per EXPLAIN), requires a filesort (that might be regular MySQL behavior), and takes over 400ms to return 5 rows. possibly because of the complicated WHERE used to check if the item is actually published.

Dearest Stack Overflow, how can I wrangle this query under control?

Note: while this criteria might be up for simplification, all of the conditions are required.

SELECT      `blog_post.id`, 
            `blog_post.title`,
            `blog_post.author_id`,
            `blog_post.has_been_fact_checked`,
            `blog_post.published_date`,
            `blog_post.ordering`,
            `auth_user.username`,
            `auth_user.email`
FROM        `blog_post` 
INNER JOIN  `auth_user` 
ON          (`blog_post`.`author_id` = `auth_user`.`id`) 
WHERE       (`blog_post`.`is_approved` = True  AND 
             `blog_post`.`has_been_fact_checked` = True  AND 
             `blog_post`.`published_date` IS NOT NULL AND 
             `blog_post`.`published_date` <= '2010-10-25 22:40:05' ) 
ORDER BY    `blog_post`.`published_date` DESC, 
            `blog_post`.`ordering` ASC, 
            `blog_post`.`id` DESC 
LIMIT 5

Aside from the PKs, I have the following indexes on the table:

idx_published_blog_post -> blog_post(is_approved, has_been_fact_checked, published_date)
idx_pub_date -> blog_post(published_date)

The output from EXPLAIN looks like this:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: blog_post
         type: ref
possible_keys: blog_post_author_id,idx_published_blog_post,idx_pub_date
          key: idx_published_blog_post
      key_len: 4
          ref: const,const
         rows: 19856
        Extra: Using where; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: auth_user
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: blog.blog_post.author_id
         rows: 1
        Extra: Using index
2 rows in set (0.00 sec)

Side-note: 2010-10-25 22:40:05 is just a date generated by the code that executes this query.

Thanks so much for any & all help!

A: 

Make a view of "blog_post" with all conditions you have applied in query (where clause) and do the joining of "auth_user" direct with this view.

feel free ask if not clear. :)

seed_of_tree
In MySQL, using a view does not buy you much most of the time because the MERGE algorithm is most often used, which basically takes the requested statement and the view definition statement and cobbles them together, then runs the resulting query. MERGE is [always used if there is a LIMIT clause, like in the above](http://dev.mysql.com/doc/refman/5.0/en/view-algorithms.html), so this won't help in this case. Sorry.
Conspicuous Compiler
It can be. I am not sure about what u said. But I have used this procedure in may website and it speed up's queries.
seed_of_tree
@Carson: can u please try once as i am saying. I am sure it will improve speed.
seed_of_tree
A: 

To me it looks like the filesort may be killing the speed. If you can get the ORDER BY fields into the index that is being used you may get a speed increase. Try changing:

idx_published_blog_post -> blog_post(is_approved, has_been_fact_checked, published_date)

to

idx_published_blog_post -> blog_post(is_approved, has_been_fact_checked, published_date DESC, ordering ASC, id DESC)

A couple of after thoughts: under what circumstances do you have a null published_date, searching in a date range may be quicker? Also published_date seems to be a date time field, so when sorting do you really have so may posts each second that you require the other sorting fields?

Jaydee
`MySQL` does not support `ASC/DESC` in indexes.
Quassnoi
It may be version dependent, but the manual says 4.1 and 5.0 do. http://dev.mysql.com/doc/refman/5.0/en/create-index.html
Jaydee
I stand corrected. You can use ASC and DESC, but they are ignored. Though as is mentioned I would have thought that sorting by the date+time would be enough in most cases
Jaydee
+4  A: 

MySQL does not support ASC/DESC clauses in indexes.

You would need to create a separate column called reverse_ordering and set its value to -ordering (provided that ordering is an numeric value)

Then you could create the following index:

CREATE INDEX ix_blogpost_a_c_p_ro_id ON blog_post (is_approved, has_been_fact_checked, published_date, reverse_ordering, id)

and rewrite your query:

SELECT      `blog_post.id`, 
            `blog_post.title`,
            `blog_post.author_id`,
            `blog_post.has_been_fact_checked`,
            `blog_post.published_date`,
            `blog_post.ordering`,
            `auth_user.username`,
            `auth_user.email`
FROM        `blog_post` 
INNER JOIN  `auth_user` 
ON          `blog_post`.`author_id` = `auth_user`.`id`
WHERE       `blog_post`.`is_approved` = 1 AND 
            `blog_post`.`has_been_fact_checked` = 1 AND 
            `blog_post`.`published_date` <= '2010-10-25 22:40:05'
ORDER BY    `blog_post`.`published_date` DESC, 
            `blog_post`.`reverse_ordering` DESC, 
            `blog_post`.`id` DESC 
LIMIT 5

You may get rid of IS NULL check, since the inequality condition implies it.

Update:

You also might want to read this article:

Quassnoi