views:

166

answers:

4

I have posts, votes, and comments tables. Each post can have N 'yes votes', N 'no votes' and N comments. I am trying to get a set of posts sorted by number of yes votes.

I have a query that does exactly this, but is running far too slowly. On a data set of 1500 posts and 15K votes, it's take .48 seconds on my dev machine. How can I optimize this?

select
  p.*,
  v.yes,
  x.no
from
  posts p
left join (select post_id, vote_type_id, count(1) as yes from votes where (vote_type_id = 1) group by post_id) v on v.post_id = p.id
left join (select post_id, vote_type_id, count(1) as no from votes where (vote_type_id = 2) group by post_id) x on x.post_id = p.id
left join (select post_id, count(1) as comment_count from comments group by post_id) p on p.confession_id = p.id
order by
  yes desc
limit
    0, 10

EDIT:

  • Votes and Comments both have a post_id FK
  • Adding an index on vote_type_id and post_id in the votes table shaved .1sec off the query execution.
A: 

Add a 'yes_count' column and use a trigger to update the vote count for each post when the vote is made. You can index this column, then it should be very fast.

Mark Byers
i like this idea, though a bit of premature optimization? I've asked my question with more clarity here: http://stackoverflow.com/questions/1765040/mysql-fetch-10-posts-each-w-vote-count-sorted-by-vote-count-limited-by-where
A: 

Use explain for checking the query execution plan so you can see why it is slow, usually it is enough to see the plan and later create appropriate indexes. The 1.5k and 15k tables are really small so that query should be much faster.

Simon
A: 

Why don't you add a column yes and no ? Rather than adding a new entry at every post, just increment the count.

If I misunderstood your database or you can't modify it, at least do you have a foreign key on votes.post_id to post.id? Foreign keys are crutial if you do any join.

Tristram Gräbener
I need to keep track of each individual vote and associate it with IP/session for rate limiting
A: 

First off, your current query shouldn't compile, as it uses p as an alias for both the comments and the posts table.

Second, you're joining votes twice: once for no, and once for yes. Using a CASE statement, you can compute the sums of both with a single join. Here's a sample query:

select
  p.*,
  sum(case when v.vote_type_id = 1 then 1 else 0 end) as yes,
  sum(case when v.vote_type_id = 2 then 1 else 0 end) as no,
  count(c.id) as comment_count
from posts p
left join votes v on v.post_id = p.id
left join comments c on c.post_id = p.id
order by yes desc
limit 0, 10

Third, you could verify that the proper foreign keys exists for the relations between posts, votes and comments. An (post_id, vote_type_id) index on the votes could also help.

Andomar