views:

77

answers:

4

What is the general "best practice" for this type of functionality.

I have a table of users a table of polls and a table of poll responses on a website. I want to load a page that loads a poll that a user hasn't yet answer.

What is the most efficient and "best" way of going about this.

Things I've tried that seems slow/not optimal:

A query with nested selects using NOT IN

SELECT p.id
FROM poll p
WHERE p.id NOT IN (
    SELECT r.pollID
    FROM responses r
    WHERE r.username = 'someuser'
)

A query that uses left joins

LEFT JOIN  responses ON ( polls.id = responses.pollID
AND responses.username =  'someuser' ) 
WHERE
responses.username IS NULL

Both of these solutions seem to scale very poorly.

Other suggestions? Open to anything outside the box. (I.E. solutions that aren't confined to just different types of mysql queries)

A: 

I think a Left Join is going to be your best bet here. however, I am confused as to why in your example you check the username, then in the where clause, see if it's null. it should look like this:

LEFT JOIN  responses ON ( polls.id = responses.pollID) 
WHERE
responses.username = 'someuser'
GSto
Won't this start to be a burden when users have answered thousands of polls? Also, I'm trying to find a poll that the user has not answered yet. Would your solution find polls that they have answered?
Brian
Because he wants to avoid retrieving rows where poll is already answered. Unless that is specifically what causes the slowness (see my answer) it's actually a perfectly fine thing to do.
DVK
A: 

The left join should work great if you give responses a compound key on pollID and username.

Though it's somewhat distressing that you're using a username as a key, as opposed to a numeric user ID.

chaos
Can you elaborate on the consequences of using the username as a key? Sorry, I lack understanding at some of this.
Brian
Wastes storage, makes operations using it a tiny bit slower, maybe (if it's a variable-width column) makes the table variable-width instead of fixed-width, which also makes things slower. And it's just kind of messy.
chaos
A: 

Solution #1:

How does a straight up left join perform for you?

LEFT JOIN  responses ON ( polls.id = responses.pollID
AND responses.username =  'someuser' )

If it performs OK, you can simply retrieve a full list of polls like this and then filter out the rows where username is not null in the caller code.

Solution #2:

Make sure your indices are OK. You should have ann index in username (or poll ID+username) in responses.

DVK
+3  A: 

Both these queries are of same efficiency as long as you have a composite index on responses (pollID, username)

If your queries are slow, this most probably means you don't have this index.

LEFT JOIN combined with IS NULL on a non-NULL column is optimized by MySQL so that it returns a row as soon as it sees there is no matching value in the right column.

You can see it in the EXPLAIN as Not exists in the Extra column.

NOT IN is optimized in the same way too, you can see it as <not exists> in the warning message provided by the EXPLAIN EXTENDED.

See this entry in my blog for more details:

Quassnoi