views:

139

answers:

1

my query:

        SELECT content.*, activity_log.content_id FROM content
        LEFT JOIN activity_log 
        ON content.id = activity_log.content_id 
        AND sess_id = '$sess_id'
        WHERE activity_log.content_id IS NULL
        AND visibility = $visibility
        AND content.reported < ".REPORTED_LIMIT."
        AND content.file_ready = 1
        LIMIT 1

The purpose of that query is to get 1 row from the content table that has not been viewed by the user (identified by session_id), but it still returns contents that have been viewed. What is wrong? ( I have checked the table making sure that the content_ids are there)

Note: I think this is more efficient than using subqueries, thoughts?

A: 

The problem is obviously in your JOIN condition. The optimization you are using (applying a condition to the base table) makes sense when you are interested in inner join results, but in case of an outer JOIN this is read as following: find correspondences between content and activity_log rows when the content id matches and session id matches, and return nulls for activity_log rows when either logs for the content id is missing, or content id is not missing, but the session id is not the one specified. And it is hardly what you want.

Your query should should look like this:

SELECT content.*, activity_log.content_id
FROM (
SELECT *
FROM content
WHERE sess_id = '$sess_id'
    AND visibility = $visibility
    AND file_ready = 1
    AND reported < ".REPORTED_LIMIT."
) as content
LEFT JOIN activity_log 
ON content.id = activity_log.content_id 
WHERE activity_log.content_id IS NULL
LIMIT 1;

If the performance is not optimal, you can consider creating a composite index on (sess_id, visibility, fileready, reported).

newtover