Hi All,

First of all, this question is about PHP and MySQL.

I have two database tables:

The People table:

person_id   |   field_1  |  field_2  |  other_fields... 

And the Notes table:

note_id   |  person_id  |  created_timestamp  |  other_fields... 

The People table has a one-to-many relationship with the Notes table.
Every time a note is created in the Notes table, a timestamp is attached to it and a person_id foreign key is assigned.

Now...
I need to find all people who haven't had a note against them in the last 30 days.
The way I do it now is:

  1. Get all notes from the Notes table with a distinct person_id and a created_timestamp > time() - 31*86400 (not precise, I know, but it suits my needs)
  2. Loop through the results and add the person_id to a temporary array $temp
  3. Get all records from the People table
  4. Loop through each record and do an in_array() comparison of the person_id with $temp

This isn't very efficient and cripples the application when there are a lot of People or Notes.

Has anyone got a better solution to this? Ideally something that can be achieved using just one SQL query.

Thanks for looking

+1  A: 

Standard solution is to use a sub-query of this form:

SELECT * FROM People WHERE person_id NOT IN
  (SELECT person_id FROM Notes WHERE created_timestamp > ...)

Another option is to do a left outer join from People to Notes (restricted to recent notes) and keep only the rows where Notes.person_id IS NULL, which gives you just the people with no matching row in Notes.

Personally I prefer the sub-query method above which should run fairly efficiently and is easier to understand than the outer join solution.

Elemental
+2  A: 
SELECT person_id FROM People WHERE person_id NOT IN 
    (SELECT person_id FROM Notes 
        WHERE created_timestamp > DATE_SUB(CURDATE(), INTERVAL 30 DAY))

This assumes that created_timestamp is of type DATE, TIMESTAMP or DATETIME. If you store a Unix timestamp instead, convert it to a MySQL timestamp using FROM_UNIXTIME(created_timestamp).
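
If created_timestamp holds an integer Unix timestamp, one alternative is to convert the cutoff instead of the column, so the comparison stays on the raw column and an index on it remains usable. A minimal sketch, assuming the People and Notes tables from the question, an integer created_timestamp, and a Notes.person_id that is never NULL:

```sql
-- Assumption: Notes.created_timestamp is an integer Unix timestamp
-- and Notes.person_id is declared NOT NULL.
-- The 30-day cutoff is converted once; the column is compared as stored.
SELECT p.person_id
  FROM People p
 WHERE p.person_id NOT IN (
         SELECT n.person_id
           FROM Notes n
          WHERE n.created_timestamp > UNIX_TIMESTAMP(NOW() - INTERVAL 30 DAY)
       );
```

Wrapping the column in FROM_UNIXTIME() gives the same result, but it generally forces MySQL to evaluate the function for every row.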

halfdan
Thanks, looks promising... I use a Unix timestamp, so would the full query be: SELECT person_id FROM People WHERE person_id NOT IN (SELECT person_id FROM Notes WHERE FROM_UNIXTIME(created_timestamp) > DATE_SUB(CURDATE(), INTERVAL 30 DAY))
pǝlɐɥʞ
+1  A: 

I believe this should work:

SELECT * FROM people WHERE person_id NOT IN (SELECT DISTINCT person_id FROM notes);

Dante617
This does not limit results to the last 30 days.
halfdan
+1  A: 

If this is critical, you could consider a denormalization: storing the timestamp of each person's last note in the people table, and indexing that column.
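
A rough sketch of what that denormalization could look like. The column name last_note_timestamp and the index name are hypothetical, and the sketch assumes notes.created_timestamp is a DATETIME:

```sql
-- Hypothetical column and index names; not part of the original schema.
ALTER TABLE people
  ADD COLUMN last_note_timestamp DATETIME NULL,
  ADD INDEX idx_people_last_note (last_note_timestamp);

-- One-off backfill from the existing notes.
-- People without any notes keep NULL.
UPDATE people p
   SET p.last_note_timestamp = (SELECT MAX(n.created_timestamp)
                                  FROM notes n
                                 WHERE n.person_id = p.person_id);

-- People with no note in the last 30 days,
-- including people who have never had a note.
SELECT *
  FROM people
 WHERE last_note_timestamp IS NULL
    OR last_note_timestamp < NOW() - INTERVAL 30 DAY;
```

The trade-off is that the application (or a trigger) has to update last_note_timestamp every time a note is inserted.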

Otherwise, there's no way to avoid traversing the entire table of people, so add an index on the (person_id, created_timestamp) pair of the notes table and use a left join or subquery:

SELECT people.* FROM people 
         LEFT JOIN notes ON people.person_id = notes.person_id
                        AND notes.created_timestamp > NOW() - INTERVAL 30 DAY
WHERE notes.person_id IS NULL;

SELECT * FROM people
WHERE person_id NOT IN (SELECT person_id FROM notes
                        WHERE created_timestamp > NOW() - INTERVAL 30 DAY);
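
The composite index mentioned above could be created along these lines (the index name is hypothetical):

```sql
-- Hypothetical index name. person_id comes first so that the per-person
-- lookup in the join or subquery can use the index, with created_timestamp
-- narrowing the range within each person.
CREATE INDEX idx_notes_person_created
    ON notes (person_id, created_timestamp);
```

Running EXPLAIN on either query afterwards is a quick way to check whether MySQL actually picks the index.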
Victor Nicollet
+1  A: 

LEFT JOIN/IS NULL

   SELECT p.*
     FROM PEOPLE p
LEFT JOIN NOTES n ON n.person_id = p.person_id
                 AND n.created_timestamp >= DATE_SUB(NOW(), INTERVAL 30 DAY)
    WHERE n.note_id IS NULL

NOT EXISTS

SELECT p.*
  FROM PEOPLE p
 WHERE NOT EXISTS(SELECT NULL
                    FROM NOTES n
                   WHERE n.person_id = p.person_id
                     AND n.created_timestamp >= DATE_SUB(NOW(), INTERVAL 30 DAY))

NOT IN

SELECT p.*
  FROM PEOPLE p
 WHERE p.person_id NOT IN (SELECT n.person_id
                          FROM NOTES n
                         WHERE n.created_timestamp >= DATE_SUB(NOW(), INTERVAL 30 DAY))

Conclusion

LEFT JOIN/IS NULL is the most efficient on MySQL when the columns compared are not nullable. If the columns compared are nullable, NOT IN and NOT EXISTS are more efficient.
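
Nullability also affects correctness, not only speed. A small illustration of how NOT IN behaves when the list contains a NULL, which is what happens if NOTES.person_id can be NULL:

```sql
-- With a NULL in the list and no match, NOT IN yields NULL rather than TRUE,
-- so a WHERE clause built on it rejects the row.
SELECT 1 NOT IN (2, NULL);   -- evaluates to NULL

-- One way to guard the NOT IN form is to exclude NULLs in the subquery.
SELECT p.*
  FROM PEOPLE p
 WHERE p.person_id NOT IN (SELECT n.person_id
                             FROM NOTES n
                            WHERE n.person_id IS NOT NULL
                              AND n.created_timestamp >= DATE_SUB(NOW(), INTERVAL 30 DAY));
```

The NOT EXISTS form shown earlier does not need this guard, because a NULL person_id never satisfies n.person_id = p.person_id.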

OMG Ponies