tags:

views:

68

answers:

4

I have a file upload site, and I want to run a maintenance script that will run every day and delete items that haven't been accessed in a week. I log views for each day, and each item into a table:

  • hit_itemid
  • hit_date
  • hit_views

The main table actually has the files that were uploaded, for the purposes of this example, its just vid_id, vid_title thats in this table, and vid_id will equal to hit_itemid.

I have a query as follows:

 SELECT vid_id, 
        vid_title, 
        SUM(case when hit_date >= '2009-09-17' then hit_hits else 0 end) as total_hits 
   FROM videos
  LEFT JOIN daily_hits ON vid_id = hit_itemid 
  WHERE vid_posttime <= '$last_week_timestamp' AND vid_status != 3
  GROUP BY hit_itemid 
  HAVING  total_hits < 1

But this always returns a single record.

How can I rewrite this query?

A: 

first guess, may be you have to do a

GROUP BY vid_id

instead of

GROUP BY hit_itemid
najmeddine
A: 
SELECT 
   vd.vid_id, 
   vd.vid_title, 
   sum(case when dh.hit_date >= '2009-09-17' then dh.hit_views else 0 end) as total_hits 
  FROM videos vd
  LEFT JOIN daily_hits dh ON dh.hit_itemid = vd.vid_id
  WHERE vd.vid_posttime <= '$last_week_timestamp' AND vd.vid_status != 3
  GROUP BY vd.vid_id
  HAVING  total_hits < 1

This is how I would have the query... Assuming vid_posttime & vid_status are fields of table videos

halocursed
A: 

Do you definitely have data which satisfy this criteria? You're only considering rows for videos created before a certain timestamp and with a certain status -- perhaps this is limiting your result set to where only one video matches.

dcrosta
+1  A: 

An idea:

SELECT DISTINCT
    vid_id, vid_title
  FROM
    videos v
    LEFT JOIN daily_hits dh ON (
      v.vid_id = dh.hit_itemid AND dh.hit_date >= '2009-09-17'
    )
  WHERE
    v.vid_posttime <= '$last_week_timestamp' AND v.vid_status != 3
    AND dh.hit_itemid IS NULL;

Alternatively (benchmark to see which is faster):

SELECT
    vid_id, vid_title
  FROM
    videos v
  WHERE
    v.vid_posttime <= '$last_week_timestamp' AND v.vid_status != 3
    AND NOT EXISTS (
      SELECT 1 FROM daily_hits dh
        WHERE v.vid_id = dh.hit_itemid AND dh.hit_date >= '2009-09-17'
   )

I'm guessing the first form will be faster, but can't check (I don't have access to your data). Haven't tested these queries either, for the same reason.

derobert