views:

274

answers:

3

I currently have PHP code that handles the logic for this because I do not know how to handle it in SQL. I want to create a stored procedure that will delete all the rows except for the 5 newest for a given config_id. IE config_id = 5 gets passed to the SP so it knows which config_id it is looking to clean up.

CREATE TABLE  `TAA`.`RunHistory` (
  `id` int(11) NOT NULL auto_increment,
  `start_time` datetime default NULL,
  `stop_time` datetime default NULL,
  `success_lines` int(11) default NULL,
  `error_lines` int(11) default NULL,
  `config_id` int(11) NOT NULL,
  `file_id` int(11) NOT NULL,
  `notes` text NOT NULL,
  `log_file` longblob,
  `save` tinyint(1) NOT NULL default '0',
  PRIMARY KEY  (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=128 DEFAULT CHARSET=utf8;

Newest will be determined by start_time, if a stop_time is null but NOT the newest it should be deleted (stop_time can be null if a run was unceremoniously killed).

+9  A: 

From http://stackoverflow.com/questions/578867/sql-query-delete-all-records-from-the-table-except-latest-n

DELETE FROM `runHistory`
WHERE id NOT IN (
  SELECT id
  FROM (
    SELECT id
    FROM `runHistory`
    ORDER BY start_time DESC
    LIMIT 5 
  ) foo
);
Luis Melgratti
@Luis Melgratti: If I have a large set of IDs, will the `IN` statement be slower than other ways of doing this? This also shows that a null stop_time is irrelevant for the purposes of cleaning up the list.
manyxcxi
@manyxcxi: slower than which other ways?
Luis Melgratti
manyxcxi
@Delan slower perhaps, but correct.
Theo
@Theo, my way is much simpler and does the same thing.
Delan Azabani
@Delan, then how come you've removed your answer?
Theo
Blair McMillan
@Theo; I realise that my answer was wrong now as I hadn't read the question correctly previously. Thank you for your understanding.
Delan Azabani
A: 
begin;
declare v_start_time datetime;
declare v_id int;
#Find the id of the newest run
select id into v_id from RunHistory where start_time = (select max(start_time) from RunHistory);
#delete null stop times except for the newest run
delete from RunHistory where stop_time is null and id != v_id;
#first row is 0... skip 0-4, show 5
select start_time into v_start_time from RunHistory order by stop_time desc limit 4,1;
delete from RunHistory where start_time < v_start_time;
end;

There you go. I suggest indexing start_time. Stop_time may or may not be worth indexing. It's probably not. You can optimize that delete statement by changing it to the following, since we'll delete anything past the first five anyway:

delete from RunHistory where stop_time is null and id != v_id order by start_time desc limit 5;
Autocracy
+1  A: 

Here's a procedure I tested on MySQL 5.1.46, but it uses no subqueries so you won't get the error about no support for LIMIT in a subquery.

CREATE PROCEDURE DeleteBut5(IN c INT) BEGIN
  DECLARE i INT;
  DECLARE s DATETIME;

  SELECT id, stop_time INTO i, s
  FROM RunHistory WHERE config_id = c
  ORDER BY stop_time DESC, id DESC
  LIMIT 4, 1;

  DELETE FROM RunHistory WHERE stop_time < s OR stop_time = s AND id < i;
END

I recommend you create this covering index:

CREATE INDEX cov ON RunHistory (config_id, stop_time, id);
Bill Karwin