tags:

views:

1020

answers:

6

I have a SQL table with news stories and Unix timestamps. I'd like to only keep the 50 newest stories. How would I write an SQL statement to delete any amount of older stories?

Edit: See my answer

+5  A: 

Blockquote

delete from table where id not in (
    select id from table 
    order by id desc 
    limit 50
)

You select the ids of the data you don't want to delete, and the you delete everything NOT IN these value...

Davide Vosti
This assumes you have sequential ID's, which is usually a fair assumption. If you don't though, you'll need to make use of a timestamp or some other sequential field in your DB to do this
Orion Edwards
Gilean
Surely the subquery should be "select id from table", not "select * from table" ?
Blorgbeard
@Gilean you should tune this query and see what kind of commands your mysql version supports. This was only an input for your question...
Davide Vosti
Looks like LIMIT isn't supported in subqueries in MySQL 5
Gilean
Shouldn't order by be on the timestamp field?
enobrev
A: 

Maybe not the most efficient, but this should work:

DELETE FROM _table_ 
WHERE _date_ NOT IN (SELECT _date_ FROM _table_ ORDER BY _date_ DESC LIMIT 50)
gizmo
Gilean
A: 

Assuming this query selects the rows you want to keep:

SELECT timestampcol FROM table ORDER BY timestampcol DESC LIMIT 49,1;

Then you could use a subquery like so:

DELETE FROM table WHERE timestampcol < ( SELECT timestampcol FROM table ORDER BY timestampcol DSEC LIMIT 49,1 )

Of course, make sure you have a backup before doing anything as potentially destructive. Note that compared to the other approaches mentioned, which use IN, this one will avoid doing 50 integer comparisons for every row to be deleted, making it (potentially) 50 times faster - assuming I got my SQL right.

Dan Udey
This doesn't work (#1093 - You can't specify target table 'entries' for update in FROM clause). See http://stackoverflow.com/questions/45494/sql-delete-cant-specify-target-table-for-update-in-from-clause
Gilean
+3  A: 

Well, it sort of looks like you can't do it in one query - someone correct me if I'm wrong. The only way I've ever been able to do this sort of thing is to first figure out the number of rows in the table i.e.

select count(*) from table;

then using the result do

delete from table order by timestamp limit result - 50;

You have to do it this way for two reasons -

  1. MySQL 5 doesn't support limit in subqueries for delete
  2. MySQL 5 doesn't allow you to select in a subquery from the same table you are deleting from.
Max Stewart
+2  A: 

If you have a lot of rows, it might be better to put the 50 rows in a temporary table then use TRUNCATE TABLE to empty the table out. Then put the 50 rows back in.

EvilTeach
Truncate resets the auto increment counter - so if your IDs are on auto-increment you would need to modify the next value of auto-increment after truncating the table so that the next ID comes after the highest value of the data you kept.
Max Stewart
Nice comment, thanks for pointing that out.
EvilTeach
+3  A: 

I ended up using two queries since MySQL5 doesn't yet support LIMIT's in subqueries

SELECT unixTime FROM entries ORDER BY unixTime DESC LIMIT 49, 1;
DELETE FROM entries WHERE unixTime < $sqlResult;
Gilean