Hi,

I have a table with 10 million records. What is the fastest way to delete the old rows and retain only the last 30 days?

I know this can be done with the event scheduler, but my worry is that if the delete takes too long, it might lock the table for a long time.

It would be great if you could suggest an optimal approach.

Thanks.

A: 

Shut down your resource, SELECT .. INTO OUTFILE, parse the output, delete the table, then LOAD DATA LOCAL INFILE optimized_db.txt. It is cheaper to re-create the table than to UPDATE it.
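A rough sketch of that dump-and-reload idea, assuming a hypothetical table `events` with a `created_at` DATETIME column and a hypothetical file path (none of these names come from the question). Here the 30-day filter is applied in the SELECT itself instead of parsing the output afterwards, and plain LOAD DATA INFILE is used rather than LOCAL because INTO OUTFILE writes the file on the server host:

```sql
-- Hypothetical names: table `events`, column `created_at`, file path under /tmp.
-- Requires the FILE privilege; the output file must not already exist.

-- 1. Export only the rows worth keeping (written on the server host).
SELECT *
FROM events
WHERE created_at >= NOW() - INTERVAL 30 DAY
INTO OUTFILE '/tmp/events_last_30_days.txt';

-- 2. Empty the table in one step instead of deleting row by row.
TRUNCATE TABLE events;

-- 3. Reload the retained rows from the server-side file.
LOAD DATA INFILE '/tmp/events_last_30_days.txt'
INTO TABLE events;
```

Any rows written between steps 1 and 2 would be lost, which is presumably why the answer starts with shutting the resource down.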

mhambra
Sharpeye500
+4  A: 

Offhand, I would:

  1. Rename the table
  2. Create an empty table with the same name as your original table
  3. Grab the last 30 days from your "temp" table and insert them back into the new table
  4. Drop the temp table

This keeps the table live through (almost) the entire process and lets you pull the past 30 days' worth of data back in at your leisure.
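The steps above could look roughly like this, assuming a hypothetical table `events` with a `created_at` DATETIME column. One small variation from the list: the empty table is created first, so that the rename and the swap happen in a single RENAME TABLE statement, which MySQL performs atomically:

```sql
-- Hypothetical names: `events`, `events_new`, `events_old`, `created_at`.

-- Empty copy with the same columns and indexes as the original.
CREATE TABLE events_new LIKE events;

-- Atomic swap: the application always sees a table called `events`.
RENAME TABLE events TO events_old,
             events_new TO events;

-- Copy the last 30 days back while the new table is already in use.
INSERT INTO events
SELECT *
FROM events_old
WHERE created_at >= NOW() - INTERVAL 30 DAY;

-- Discard everything older in one cheap operation.
DROP TABLE events_old;
```

If the table has an AUTO_INCREMENT key, rows inserted into the fresh table before the copy-back may collide with old ids, so it is worth checking the counter on the new table before the swap.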

Michael Todd
Depending on how available the data needs to be, another option would be to: (1) grab the last 30 days into a new table; (2) switch the two tables (so the new table is the main one); (3) move anything that was added to the old table during the switch (probably very little).
Brendan Long
@Brendan That's certainly a more fault-tolerant solution.
Michael Todd
No, this has to be automated. I can't rename the table, as it's accessed by the application.
Sharpeye500
+1  A: 

Not that it helps you with your current problem, but if this is a regular occurrence, you might want to look into a merge table: just add tables for different periods of time, and remove them from the merge table definition when they are no longer needed. Another option is partitioning, where dropping the oldest partition is equally trivial.
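For illustration, a minimal merge-table sketch with one underlying table per month. All table and column names are hypothetical, and note that the MERGE engine only works over identically defined MyISAM tables:

```sql
-- Hypothetical names throughout; underlying tables must be MyISAM
-- and must share identical column and index definitions.
CREATE TABLE events_2010_07 (
  id         INT NOT NULL,
  created_at DATETIME NOT NULL,
  payload    VARCHAR(255),
  KEY (created_at)
) ENGINE=MyISAM;

CREATE TABLE events_2010_08 LIKE events_2010_07;

-- The merge table the application reads from and writes to.
-- INSERT_METHOD=LAST sends new rows to the last table in the UNION list.
CREATE TABLE events_all (
  id         INT NOT NULL,
  created_at DATETIME NOT NULL,
  payload    VARCHAR(255),
  KEY (created_at)
) ENGINE=MERGE UNION=(events_2010_07, events_2010_08) INSERT_METHOD=LAST;

-- Retiring the oldest period: redefine the union, then drop the old table.
ALTER TABLE events_all UNION=(events_2010_08);
DROP TABLE events_2010_07;
```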

Wrikken
+1  A: 

You could try partitioned tables.

PARTITION BY LIST (TO_DAYS( date_field ))

This would give you 1 partition per day, and when you need to prune data you just:

ALTER TABLE tbl_name DROP PARTITION p#
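A fuller sketch of daily partitions, with hypothetical table, column, and partition names. It uses RANGE rather than the LIST shown above, only because RANGE avoids enumerating each day's value explicitly; the pruning step is the same DROP PARTITION. Keep in mind that the partitioning column has to be part of every unique key, including the primary key:

```sql
-- Hypothetical names: `events`, `created_at`, partitions named pYYYYMMDD.
CREATE TABLE events (
  id         BIGINT NOT NULL AUTO_INCREMENT,
  created_at DATE NOT NULL,
  payload    VARCHAR(255),
  PRIMARY KEY (id, created_at)   -- partition column must be in the key
)
PARTITION BY RANGE (TO_DAYS(created_at)) (
  PARTITION p20100801 VALUES LESS THAN (TO_DAYS('2010-08-02')),
  PARTITION p20100802 VALUES LESS THAN (TO_DAYS('2010-08-03')),
  PARTITION p20100803 VALUES LESS THAN (TO_DAYS('2010-08-04'))
);

-- Daily maintenance: add a partition for the next day...
ALTER TABLE events ADD PARTITION (
  PARTITION p20100804 VALUES LESS THAN (TO_DAYS('2010-08-05'))
);

-- ...and drop the oldest one, which removes its rows without a row-by-row DELETE.
ALTER TABLE events DROP PARTITION p20100801;
```

Both ALTER statements could be run from a scheduled event or a cron job, which would fit the automation requirement in the question.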

http://dev.mysql.com/doc/refman/5.1/en/partitioning.html

mluebke