tags:

views:

84

answers:

4

I need to clean out a very bloated SQL database by deleting records that are older than two years from a number of tables. What is the most efficient way of doing this? Thanks in advance.

+1  A: 

Platform?

Schema?

Constraints, Foreign Key Constraints?

Any pattern to the names and/or data types of the columns which will identify the rows to be deleted?

Cade Roux
+3  A: 

Do you have any way to determine how "old" a record is? (i.e., is there a column in the table that represents either the age of the row or a date that can be used to calculate the age?). If so, it should be a simple

DELETE FROM Table WHERE Age > 2

For example, if you have a DateTime column called CreateDate, you could do this:

DELETE FROM Table WHERE DATEADD(year, 2, CreateDate) < getdate()
Adam Robinson
Could be quite nasty in terms of locks though.
Martin Smith
@Martin: I'm not sure that's avoidable, or even if you'd *want* it to be. I don't see how this is more "nasty" in terms of locks than any other `DELETE` or `UPDATE`, other than the fact that it involves a table scan.
Adam Robinson
Breaking deletes into batches could be better for concurrency.
Martin Smith
I do have a last updated datetime column. I am going to try that second one out and see if it is what I need.
Garrett Dumas
Martin is correct, I run into this all the time. In fact, in very large tables I've blown up the DB by filling up the transaction log. An alternate approach here might be a simple loop that deletes things a month at a time (from, say, 10 years old toward 2 years old, is 8 years or 96 chunks), executing each chunk as its own transaction.
Chris Wuestefeld
@Chris I've filled a disc up that way as well! Kind of counter intuitive the amount of space a delete can use. But @Garrett deleting 50,000-60,000 rows you shouldn't have any problems assuming you're not critically low on disc space already and haven't restricted the transaction log to a small size.
Martin Smith
+1  A: 

In addition to Adam Robinson's good answer: When performing this type of operation:

  1. Run a SELECT query with the DELETE's WHERE clause first to make sure you're getting "the right data"
  2. Do a full backup
  3. Run the thing in "off" hours so as not to affect users too much
Paul Sasik
A: 

Not sure what happened with my account as I can't check mark the solution or comment, but Adam Robinson's solution was exactly what I needed. Thanks for the quick responses to a SQL amateur.

Garrett Dumas
@Garrett - I think if you flag this post as needing moderator attention they will be able to merge your 2 accounts.
Martin Smith
Got it. Thanks.
Garrett Dumas
@Garrett: That happens to me sometimes due to a corporate web tracker which does not seem to handle AJAX very well. Do Adam right and accept his answer from home. ;-)
Paul Sasik