tags: 
views: 104
answers: 4

I'm running into an interesting argument with a DBA who insists that the DELETE command should not be used in T-SQL, and that the proper way to remove data is instead to move the data you'd like to keep into a temp table, then drop and re-create the original table. The justification given for this is that it should prevent index fragmentation issues.
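
For reference, the pattern he's describing looks roughly like this (table and column names are made up, and this is only a sketch of the idea, not something we actually run):

    -- Hypothetical names; just the shape of the proposed approach.
    SELECT *
    INTO   #Keep
    FROM   dbo.AggregateData
    WHERE  RetainFlag = 1;            -- whatever identifies the rows to keep

    DROP TABLE dbo.AggregateData;

    -- The original CREATE TABLE, plus every index, key, constraint and
    -- permission, has to be restored here; SELECT INTO alone only
    -- produces a bare heap.
    SELECT *
    INTO   dbo.AggregateData
    FROM   #Keep;

    DROP TABLE #Keep;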

Has anyone out there heard of this as a practice, and can anyone suggest reasons why this might be a good idea? This is a fairly complex structure, and we are generally talking about small numbers of selective deletions (figure, less than 1000 rows at a time) from tables that are intended to aggregate data indefinitely.

I can't imagine a reason to do this rather than to simply reorganize/rebuild indexes where appropriate, but I would be happy to be educated if I'm missing something. Thanks!

+3  A: 

I would disagree with the DBA in this case. I would not like the idea of moving my "good" data out of the physical structure into a temp table and then back into the newly created structure once it is recreated, just to try to reduce index fragmentation. The solution is to delete the data with the DELETE statement and, if fragmentation becomes a problem, reindex at that point.
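
In other words, something along these lines (table and column names are hypothetical):

    DECLARE @BatchId int;
    SET @BatchId = 42;                      -- hypothetical batch to remove

    -- The selective delete the question describes (< 1000 rows at a time).
    DELETE FROM dbo.AggregateData
    WHERE  BatchId = @BatchId;

    -- Only if fragmentation is later shown to be a problem:
    ALTER INDEX ALL ON dbo.AggregateData REBUILD;
    -- (REORGANIZE is a lighter-weight, online alternative)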

Irwin M. Fletcher
+6  A: 

Rubbish, unless you are deleting 99% of rows.

There are some aspects to consider, such as the effect on the transaction log, but dropping a table along with its keys and indexes just to recreate it is far more complex and error-prone.

Don't you run index maintenance, which would defragment anyway...?

gbn
+1  A: 

You SHOULD be running a maintenance plan at least weekly to defragment indexes and update statistics, which makes his argument completely moot.
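
To give a concrete idea of what that maintenance does -- this is only a sketch, the index name is made up, and the 5%/30% thresholds are the commonly quoted guidelines rather than hard rules:

    -- Find fragmented indexes in the current database.
    SELECT OBJECT_NAME(ips.object_id)      AS table_name,
           i.name                          AS index_name,
           ips.avg_fragmentation_in_percent
    FROM   sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
    JOIN   sys.indexes AS i
           ON  i.object_id = ips.object_id
           AND i.index_id  = ips.index_id
    WHERE  ips.index_id > 0
      AND  ips.avg_fragmentation_in_percent > 5;

    -- REORGANIZE for light fragmentation, REBUILD for heavy, then refresh stats:
    ALTER INDEX IX_AggregateData_Date ON dbo.AggregateData REORGANIZE;
    UPDATE STATISTICS dbo.AggregateData;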

Then you have the overhead of creating the new indexes, probably in the middle of the work day. Assuming any kind of load at all and a decently sized table, that will hurt your performance far more than some index fragmentation ever will.

Very bad idea.

Donnie
+1  A: 

I agree with everyone else (the first three posts, at least). However, IT is a universe made up of "it depends", so I'll play devil's advocate.

There are circumstances--extreme circumstances--where this could make sense.

  • If you were updating the database with new structures and a mandate to clear out old data, sure -- though of course you'd have to do this during a maintenance (outage) window.
  • If you were deleting all but a very few rows out of a massively large data set (massive with regard to storage space, not merely number of rows) -- but, as per below, there are better options
  • If there was so much data and so little hard drive space that you were in danger of running out due to transaction log file bloat
  • If you didn't have to worry about foreign keys
  • If you could take the database offline for the duration of the process. Failing that, if you could ensure that locking the table or tables affected for a minute or more would not cause undue disruption of service. (Can you ever truly be 100% certain that, at the point in time you want to do the work, no one and no [scheduled] thing is actually going to use your database?)
  • If "chunkifying" the deletes (e.g. DELETE TOP 1000...) was somehow not an option
  • If TRUNCATE TABLE was not an option
  • If table partitioning-based solutions were not an option
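
For reference, the "chunkified" delete mentioned above is normally just a loop like this (table name and purge predicate are hypothetical):

    -- Delete in small batches so each transaction's log usage and lock
    -- duration stay small.
    DECLARE @rows int;
    SET @rows = 1;

    WHILE @rows > 0
    BEGIN
        DELETE TOP (1000)
        FROM   dbo.AggregateData
        WHERE  CreatedDate < '20090101';   -- hypothetical purge cutoff

        SET @rows = @@ROWCOUNT;
    END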

Then, maybe, it might make sense. But probably not.

Come to think of it, I could almost see a DBA hitting situations like some of the above early in their career, and thereafter applying those "lessons learned" to all similar situations.

Philip Kelley