Hello everyone,

I have a very large table (more than 300 million records) that will need to be cleaned up. Roughly 80% of it will need to be deleted. The database software is MS SQL 2005. There are several indexes and statistics on the table, but no external relationships.

The best solution I have come up with so far is to put the database into "simple" recovery mode, copy all the records I want to keep into a temporary table, truncate the original table, set IDENTITY_INSERT on, and copy the data back from the temp table.

It works, but it still takes several hours to complete. Is there a faster way to do this?
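
For reference, this is roughly the sequence (simplified; the table, column, and predicate names are made up):

    -- Copy the rows to keep into a holding table
    SELECT *
    INTO dbo.BigTable_Keep
    FROM dbo.BigTable
    WHERE SomeDate >= '2010-01-01';      -- whatever the "keep" predicate is

    -- Minimally logged; also resets the identity seed
    TRUNCATE TABLE dbo.BigTable;

    -- Copy the data back, preserving the original identity values
    SET IDENTITY_INSERT dbo.BigTable ON;

    INSERT INTO dbo.BigTable (Id, SomeDate, OtherColumn)
    SELECT Id, SomeDate, OtherColumn
    FROM dbo.BigTable_Keep;

    SET IDENTITY_INSERT dbo.BigTable OFF;

    DROP TABLE dbo.BigTable_Keep;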

A: 

I am not sure what the structure of your data is. When does a row become eligible for deletion? If it is purely an ID-based or date-based thing, then you can create a new table for each day, insert your new data into the new tables, and when it comes to cleaning simply drop the required tables. Then for any selects, construct a view over all the tables. Just an idea.

EDIT (in response to comments): If you are maintaining a view over all the tables, then no, searching won't be complicated at all. The complex part is coding the dropping of the tables and the recreating of the view.
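
A rough sketch of the idea, with made-up table names (in practice the daily tables and the view redefinition would presumably be generated by a scheduled job):

    -- One physical table per day (hypothetical naming scheme)
    CREATE TABLE dbo.Events_20100614 (Id INT NOT NULL, CreatedAt DATETIME NOT NULL, Payload VARCHAR(100));
    CREATE TABLE dbo.Events_20100615 (Id INT NOT NULL, CreatedAt DATETIME NOT NULL, Payload VARCHAR(100));
    GO

    -- All selects go through a view that unions the current set of daily tables
    CREATE VIEW dbo.Events AS
        SELECT Id, CreatedAt, Payload FROM dbo.Events_20100614
        UNION ALL
        SELECT Id, CreatedAt, Payload FROM dbo.Events_20100615;
    GO

    -- "Cleaning up" a day: redefine the view without the old table, then drop it
    ALTER VIEW dbo.Events AS
        SELECT Id, CreatedAt, Payload FROM dbo.Events_20100615;
    GO
    DROP TABLE dbo.Events_20100614;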

I am assuming that you don't want your data to be locked down too much during deletes. Why not chunk the delete operations? Create a stored procedure that deletes the data in chunks, 50,000 rows at a time. This should make sure that SQL Server keeps row locks instead of a table lock. Use

WAITFOR DELAY 'x'

in your WHILE loop so that you can give other queries a bit of breathing room. Your problem is the age-old computer science trade-off: space vs. time.
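
Something along these lines (just a sketch; the table name, predicate, and delay are placeholders):

    -- Delete in 50,000-row batches, pausing between batches
    WHILE 1 = 1
    BEGIN
        DELETE TOP (50000)
        FROM dbo.BigTable
        WHERE SomeDate < '2009-01-01';   -- the "to be deleted" predicate

        IF @@ROWCOUNT = 0
            BREAK;

        WAITFOR DELAY '00:00:05';        -- let other queries breathe
    END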

uriDium
The data already exists and it's already consumed by an application. Besides, it would make searching the remaining data rather complex.
Stephane
@uriDium - That will still be logged though. The OP is trying to avoid the overhead from logging to make it quicker and hasn't mentioned anything about minimising impact on concurrent transactions. I presume there might not be any.
Martin Smith
+1  A: 

As per the comments, my suggestion would be to simply dispense with the copy-back step and promote the table containing the records to be kept to become the new main table by renaming it.

It should be quite straightforward to script out the index/statistics creation to be applied to the new table before it gets swapped in.

The clustered index should be created before the nonclustered indexes.
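
A minimal sketch of that swap, assuming a hypothetical dbo.BigTable with an identity key and one nonclustered index (all names made up):

    -- Copy the rows to keep (minimally logged under the simple or bulk-logged recovery model)
    SELECT *
    INTO dbo.BigTable_New
    FROM dbo.BigTable
    WHERE SomeDate >= '2010-01-01';

    -- Recreate the indexes/statistics on the new table, clustered index first
    CREATE UNIQUE CLUSTERED INDEX CIX_BigTable_Id ON dbo.BigTable_New (Id);
    CREATE NONCLUSTERED INDEX IX_BigTable_SomeDate ON dbo.BigTable_New (SomeDate);

    -- Swap the new table in
    DROP TABLE dbo.BigTable;
    EXEC sp_rename 'dbo.BigTable_New', 'BigTable';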

A couple of points I'm not sure about though.

  1. Whether it would be quicker to insert into a heap and then create the clustered index afterwards. (I guess not, if the insert can be done in clustered-index order.)
  2. Whether the original table should be truncated before being dropped. (I guess yes.)
Martin Smith
I guess I'll be creating the clustered index first, but it shouldn't really matter: it's a surrogate key anyway, so doing the initial select should have created the table in the correct order already. I'll make sure by adding an ORDER BY clause, though. Why would it be necessary to truncate the table before dropping it? Neither operation is log-backed, as far as I know. I'll try this and report the result.
Stephane
@Stephane I simply wasn't sure on that point. I'd be interested if you confirm either way.
Martin Smith
+1  A: 

@uriDium -- Chunking using batches of 50,000 will still escalate to a table lock, unless you have disabled lock escalation via ALTER TABLE (SQL Server 2008) or various other locking tricks.

etliens
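
For reference, the SQL Server 2008 option mentioned above looks like this (table name made up):

    ALTER TABLE dbo.BigTable SET (LOCK_ESCALATION = DISABLE);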