views:

508

answers:

12

Possible Duplicate:
What is the best way to delete a large number of records in t-sql?

What is the fastest way to delete massive numbers (billions) of records in SQL?

I want to delete all the records that match a simple rule like myFlag = 3.

Is it possible to add a WHERE clause to a TRUNCATE?

Another solution that could be possible is the optimization made by dropping the table indexes and re-create it after. How could I do that?

+4  A: 

Sorry it is not possible to add a WHERE clause to TRUNCATE

You will have to use a DELETE

kerchingo
+3  A: 

You can't use a WHERE with a truncate: it's all or nothing. What you can do is use a locking hint to reduce the locking overhead and potential contention.

Joel Coehoorn
A: 

Probably the only way you have would be a

DELETE FROM Table WHERE myFlag = 3

So very straightforward. Any other method might be specific to your DBMS but mass deletions are rarely on any feature list. At least for MySQL and MSSQL there is no such thing as a conditional TRUNCATE.

Joey
+1  A: 

check this out

Truncation is different than deletion. It is basically the same as saying "this table is empty." It skips the transaction log and other kinds of overhead that deletions would require. Because of this, it is all or nothing. A WHERE clause wouldn't make sense, since TRUNCATE applies to tables, not rows.

Peter Recore
Truncate doesn't skip the log, it just doesn't log each row but the pages deallocated. It's also transactionalised itself and can be rolled back bizarrely. (The initial deallocation is the IAM, not the pages)
Andrew
+2  A: 

Would it be faster to copy out the records you want to keep to a temporary table, truncate this table and then copy them back in? Bulk Insert might help here...

Jacob G
+6  A: 

If you are deleting most of the large table, save what you need, truncate, add data back...

SELECT * INTO #keep FROM bar WHERE foo <> 3

TRUNCATE TABLE bar

INSERT bar SELECT * FROM #keep
gbn
it would be faster to make #keep a permanent table "keep", and just drop the original and then rename and add constraints and indexes when you are done.
KM
And, of course, you must disable any foreign key constraint pointing to the table being swept
Rodrigo
yes, you are both correct...
gbn
@KM This is possible but first the table must be regenerated entirely, including primary key and foreign constraints, indexes, default, uniques, grants, etc. A lot of work, indeed.
Rodrigo
I dont think this option will be the best if you got millions of records
Cédric Boivin
@Rodrigo, management studio will easily generate the script to create the full table. however, you only need the basic table to copy the data into. you can add all the rest later. Just watch out for tables that refer to the original, their FKs will script with those other tables
KM
We've already tried the renaming of table and it's not fast enough.
esylvestre
@Cédric Boivin, it is not how many you have, but how many you are trying to keep after the delete. if you are keeping 2% this is great, if you are keeping 98% then this stinks!
KM
@esylvestre, I'll bet that once this huge table is dropped , and you have the new table (without any FKs, indexes exc) the rename to back to the original name will be much faster.
KM
+1  A: 

Assuming the data in records is such that

WHERE myFlag <> 3

is small...

You could drop the indexes, copy the inverse of the data you want to get rid of to a temporary table, and then TRUNCATE the original table. Then copy the inverse data from the temp table into the original table, drop the temp table, and recreate the indexes on the original table.

That may take more time than just taking a hit on the delete though.

Without more info on exactly how the data size is distributed, it's unlikely that people here can answer your question specifically unless they have ESP.

You're probably better off with finding someone at your work who's had more DBA experience and having a pow-wow on the best way to do what you're talking about.

sheepsimulator
+4  A: 

You can disable index before your delete, after your delete just enable your index. It's suppose to be more faster.

I am not sure the t-sql command

Someting like

ALTER TABLE <tablename>
DROP CONSTRAINT <pkname>

MAKE YOUR DELETE HERE

ALTER TABLE <tablename>
ADD CONSTRAINT <pkname>
PRIMARY KEY CLUSTERED/NONCLUSTERED (<col1>, <col2>,...<coln>)
END
Cédric Boivin
A: 

You cannot add a WHERE clause to a TRUNCATE ...

Depending on the percentage of the data being deleted, if you're deleting almost all of the data, you may be able to export the data you want to keep, truncate the table, then use a bulk loader to load the data back in.

You can also copy the data to save to a new table, delete the original, then rename the new table to the name of the old one, if you can't take the database down for very long.

And if those aren't options, I'd look into whatever hints you can give it to defer indexing and/or remove locking while you're doing the delete.

Joe
+2  A: 

Your idea of droping the index constrain, then adding them back again after the deletion could help. But I have no idea how to do it. Anybody here has the knowledge ?

PBelanger
management studio, script the DROP and CREATE for the table, pick thru the commands for the few you need
KM
Cedric Boivin's answer seems to be what I was aiming at. Did'nt test it yet.
PBelanger
A: 

I'm assuming this is Oracle.

As kerchingo indicated you cannot add a WHERE clause to TRUNCATE. Oracle's TRUNCATE does not really delete the records, what it does is it moves the high water mark of the table to the start. This is why it's so fast (just moving a pointer) and why you cannot COMMIT or ROLLBACK the TRUNCATE (or use the WHERE clause.)

Now, if the table's not actively being updated, and if the number of records you want to keep is a small subset of a really large table, you could:

1) Create a table as a SELECT * FROM current WHERE myFlg <> 3

2) Drop the current table

3) Rename the new table to current

4) Re-add security, triggers, indexes, etc...

This is seldom the best option, but, there are times when it may be the best.

Rhose
this is a SQL Server question, the tags may have been added after you answered
KM
But even in SQL Server the process is simliar
HLGEM
A: 

If you will only havea few records, the method of creating a new table with only the data you want to keep, dropping the old table and renaming to the oldtablename (recreating all indexes, fks, etc. can be by far the fastest way especially as far as the user is concerned.

However, if you are keeping most of the records and still deleting millions or billions, do the deletes in batches. This is often fastest and will create less locking issues for the users while the deltes are happening. See this link for details on doing this:

http://www.tek-tips.com/faqs.cfm?fid=3141

Occasionally (very occasionally) we have also had to add code to our triggers that send data to audit tables to skip the step if the delete came from the user running the long running batch process. Only do this if you are sure you will not need to have these records audited and immediately after the process is done, run a script to put the trigger back the way it was.

HLGEM