tags:

views:

291

answers:

6

Possible Duplicate:
What to do when I want to use database constraints but only mark as deleted instead of deleting?

Is it more appropriate to set some kind of "IsDeleted" flag in a heavily used database to simply mark records for deletion (and then delete them later), or should they be deleted directly?

I like the IsDeleted flag approach because it gives an easy option to restore data in case something went terribly wrong, and I could even provide some kind of "Undo" function to the user. The data I'm dealing with is fairly important.

I don't like IsDeleted because it really messes with data retrieval queries, as I'd have to filter by the state of the IsDeleted flag in addition to the regular query. Queries use no more than one index so I'd also assume that this would slow down things tremendously unless I create composite indexes.

So, what is more appropriate? Is there a better "middle way" to get the benefits of both, and what are you using & why?

+1  A: 

As a rule of thumb I never delete any data. The type of business I am in there are always questions suchas 'Of the customers that cancelled how many of them had a widget of size 4' If I had deleted the customer how could I get it. Or more likely if had deleted a widget of size 4 from the widget table this would cause a problem with referential integrity. An 'Active' bit flag seems to work for me and with indexing there is no big performance hit.

John Nolan
Very nice explanation by @John Nolan. You can set IsDeleted to disable the record so that it is not used in future but past records referring to this have valid references.
TheVillageIdiot
Do you have a single index on the IsDeleted (or Active) column, or did you make that column part of a composite index?
Alex
for a lookup table you may have a composite on primary key and active. Lookup tables may be small though and a table scan is quick enough.
John Nolan
@Alex - such index would certainly be ignored by database engine due to its low selectivity.
grigory
A: 

The answer is, it depends on the scenario.

Does it requires undo-delete?
what is the frequency of user doing that?
how many records will it result into over time?

If it is required, you can create tables with identical structure with name _DELETED suffix.
Customers__DELETED.

I think you should also consider what happens if there is a conflict when you Undelete the record and some other user has entered the record with similar content.

shahkalpesh
+1  A: 

I would be driven by business requirements. If the client expects you to restore deleted data instantly and undeleting data is part of business logic and/or use cases then isDeleted flag makes sense.

Otherwise, by leaving deleted data in the database, you address problems that are more suitable to be addressed by database backups and maintenance procedures.

grigory
A: 

Setting a flag to IsDeleted can be a good option. It enables you to keep the full history of your data, readily available. Of course, this can impact the size of your database greatly, which you have to consider.

I do believe its worth keeping the data, and as another option (middle way) you could use another database, maybe stored on another server which would be an exact copy of your live database, but you use this database to store all the deleted items from your live database.

This method will help in keeping the size of your database down. Plus you are keeping your deleted items in one area almost like a legacy system, which going forward you could use this Legacy database to run reports against, or even connect to via your application.

As you mentioned it's a heavy used database it sounds like your indexing strategy will just be as important.

In the end though it all comes down to your business requirements.

kevchadders
A: 

The mechanism for doing this has been discussed several times before.

Question 771197 Question 68323

My personal favourite, a deleted_at column is documented in Question 771197

Steve Weet
A: 

I have learnt that deleting the data rarely makes sense, as there's always some reporting that needs the data or more often some one deletes by mistake and needs it back. Personally I move all "deleted" items to an archive version of the database. This is then backed up separetly and reports can use it. The main DB Size is kept lean and restoring the data is not too much of an issue.

But like others have said, it depends on your business requirements and scale / size of DB. An archived / deleted field may be enough.

MJJames