views:

507

answers:

10

I am designing a system and I don't think it's a good idea to give the end user the ability to delete entries in the database. I think so because the end user, once given admin rights, often ends up making a mess in the database and then turns to me to fix it.

Of course, they will need to be able to remove entries, or at least think that they did, if they are set as admin.

So, I was thinking that all the entries in the database should have an "active" field. If they try to remove an entry, it will just set the flag to "false" or something similar. Then there will be some kind of super admin that would be my company's team who could change this field.

I have already seen this done at another company I worked for, but I was wondering whether it is a good idea. I could just make regular database backups and roll back if they make a mistake, and adding this field would add some complexity to all the queries.

What do you think? Should I do it that way? Do you use this kind of trick in your applications?

+2  A: 

This is an accepted practice that exists in many applications (Drupal's versioning system, et al.). Since MySQL scales very quickly and easily, you should be okay.

Andrew Sledge
+10  A: 

A couple of reasons people do things like this are auditing and automated rollback. If a row is completely deleted, there is no way to automatically roll back that deletion if it was made in error. Also, keeping a row and its previous state around is important for auditing: a super user should be able to see who deleted what and when, as well as who changed what, etc.

Of course, that's all dependent on your current application's business logic. Some applications have no need for auditing and it may be proper to fully delete a row.
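
To make the "who deleted what and when" idea concrete, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for whatever database is actually in use. The table name `customers`, the columns `deleted_at` and `deleted_by`, and the helper names are all hypothetical and chosen only for illustration.

```python
import sqlite3
from datetime import datetime, timezone


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a soft-deletable table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE customers (
            id         INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            deleted_at TEXT,  -- NULL means the row is still active
            deleted_by TEXT   -- user who requested the deletion
        )
        """
    )
    return conn


def soft_delete(conn: sqlite3.Connection, customer_id: int, user: str) -> bool:
    """Mark a row as deleted instead of removing it; True if a row changed."""
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    cur = conn.execute(
        "UPDATE customers SET deleted_at = ?, deleted_by = ? "
        "WHERE id = ? AND deleted_at IS NULL",
        (now, user, customer_id),
    )
    return cur.rowcount == 1


def restore(conn: sqlite3.Connection, customer_id: int) -> bool:
    """Roll back a soft delete by clearing the marker columns."""
    cur = conn.execute(
        "UPDATE customers SET deleted_at = NULL, deleted_by = NULL "
        "WHERE id = ? AND deleted_at IS NOT NULL",
        (customer_id,),
    )
    return cur.rowcount == 1


def active_names(conn: sqlite3.Connection) -> list[str]:
    """Return the names of rows that end users should still see."""
    rows = conn.execute(
        "SELECT name FROM customers WHERE deleted_at IS NULL ORDER BY id"
    )
    return [row[0] for row in rows]
```

Note that `restore` in this sketch wipes the deletion markers, so a real audit trail would more likely write each change to a separate log table rather than rely on these two columns alone.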

Terry Donaghe
I have no need for audit but I get what you mean. How do you feel about implementing that kind of logic just to keep track internally with no intention of auditing?
marcgg
It totally depends on how important the information is, what it's being used for, and how quickly you expect the data in that table to grow. This is a tricky subject to give a blanket answer for. However, it won't hurt to err on the side of caution and plan on not actually deleting data until use cases or experience show you that this is hurting performance.
Terry Donaghe
+11  A: 

In one of our databases, we distinguished between transactional and dictionary records.

In a couple of words, transactional records are things that you cannot roll back in real life, like a call from a customer. You can change the caller's name, status etc., but you cannot dismiss the call itself.

Dictionary records are things that you can change, like assigning a city to a customer.

Transactional records and things that lead to them were never deleted, while dictionary ones could be deleted all right.

By "things that lead to them" I mean that as soon as the record appears in the business rules which can lead to a transactional record, this record also becomes transactional.

Like, a city can be deleted from the database. But when a rule appeared that said "send an SMS to all customers in Moscow", the cities became transactional records as well, or we would not be able to answer the question "why did this SMS get sent".

A rule of thumb for distinguishing was this: is it only my company's business?

If one of my employees made a decision based on data from the database (say, he made a report on which some management decision was based, and then the data behind the report disappeared), it was considered OK to delete these data.

But if the decision affected some immediate actions with customers (like calling, messing with the customer's balance etc.), everything that led to these decisions was kept forever.

It may vary from one business model to another: sometimes it may be required to record even internal data, and sometimes it's OK to delete data that affects the outside world.

But for our business model, the rule from above worked fine.

Quassnoi
Interesting! Do you know if it's a common practice?
marcgg
That is certainly a common practice and an excellent use case of why you wouldn't delete some data.
Terry Donaghe
It was VERY hard to implement I should say :) I would do it only if there are things out of your control, like thousands of customers, or some kinds of highly regulated activity which you should log.
Quassnoi
I'm thinking there almost HAS to be some database patterns for this sort of thing.
Terry Donaghe
Maybe, but I didn't find any and had to reinvent everything from scratch.
Quassnoi
I chose this answer as the best one because it's the one that made me think the most about the way I was doing things. But answers like Terry's, Vincent's, Chris', BigBlackDog's and others were also really helpful. Thanks!
marcgg
+3  A: 

I prefer the method that you are describing. It's nice to be able to undo a mistake. More often than not, there is no easy way of going back on a DELETE query. I've never had a problem with this method, and unless you are filling your database with 'deleted' entries, there shouldn't be an issue.

KyleFarris
+3  A: 

I use a combination of techniques to work around this issue. For some things adding the extra "active" field makes sense. The user then has the impression that an item was deleted because it no longer shows up on the application screen. The scenarios where I would implement this include items for which a history must be kept, let's say invoices and payments. I wouldn't want such things being deleted for any reason.

However, there are some items in the database that are not so sensitive, let's say a list of categories that I want to be dynamic. There I may allow users with admin privileges to add and delete a category, and the delete could be permanent. However, as part of the application logic I will check whether the category is used anywhere before allowing the delete.
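
The "check whether the category is used before deleting" step could look roughly like the following Python sketch, using the standard-library sqlite3 module as a stand-in database. The `categories` and `products` tables and the function name `delete_category_if_unused` are hypothetical and exist only for this illustration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with categories and products."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE categories (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE products (
            id          INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            category_id INTEGER NOT NULL REFERENCES categories(id)
        );
        """
    )
    return conn


def delete_category_if_unused(conn: sqlite3.Connection, category_id: int) -> bool:
    """Permanently delete a category only if no product refers to it.

    Returns True if the category was deleted, False if it is still in use.
    """
    in_use = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM products WHERE category_id = ?)",
        (category_id,),
    ).fetchone()[0]
    if in_use:
        return False
    conn.execute("DELETE FROM categories WHERE id = ?", (category_id,))
    return True
```

A check done purely in application code like this can race with a concurrent insert, so in practice it would probably be paired with an enforced foreign key constraint in the database.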

Vincent Ramdhanie
+1  A: 

I've been working on a project lately where all the data was kept in the DB as well. The status of each individual row was kept in an integer field (data could be active, deleted, in_need_for_manual_correction, historic).

You should consider using views to access only the active/historic/... data in each table. That way your queries won't get more complicated.

Another thing that made things easy was the use of UPDATE/INSERT/DELETE triggers that handled all the flag changing inside the DB and thus kept the complex stuff out of the application (for the most part).

I should mention that the DB was an MSSQL 2005 server, but I guess the same approach should work with MySQL, too.
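
As a rough illustration of the view-plus-trigger idea, here is a small Python sketch using the standard-library sqlite3 module. It is not the MSSQL setup described above: SQLite is only a convenient stand-in, trigger syntax and support for triggers on views vary between engines, and the names `items_all`, `items`, and `items_soft_delete` are hypothetical.

```python
import sqlite3

STATUS_ACTIVE = 0   # row is visible to the application
STATUS_DELETED = 1  # row is hidden but still stored


def make_db() -> sqlite3.Connection:
    """Create a base table, an active-only view, and a soft-delete trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Base table holds every row, whatever its status.
        CREATE TABLE items_all (
            id     INTEGER PRIMARY KEY,
            name   TEXT NOT NULL,
            status INTEGER NOT NULL DEFAULT 0
        );

        -- The application reads from this view and never sees deleted rows.
        CREATE VIEW items AS
            SELECT id, name FROM items_all WHERE status = 0;

        -- A DELETE against the view flips the status flag instead.
        CREATE TRIGGER items_soft_delete
        INSTEAD OF DELETE ON items
        BEGIN
            UPDATE items_all SET status = 1 WHERE id = OLD.id;
        END;
        """
    )
    return conn


def visible_names(conn: sqlite3.Connection) -> list[str]:
    """Names the application would see through the view."""
    return [row[0] for row in conn.execute("SELECT name FROM items ORDER BY id")]
```

With this in place, application code can issue an ordinary `DELETE FROM items WHERE ...` and keep its queries free of status checks, while the row itself stays in `items_all`.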

BigBlackDog
+1  A: 

Yes and no.

It will complicate your application much more than you expect, since every table that does not allow deletion will sit behind an extra check (IsDeleted=false) etc. It does not sound like much, but when you build a larger application and 9 of the 11 tables in a query require a non-deletion check, it's tedious and error prone. (Well yeah, then there are deleted/nondeleted views, when you remember to create and use them.)

Some schema upgrades will become a PITA, since you'll have to relax FKs and invent "suitable" data for very, very old rows.

I've not tried it, but I have thought a moderate amount about a solution where you'd zip the row data to XML and store that in some "Historical" table. Then in case of "must have that restored now OMG the world is dying!1eleven" it's possible to dig it out.

Pasi Savolainen
+4  A: 

The downside to just setting a flag such as IsActive or DeletedDate is that all of your queries must take that flag into account when pulling data. This makes it more likely that another programmer will accidentally forget this flag when writing reports...

A slightly better alternative is to archive that record into a different database. This way it's been physically moved to a location that is not normally searched. You might add a couple fields to capture who deleted it and when; but the point is it won't be polluting your main database.

Further, you could provide an undo feature to bring it back fairly quickly; and do a permanent delete after 30 days or something like that.
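
A minimal sketch of the archive, undo, and timed purge steps, written in Python with the standard-library sqlite3 module. For brevity it archives into a separate table in the same database rather than a different database, which is a simplification of what is described above. The tables `notes` and `notes_archive`, the `deleted_by` and `deleted_at` columns, and the function names are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def make_db() -> sqlite3.Connection:
    """Create a live table and an archive table with audit columns."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE notes (
            id   INTEGER PRIMARY KEY,
            body TEXT NOT NULL
        );
        CREATE TABLE notes_archive (
            id         INTEGER PRIMARY KEY,
            body       TEXT NOT NULL,
            deleted_by TEXT NOT NULL,
            deleted_at TEXT NOT NULL  -- UTC timestamp in ISO 8601 form
        );
        """
    )
    return conn


def archive_note(conn, note_id, user, now=None):
    """Move a row out of the live table, recording who deleted it and when."""
    now = now or datetime.now(timezone.utc)
    with conn:  # copy and delete in one transaction
        conn.execute(
            "INSERT INTO notes_archive (id, body, deleted_by, deleted_at) "
            "SELECT id, body, ?, ? FROM notes WHERE id = ?",
            (user, now.isoformat(timespec="seconds"), note_id),
        )
        conn.execute("DELETE FROM notes WHERE id = ?", (note_id,))


def undo_delete(conn, note_id):
    """Bring an archived row back into the live table."""
    with conn:
        conn.execute(
            "INSERT INTO notes (id, body) "
            "SELECT id, body FROM notes_archive WHERE id = ?",
            (note_id,),
        )
        conn.execute("DELETE FROM notes_archive WHERE id = ?", (note_id,))


def purge_old(conn, now=None, days=30):
    """Permanently delete archived rows older than `days`; return the count."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=days)).isoformat(timespec="seconds")
    with conn:
        cur = conn.execute(
            "DELETE FROM notes_archive WHERE deleted_at < ?", (cutoff,)
        )
    return cur.rowcount
```

The string comparison in `purge_old` only works because every timestamp is stored in UTC with the same fixed format; a real system would more likely use the database's native date type.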

UPDATE concerning views:

With views, the data still participates in your indexing scheme. If the amount of potentially deleted data is small, views may be just fine as they are simpler from a coding perspective.

Chris Lively
That's an interesting alternative! Wouldn't it add too much complexity to the issue compared to just working with views?
marcgg
A: 

I agree with all respondents that if you can afford to keep old data around forever it's a good idea; for performance and simplicity, I agree with the suggestion of moving "logically deleted" records to "old stuff" tables rather than adding "is_deleted" flags (moving to a totally different database seems a bit like overkill, but you can easily change to that more drastic approach later if eventually the amount of accumulated data turns out to be a problem for a single db with normal and "old stuff" tables).

Alex Martelli
+2  A: 

I suggest having a second database like DB_Archives where you add every row deleted from DB. The is_active field negates the very purpose of foreign key constraints, and YOU have to make sure that a row is not marked as deleted while it's still referenced elsewhere. This becomes overly complicated when your DB structure is massive.
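
The point about foreign keys can be seen in a small Python sketch using the standard-library sqlite3 module as a stand-in database. The `customers` and `orders` tables and the helper names are hypothetical. A real DELETE of a referenced row is rejected by the constraint, while flipping `is_active` goes through unchecked and leaves orders pointing at a "deleted" customer.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create two tables linked by an enforced foreign key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(
        """
        CREATE TABLE customers (
            id        INTEGER PRIMARY KEY,
            name      TEXT NOT NULL,
            is_active INTEGER NOT NULL DEFAULT 1
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id)
        );
        """
    )
    return conn


def hard_delete_blocked(conn: sqlite3.Connection, customer_id: int) -> bool:
    """Try a real DELETE; return True if the foreign key rejected it."""
    try:
        conn.execute("DELETE FROM customers WHERE id = ?", (customer_id,))
    except sqlite3.IntegrityError:
        return True
    return False


def orders_pointing_at_inactive(conn: sqlite3.Connection) -> int:
    """Count orders whose customer has been flagged as deleted."""
    return conn.execute(
        "SELECT COUNT(*) FROM orders o "
        "JOIN customers c ON c.id = o.customer_id "
        "WHERE c.is_active = 0"
    ).fetchone()[0]
```

Running `UPDATE customers SET is_active = 0 WHERE id = ...` on a customer with orders succeeds without complaint, so a query like `orders_pointing_at_inactive` is the kind of check the application has to perform itself.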

Mugunth Kumar
+1. Good point about IsDeleted fields somewhat negating FK integrity.
j_random_hacker