On a site that has a fair share of user-generated content (forum threads, blog comments, submitted articles, private and public messages, user profiles, etc.), what is the best practice for handling that content when a user terminates their account?

I'm not asking for legal advice, and I don't view this as a legal question so much as a question of striking a balance between the user, other users, and the site; terms of use can be drawn up after that balance is struck. Some of the following scenarios should be considered when a user deletes their account:

  • Private messages between users - Should the conversation trail be deleted? If so, how do you account for cases of harassment where legal evidence is needed?
  • Forum questions or answers - If the user asked a question, should the entire thread be deleted? If they answer a question, should the answer be deleted?

I'm asking because I'm implementing user accounts in a CMS. I know that Facebook recently ran into trouble over changes to its terms of use, but how do you balance a user's desire to delete with the needs and investment of the other users who also participated?

+7  A: 

Generally speaking, with databases you rarely delete anything. You can mark a record as deleted, but you keep it in your database, at least for a time.

There are many reasons for this. Some of them are legal: you may be required to keep data for a given period. Some of them are technical. Sometimes it's just a safeguard, because you may need to restore the information. The user may ask for their account to be reopened, or an account may have been locked for spamming when it had actually been compromised and has since been restored to its owner.

Old data may be deleted or archived but this may take months or even years.

Personally I just give relevant data a status column (eg 1 = active, 0 = deleted) and then just change the status rather than delete it 99% of the time.
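As a minimal sketch of that status-column approach, assuming SQLite and a hypothetical `users` table (the table, column, and function names here are illustrative, not from any particular CMS):

```python
import sqlite3

ACTIVE, DELETED = 1, 0  # status values as described above

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, nick TEXT, status INTEGER NOT NULL DEFAULT 1)"
)
conn.execute("INSERT INTO users (nick) VALUES ('alice'), ('bob')")


def soft_delete_user(conn, user_id):
    """Flag the row as deleted instead of removing it."""
    conn.execute("UPDATE users SET status = ? WHERE id = ?", (DELETED, user_id))


def active_nicks(conn):
    """Return the nicks of users that are still active."""
    rows = conn.execute(
        "SELECT nick FROM users WHERE status = ? ORDER BY id", (ACTIVE,)
    )
    return [row[0] for row in rows]


soft_delete_user(conn, 1)
total_rows = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(active_nicks(conn), total_rows)
```

The row for the deleted user is still in the table, so restoring the account is just a matter of setting the status back.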

Data integrity is another issue here. Let me give you an example.

Assume you have two entities:

User: id, nick, name, email
Message: id, sender_id, receiver_id, subject, body

You want to delete a particular User. What do you do about messages they've sent and received? Those messages will appear in someone else's inbox or sent items so you can't delete them. Do you set the relevant field in Message to NULL? That doesn't make a lot of sense either because that message did come from (or go to) somebody, even if they aren't active anymore.

You're better off just marking that user as deleted and keeping them around. It makes this and similar situations much easier to deal with.
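Here is a rough sketch of that situation, assuming SQLite and the two entities above plus a hypothetical `status` column on the user table (1 = active, 0 = deleted). The `inbox` helper is made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY, nick TEXT, name TEXT, email TEXT,
    status INTEGER NOT NULL DEFAULT 1   -- assumption: 1 = active, 0 = deleted
);
CREATE TABLE message (
    id INTEGER PRIMARY KEY,
    sender_id INTEGER NOT NULL REFERENCES users(id),
    receiver_id INTEGER NOT NULL REFERENCES users(id),
    subject TEXT, body TEXT
);
INSERT INTO users (id, nick) VALUES (1, 'alice'), (2, 'bob');
INSERT INTO message (sender_id, receiver_id, subject, body)
VALUES (1, 2, 'Hi', 'Hello Bob');
""")

# Alice closes her account: flag her row, leave her messages alone.
conn.execute("UPDATE users SET status = 0 WHERE id = 1")


def inbox(conn, user_id):
    """Return (subject, sender_nick, sender_status) for each received message."""
    return conn.execute(
        """SELECT m.subject, u.nick, u.status
           FROM message m JOIN users u ON u.id = m.sender_id
           WHERE m.receiver_id = ?""",
        (user_id,),
    ).fetchall()


bob_inbox = inbox(conn, 2)
print(bob_inbox)
```

Bob's inbox still resolves the sender, and the sender's status tells the application that the account is gone.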

You also mention forum threads and so on. You can't delete those either (unless there are other reasons to do so such as spam or abuse) because they're content that is related to other content (eg forum messages that have been replied to).

The only data you can safely and reasonably delete is child data. This is really the difference between aggregation and composition. The User and Message relationship above is aggregation. An example of composition is House and Room: delete a House and all its Rooms go too, because a Room cannot exist without a House. This is composition or, in entity-relationship terms, a parent-child relationship.
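One way to express that difference in a schema, sketched here with SQLite foreign keys (the table names are illustrative): composition gets `ON DELETE CASCADE`, while aggregation keeps the default behaviour, so the database refuses a hard delete of a user who is still referenced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled
conn.executescript("""
-- Composition: rooms are deleted together with their house.
CREATE TABLE house (id INTEGER PRIMARY KEY);
CREATE TABLE room (
    id INTEGER PRIMARY KEY,
    house_id INTEGER NOT NULL REFERENCES house(id) ON DELETE CASCADE
);
-- Aggregation: messages reference a user but are not owned by them.
CREATE TABLE users (id INTEGER PRIMARY KEY, nick TEXT);
CREATE TABLE message (
    id INTEGER PRIMARY KEY,
    sender_id INTEGER NOT NULL REFERENCES users(id),
    body TEXT
);
INSERT INTO house (id) VALUES (1);
INSERT INTO room (house_id) VALUES (1), (1);
INSERT INTO users (id, nick) VALUES (1, 'alice');
INSERT INTO message (sender_id, body) VALUES (1, 'hello');
""")

# Deleting the parent removes the children.
conn.execute("DELETE FROM house WHERE id = 1")
rooms_left = conn.execute("SELECT COUNT(*) FROM room").fetchone()[0]

# Deleting a user who is still referenced by a message is rejected.
try:
    conn.execute("DELETE FROM users WHERE id = 1")
    user_delete_blocked = False
except sqlite3.IntegrityError:
    user_delete_blocked = True

print(rooms_left, user_delete_blocked)
```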

But you'll find more instances of aggregation than composition (in my experience) so the question becomes: what do you do with that data? It's really hard to erase all traces of someone without deleting things you shouldn't. Just mark them as deleted, locked or inactive and deal with it that way.

cletus
What about in terms of public access to that data, particularly data that involved others, like a forum thread? Should it remain publicly viewable?
VirtuosiMedia
+1  A: 

You could just mark the user as deleted and then, whenever you display content involving that user, show the name as "Ex-User" or something similar.

This protects the departed user's identity without destroying your content.
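A small sketch of the idea, assuming a hypothetical `User` record with a `deleted` flag; the `display_name` helper is made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class User:
    nick: str
    deleted: bool = False


def display_name(user: User, placeholder: str = "Ex-User") -> str:
    """Return the name to show next to a user's content."""
    return placeholder if user.deleted else user.nick


print(display_name(User("alice")))
print(display_name(User("bob", deleted=True)))
```

The stored nick is untouched; only the rendering changes, so the masking is easy to undo if the account is restored.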

Steve Weet
+1  A: 

You should keep all the content and just mark the user as deleted, so other users can no longer see his or her profile, username, etc. Another user should then be able to register under the same name, since it becomes free.
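If deleted rows stay in the table, a plain unique constraint on the name would block re-registration. One possible way around that, sketched with a SQLite partial unique index and a hypothetical `status` column (1 = active, 0 = deleted):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    nick TEXT NOT NULL,
    status INTEGER NOT NULL DEFAULT 1  -- assumption: 1 = active, 0 = deleted
);
-- Uniqueness applies to active rows only, so a deleted user's nick is free again.
CREATE UNIQUE INDEX active_nick ON users(nick) WHERE status = 1;
""")

conn.execute("INSERT INTO users (nick) VALUES ('alice')")
conn.execute("UPDATE users SET status = 0 WHERE nick = 'alice'")  # original alice leaves
conn.execute("INSERT INTO users (nick) VALUES ('alice')")         # a new alice registers

try:
    conn.execute("INSERT INTO users (nick) VALUES ('alice')")     # second active alice
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

alice_rows = conn.execute(
    "SELECT COUNT(*) FROM users WHERE nick = 'alice'"
).fetchone()[0]
print(duplicate_rejected, alice_rows)
```

Old content stays attached to the old row's id, so it is not confused with the new account that reuses the name.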

Koistya Navin
+1  A: 

I've been thinking about these same issues for quite some time. Honestly, you shouldn't delete a thread started by a user who is being deleted if other people have contributed their time and effort to it. I remember one forum had a rule that you couldn't delete your thread more than some 11 hours after it had been published. I guess the idea is that you can't take your words back once you've said them.

So it's better to lock the account and not cascade-delete anything related to the user.

Especially since they could otherwise delete their account, register under the same name, and start all over again.

User