views:

35

answers:

2

I started an ASP.NET project with Entity Framework 4 for my DAL, using SQL Server 2008. In my database, I have a table Users that should have many rows (5.000.000 for example).

Initially I had my Users table designed like this:

Id  uniqueidentifier    
Name    nvarchar(128)   
Password    nvarchar(128)
Email   nvarchar(128)
Role_Id int 
Status_Id   int

I've modified my table, and added a MarkedForDeletion column:

Id  uniqueidentifier    
Name    nvarchar(128)   
Password    nvarchar(128)
Email   nvarchar(128)
Role_Id int 
Status_Id   int 
MarkedForDeletion   bit

Should I delete every entity each time, or use the MarkedForDeletion attribute. This means that I need to update the value and at some moment in time to delete all users with the value set to true with a stored procedure or something similar.

Wouldn't the update of the MarkedForDeletion attribute cost the same as a delete operation?

+2  A: 

This may depend on what you want to do with the information. For instance, you may want to mark a user for deletion but not delte all his child records (say something like forum posts), inthsi case you should markfor deletion or use a delted date field. If you do this, create a view to use for all active users (called ActiveUsers) , then insist that the view beused in any query for login or where you only want to see the active users. That will help prevent query errors from when you forget to exlude the inactive ones. If your system is active, do not make this change without going through and adjusting all queries that need to use the new view.

Another reason to use the second version is to prevent slowdowns when delting large numbers of child records. They no longer need to be deleted if you use a deleted flag. This can help performance becasue less resources are needed. Additionally you can flag records for deltion and then delte them inthe inthe middle of the night (or move to a history table) to keep the main tables smaller but still not affect performance during peak hours.

HLGEM
+2  A: 

Depending on the requirements/needs/future needs of your system, consider moving your 'deleted' entities over to a new table. Setup an 'audit' table to hold those that are deleted. Consider the case where someone wants something 'restored'.

To your question on performance: would the update be the same cost as a delete? No. The update would be a much lighter operation, especially if you had an index on the PK (errrr, that's a guid, not an int). The point being that an update to a bit field is much less expensive. A (mass) delete would force a reshuffle of the data. Perhaps that job belongs during a downtime or a low-volume period.

Regarding performance: benchmark it to see what happens! Given your table with 5 million rows, it'd be nice to see how your SQL Server performs, in its current state of indexes, paging, etc, with both scenarios. Make a backup of your database, and restore into a new database. Here you can sandbox as you like. Run & time the scenarios:

  • mass delete vs.
  • update a bit or smalldatetime field vs.
  • move to an audit table

In terms of books, try:

p.campbell
i see. can you recomend a book of "best practices" for database design/operations? thank you!
SorinA.
thank you for the advice. i'll test this weekend.
SorinA.