tags:

views:

182

answers:

7

in our database schema, we like to use delete flags. When a record is deleted, we then update that field, rather than run a delete statement. The rest of our queries then check for the delete flag when returning data.

Here is the problem:

The delete flag is a date, with a default value of NULL. This is convenient because when a record is deleted we can easily see the date that it was deleted on.

However, to enforce unique constraints properly, we need to include the delete flag in the unique constraint. The problem is, on MS SQL , it behaves in accordance to what we want (for this design), but in postgresql, if any field in a multi column unique constraint is NULL, it allows the field. This behavior fits the SQL standard, but it makes our design broken.

The options we are considering are:

  1. make a default value for the deleted field to be some hardcoded date

  2. add a bit flag for deleted, then each table would have 2 delete related fields - date_deleted and is_deleted (for example)

  3. change the date_deleted to is_deleted (bit field)

I suspect option 1 is a performance hit, each query would have to check for the hardcoded date, rather than just checking for IsNUll. Plus it feels wrong.

Option 2, also, feels wrong - 2 fields for "deleted" is non-dry.

Option 3, we lose the "date" information. There is a modified field, which would, in theory reflect the date deleted, but only assuming the last update to the row was the update to the delete bit.

So, Any suggestions? What have you done in the past to deal with "delete flags" ?

Update Thanks to everyone for the super quick, and thoughtful responses. We ended up going with a simple boolean field and a modified date field (with a trigger). I just noticed the partial index suggestion, and that looks like the perfect solution for this problem (but I havent actually tried it)

+2  A: 

One of my favourite solutions is an is_deleted bit flag, and a last_modified date field.

The last_modified field is updated automatically every time the row is modified (using any technique supported by your DBMS.) If the is_deleted bit flag is TRUE, then the last_modified value implies the time when the row was deleted.

You will then be able to set the default value of last_modified to GETDATE(). No more NULL values, and this should work with your unique constraints.

Daniel Vassallo
this is the solution we are going with. thanks(it looks like you can only accept 1 answer though)
Gush
Default constraints only apply to INSERT statements. UPDATEs would not cause such a column to be modified.
Philip Kelley
@Philip. I intended a default constraint for the INSERT only. That would timestamp a new record in the `last_modified` field automatically. UPDATES would have to be handled with triggers, etc.
Daniel Vassallo
+2  A: 

Option 3, we lose the "date" information. There is a modified field, which would, in theory reflect the date deleted, but only assuming the last update to the row was the update to the delete bit.

Is there a business reason that the record would be modified after it was deleted? If not, are you worrying about something that's not actually an issue? =)

In the system I currently work on we have the following "metadata" columns _Deleted, _CreatedStamp, _UpdatedStamp, _UpdatedUserId, _CreatedUserId ... quite a bit, but it's important for this system to carry that much data. I'd suggest going down the road of having a separate flag for Deleted to Modified Date / Deleted Date. "Diskspace is cheap", and having two fields to represent a deleted record isn't world-ending, if that's what you have to do for the RDBMS you're using.

Rob
+1  A: 

What about triggers? When a record is deleted, a post-update trigger copies the row into an archive table which has the same structure plus any additional columns, and an additional column of the date/time and perhaps the user that deleted it.

That way your "live" table only has records that are actually live, so is better performance-wise, and your application doesn't have to worry about whether a record has been deleted or not.

Andy Shellam
tirggers and archive tables seem to me like a good solution, but we are choosing the option 1 instead, because it is the simplest improvement from our existing design.
Gush
+3  A: 

If just retaining the deleted records is important to you, have you considered just moving them to a history table?

  • This could easily be achieved with a trigger.
  • Application logic doesn't need to account for this deleted flag.
  • Your tables would stay lean and mean when selecting from it.
  • It would solve your problem with unique indexes.
Lieven
thanks for this suggestion, in our case, our requirements and desire to keep it simple make us decide to go with the is_deleted, last_modified.
Gush
A: 

Would creating a multi column unique index that included the deleted date achieve the same constraint limit you need?

http://www.postgresql.org/docs/current/interactive/indexes-unique.html

Alternately, can you store a non-NULL and check that the deleted date to the minimum sql date = 0 or "1/1/1753" instead of NULL for undeleted records.

jasonk
checking for null is quicker than checking for a date value, and this check will happend all the time, so we want to avoid this solution.And we cannot use NULL since, as that link points out: "Null values are not considered equal."
Gush
"Null values are not considered equal." It's too early and I need more caffeine. The first I read that I was thinking: 56-3873457-1 (AccountNumber) + Tom Jones (AccountName) + NULL (Deleted) <> 56-3873457-1 (AccountNumber) + Tom Jones (AccountName) + '2/10/2010' Deleted not what it actually says: 56-3873457-1 (AccountNumber) + Tom Jones (AccountName) + NULL (Deleted) <> 56-3873457-1 (AccountNumber) + Tom Jones (AccountName) + NULL (Deleted
jasonk
A: 

Is it possible to exclude the deleted date field from your unique index? In what way does this field contribute to the uniqueness of each record, especially if the field is usually null?

Ray
A deleted record could cause an "undeleted" record to violate the UQ constraint, unless the delete field is in the constraint
Gush
I was picturing, say, a product table with a unique index on a SKU field. If you flag a product as deleted, you do not want to be able to add a new product with the same sku. Including the deleted date field in the unique index would allow this. Obviously, I don't know the details of your situation, so I guess this doesn't apply.
Ray
+1  A: 

Just create a conditional unique constraint:

CREATE UNIQUE INDEX i_bla ON yourtable (colname) WHERE date_deleted IS NULL;
Frank Heikens
cool. this is a "partial index" right ?http://www.postgresql.org/docs/8.0/static/indexes-partial.html I did not even know these existed.
Gush
Yep, partial index and very handy. PostgreSQL has great support for a kind of indexes, just use them, create them and optimize them.And you'd better use a more current version of PostgreSQL, or is just the manual that's so old? 8.0 is not even supported anymore... ;)
Frank Heikens