views:

123

answers:

6

I've had this come up a couple times in my career, and none of my local peers seems to be able to answer it. Say I have a table that has a "Description" field which is a candidate key, except that sometimes a user will stop halfway through the process. So for maybe 25% of the records this value is null, but for all that are not NULL, it must be unique.

Another example might be a table which must maintain multiple "versions" of a record, and a bit value indicates which one is the "active" one. So the "candidate key" is always populated, but there may be three versions that are identical (with 0 in the active bit) and only one that is active (1 in the active bit).

I have alternate methods to solve these problems (in the first case, enforce the rule code, either in the stored procedure or business layer, and in the second, populate an archive table with a trigger and UNION the tables when I need a history). I don't want alternatives (unless there are demonstrably better solutions), I'm just wondering if any flavor of SQL can express "conditional uniqueness" in this way. I'm using MS SQL, so if there's a way to do it in that, great. I'm mostly just academically interested in the problem.

A: 

Thanks for the comments, the initial version of this answer was wrong.

Here's a trick using a computed column that effectively allows a nullable unique constraint in SQL Server:

create table NullAndUnique 
    (
    id int identity, 
    name varchar(50),
    uniqueName as case 
        when name is null then cast(id as varchar(51)) 
        else name + '_' end,
    unique(uniqueName)
    )

insert into NullAndUnique default values
insert into NullAndUnique default values -- Works
insert into NullAndUnique default values -- not accidentally :)
insert into NullAndUnique (name) values ('Joel')
insert into NullAndUnique (name) values ('Joel') -- Boom!

It basically uses the id when the name is null. The + '_' is to avoid cases where name might be numeric, like 1, which could collide with the id.

Andomar
Really? From MSDN: "Columns that are used in a unique index should be set to NOT NULL, because multiple null values are considered duplicates when a unique index is created."
Arthur
No, SQL Server does *NOT* allow multiple null values in a column covered by a unique constraint. The SQL standard specifies that this should be allowed (as NULL is not equal to even itself), but Microsoft's SQL Server has not followed the standard on this point.
qstarin
+5  A: 

If you are using SQL Server 2008 a Index filter would maybe your solution:

http://msdn.microsoft.com/en-us/library/ms188783.aspx

This is how I enforce a Unique Index with multiple NULL values

CREATE UNIQUE INDEX [IDX_Blah] ON [tblBlah] ([MyCol]) WHERE [MyCol] IS NOT NULL
Arthur
Ah - function based index... DBAs loathe those
OMG Ponies
@OMG Ponies while sucky for DBAs it's about the only viable solution
msarchet
@msarchet: Work is getting in the way of helping out ATM, but the table setup described in the OP is not ideal to me. But enjoy fighting with DBAs =)
OMG Ponies
+1 Looks like the best answer, if you're on 2008+
Andomar
@OMG Ponies: eh? it's a filtered index, just like other DBMS have and finally added to SQL Server 2008...
gbn
OMG Ponies
@OMG Ponies, I wish I had a DBA to fight with :(
msarchet
@msarchet: Me too :(
OMG Ponies
Thanks Arthur, I don't have 2008 yet, so I went with Tom's answer, but this is very helpful, thanks. Sorry I didn't specify the server version, I didn't suppose it would matter as much as it did.
Michael Blackburn
@OMG Ponies, Your avatar is awesome where did you get that from, also here is a waffle #
msarchet
OMG Ponies
A: 

The other method for enforcing conditional rules that a unique index or check constraint can't handle is to use a trigger.

HLGEM
A: 

Oracle does. A fully null key is not indexed by a B*tree in index in Oracle, and Oracle uses B*tree indexes to enforce unique constraints.

Assuming one wished to version ID_COLUMN based on the ACTIVE_FLAG being set to 1:

CREATE UNIQUE INDEX idx_versioning_id ON mytable 
  (CASE active_flag WHEN 0 THEN NULL ELSE active_flag END,
   CASE active_flag WHEN 0 THEN NULL ELSE id_column   END);
Adam Musch
+1  A: 

In the case of descriptions which are not yet completed, I wouldn't have those in the same table as the finalized descriptions. The final table would then have a unique index or primary key on the description.

In the case of the active/inactive, again I might have separate tables as you did with an "archive" or "history" table, but another possible way to do it in MS SQL Server at least is through the use of an indexed view:

CREATE TABLE Test_Conditionally_Unique
(
    my_id   INT NOT NULL,
    active  BIT NOT NULL DEFAULT 0
)
GO
CREATE VIEW dbo.Test_Conditionally_Unique_View
WITH SCHEMABINDING
AS
    SELECT
        my_id
    FROM
        dbo.Test_Conditionally_Unique
    WHERE
        active = 1
GO
CREATE UNIQUE CLUSTERED INDEX IDX1 ON Test_Conditionally_Unique_View (my_id)
GO

INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (1, 0)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (1, 0)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (1, 0)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (1, 1)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (2, 0)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (2, 1)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (2, 1)    -- This insert will fail

You could use this same method for the NULL/Valued descriptions as well.

Tom H.
Thanks, this works on 2005, and I was actually just messing around with indexed views today on a different problem. This is a really great application of that!
Michael Blackburn
A: 

I'm not entirely aware of your intended use or your tables, but you could try using a one to one relationship. Split out this "sometimes" unique column into a new table, create the UNIQUE index on that column in the new table and FK back to the original table using the original tables PK. Only have a row in this new table when the "unique" data is supposed to exist.

OLD tables:

TableA
ID    pk
Col1  sometimes unique
Col...

NEW tables:

TableA
ID
Col...

TableB
ID   PK, FK to TableA.ID
Col1 unique index
KM