I want to learn the answer for different DB engines but in our case; we have some records that are not unique for a column and now we want to make that column unique which forces us to remove duplicate values. We use Oracle 10g. Is this reasonable? Or is this something like goto statement :) ? Should we really delete? What if we had millions of records?
To answer the question as posted: No, it can't be done on any RDBMS that I'm aware of.
However, like most things you can work around it, by doing the following.
Create a composite key, with a new column and the existing column
You can make it unique without deleting anything by adding a new column, call it PartialKey.
For existing rows you set PartialKey to a unique value (starting at Zero).
Create a unique constraint on the existing column and PartialKey (you can do this because each of these rows will now be unique).
For new rows, only use a default value of Zero for PartialKey (because zero has already been used), this will force the existing column to have unqiue values in the table.
IMPORTANT EDIT
This is weak - if you delete a row with partial key 0. Now another row can be added with a value that is already in the existing column, because the 0 in partial key will guarentee uniqueness.
You would need to ensure that either
- You never delete the row with partial key 0
- You always have a dummy row with partial key 0, and you never delete it (or you immediately reinsert it automatically)
Edit: Bite the bullet and clean the data
If as you said you've just realised that the column should be unique, then you should (if possible) clean up the data. The above approach is a hack, and you'll find yourself writing more hacks when accessing the table (you may find you've two sets of logic for dealing with queries against that table, one for where the column IS unique, and one where it's NOT. I'd clean this now or it'll come back and bite you in the arse a thousand times over.
This can be done in SQL Server.
When you create a check constraint, you can set an option to apply it either to new data only or to existing data as well. The option of applying the constraint to new data only is useful when you know that the existing data already meets the new check constraint, or when a business rule requires the constraint to be enforced only from this point forward.
for example
ALTER TABLE myTable
WITH NOCHECK
ADD CONSTRAINT myConstraint CHECK ( column > 100 )
You can do this using NOVALIDATE ENABLE constraint state, but deleting is much more preferred way.
In Oracle you can put a constraint in a enable novalidate state. When a constraint is in the enable novalidate state, all subsequent statements are checked for conformity to the constraint. However, any existing data in the table is not checked. A table with enable novalidated constraints can contain invalid data, but it is not possible to add new invalid data to it. Enabling constraints in the novalidated state is most useful in data warehouse configurations that are uploading valid OLTP data.