views:

758

answers:

4

I need to change a database to add a unique constraint on a table column, but the VARCHAR data in it is not unique.

How can I update those duplicate records so that each value is unique by adding a sequential number at the end of the existing data?

e.g. I would like to change 'name' to 'name1', 'name2', 'name3'

A: 

What database are you using?

In Oracle there is a:

NOVALIDATE Validates changes but does not validate data previously existing in the table

Example:

ALTER TABLE <table_name> ENABLE NOVALIDATE UNIQUE;

If you are not using Oracle then check the SQL reference for your respective database.

Azder
We're using FrontBase (which is SQL92 compliant)I want the existing data to be validated and updated
womich
-1, sorry, this is not what he's asking.
John M Gant
I know it is not what he asked, but I did write it before he even tell me he was using FrontBase, that I dont even know. Yet, the principle is same: the database could have something simmilar to the NOVALIDATE clause. Please don't mind the PRIMARY KEY if you think that the type of CONSTRAINT changes anything.
Azder
A: 

You could add another column to it... like

update mytable set mycolumn = concat(mycolumn, id) 
            where id in (<select duplicate records>);

replace id with whatever column makes mycolumn unique

danb
A: 

Open a cursor on the table, ordered by that column. Keep a previous value variable, initialized to null, and an index variable initialized to 0. If the current value = the previous value, increment the index and append the index to the field value. if the current value <> the previous value, reset the index to 0 and keep the field value as is. Set the previous value variable = the current value. Move on to the next row and repeat.

John M Gant
+1  A: 

Here are 2 examples with using the MS SQL SERVER flavor of sql.

Setup Example:

create table test (id int identity primary key, val varchar(20) )
 --id is a pk for the cursor so it can update using "where current of"

-- name a is not duplicated
-- name b is duplicated 3 times
-- name c is duplicated 2 times

insert test values('name a')
insert test values('name b')
insert test values('name c')
insert test values('name b')
insert test values('name b')
insert test values('name c')

Sql 2005\2008: ( Computed Table Expression )

begin tran; -- Computed table expressions require the statement prior to end with ;

with cte(val,row) as (

 select val, row_number() over (partition by val order by val) row
 --partiton is important. it resets the row_number on a new val
 from test 
 where val in ( -- only return values that are duplicated
  select val
  from test
  group by val
  having count(val)>1
 )
)
update cte set val = val + ltrim(str(row))
--ltrim(str(row)) = converting the int to a string and removing the padding from the str command.

select * from test

rollback

Sql 2000: (Cursor example)

begin tran

declare @row int, @last varchar(20), @current varchar(20)
set @last = ''
declare dupes cursor
 for
 select val 
 from test 
 where val in ( -- only return values that are duplicated
  select val
  from test
  group by val
  having count(val)>1
 )
 order by val

 for update of val

open dupes
fetch next from dupes into @current
while @@fetch_status = 0
begin
 --new set of dupes, like the partition by in the 2005 example
 if @last != @current
  set @row = 1

 update test
  --@last is being set during the update statement
  set val = val + ltrim(str(@row)), @last = val
  where current of dupes

 set @row = @row + 1

 fetch next from dupes into @current
end
close dupes
deallocate dupes

select * from test

rollback

I rolled back each of the updates because my script file contains both examples. This allowed me to test the functionality without resetting the rows on the table.

jhamm