ansaurus

Question

Find duplicates in database and rename one

Answer 1

If you have some sort of unique id on the table:

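-- Append '_2' to every row except the highest-id row in each group of case-insensitively equal URLs.
UPDATE articles a1 SET url = a1.url || '_2'
WHERE a1.id NOT IN (SELECT max(a2.id) FROM articles a2 GROUP BY lower(a2.url));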

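If you don't have a unique id:

-- ctid is PostgreSQL's system column for a row's physical location; here it stands in for the missing id.
UPDATE articles a1 SET url = a1.url || '_2'
WHERE a1.ctid NOT IN (SELECT max(a2.ctid) FROM articles a2 GROUP BY lower(a2.url));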

rfusca 2010-07-31 00:33:53

Can you please explain how these statements work? Is this saying to update the _articles_ record that does _not_ have the maximum ID, but that shares a case-insensitive URL with another record? If so, what happens if there's more than one match? Would it convert the URLs of _all_ but the record with the maximum ID?

seh 2010-07-31 01:13:01

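It updates every row whose id is not the maximum within its group of case-insensitively equal URLs. A URL with no duplicates is its own group's maximum, so it is left alone. And yes, when there is more than one match, it would convert the URLs of all but the record with the max id.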

rfusca 2010-07-31 01:46:28
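
To check the effect before changing anything, the same predicate can be run as a SELECT. This is only a sketch that assumes the `articles` table has the unique `id` column used in the first statement; `new_url` is just an illustrative column alias.

```sql
-- Preview: list the rows the UPDATE would rename, without modifying them.
SELECT a1.id,
       a1.url,
       a1.url || '_2' AS new_url   -- what the UPDATE would write
FROM articles a1
WHERE a1.id NOT IN (SELECT max(a2.id) FROM articles a2 GROUP BY lower(a2.url))
ORDER BY lower(a1.url), a1.id;
```

If the result is empty, no URL has a case-insensitive duplicate.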

Thanks again! My own personal database savior. To catch multiple duplicates, maybe I'll run the same thing over and over, incrementing the numeral each time, until no duplicates remain. On later runs, I suppose I'd need to cook up a way to tell the database to first chop off the _2 and replace it with a _3.

WIlliam Jones 2010-07-31 04:41:27

You *could* greatly reduce the number of runs by replacing '_2' with '_'||round(random()*100). It's not perfect, but if there are only a few variations, you're unlikely to get a repeat.

rfusca 2010-07-31 05:45:55
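
The repeated runs discussed in the comments could possibly be avoided by numbering the rows of each duplicate group in one pass. The following is a sketch, not the answer's own method: it assumes PostgreSQL 8.4 or later (for window functions) and a unique `id` column, and the aliases `a`, `d` and `rn` are made up for illustration.

```sql
-- Number the rows in each group of case-insensitively equal URLs,
-- highest id first, so the max-id row gets rn = 1 and keeps its URL.
UPDATE articles a
SET url = a.url || '_' || d.rn::text   -- the rest become url_2, url_3, ...
FROM (
    SELECT id,
           row_number() OVER (PARTITION BY lower(url) ORDER BY id DESC) AS rn
    FROM articles
) d
WHERE a.id = d.id
  AND d.rn > 1;
```

As with the original statement, a renamed URL could still collide with a row that already ends in the same suffix, so it is worth rechecking for duplicates afterwards.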
