ansaurus

Question

How can I efficiently do a database massive update?

Answer 1

+3 A:

WITH q AS (
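        SELECT  m.*,
                -- Number the rows within each (CODE, ALPHA3CODE, RELATEDYEAR) group;
                -- the alias rn is what the DELETE below filters on.
                ROW_NUMBER() OVER (
                    PARTITION BY CODE, ALPHA3CODE, RELATEDYEAR
                    ORDER BY CASE WHEN PreviousValue = 'INDEFINITO' THEN 1 ELSE 0 END
                ) AS rn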
        FROM    MCS_ImportedData_GenericData m
        WHERE   PreviousValue <> 'INDEFINITO'
        )
DELETE
FROM    q
WHERE   rn > 1

Quassnoi 2009-04-09 16:19:32

Answer 2

A:

Quassnoi's answer uses SQL Server 2005+ syntax, so I thought I'd put in my tuppence worth using something more generic...

First, to delete all the duplicates but not the "original", you need a way of differentiating the duplicate records from each other. (That is what the ROW_NUMBER() part of Quassnoi's answer does.)

It would appear that in your case the source data has no identity column (you create one in the temp table). If that is the case, there are two choices that come to my mind:
1. Add the identity column to the data, then remove the duplicates
2. Create a "de-duped" set of data, delete everything from the original, and insert the de-duped data back into the original
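
For the first choice, adding the identity column might look like the sketch below. It assumes SQL Server (IDENTITY is not portable to every database), and the column name id is only illustrative.

```sql
-- Assumption: SQL Server syntax; "id" is an illustrative column name.
-- Adding an IDENTITY column to an existing table also numbers the existing rows.
ALTER TABLE MCS_ImportedData_GenericData
    ADD id INT IDENTITY(1, 1) NOT NULL;
```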

Option 1 could be something like... (With the newly created ID field)

DELETE
   [data]
FROM
   MCS_ImportedData_GenericData AS [data]
WHERE
   id > (
         SELECT
            MIN(id)
         FROM
            MCS_ImportedData_GenericData
         WHERE
            CODE = [data].CODE
            AND ALPHA3CODE = [data].ALPHA3CODE
            AND RELATEDYEAR = [data].RELATEDYEAR
        )

OR...

DELETE
   [data]
FROM
   MCS_ImportedData_GenericData AS [data]
INNER JOIN
(
   SELECT
      MIN(id) AS [id],
      CODE,
      ALPHA3CODE,
      RELATEDYEAR
   FROM
      MCS_ImportedData_GenericData
   GROUP BY
      CODE,
      ALPHA3CODE,
      RELATEDYEAR
)
AS [original]
   ON [original].CODE = [data].CODE
   AND [original].ALPHA3CODE = [data].ALPHA3CODE
   AND [original].RELATEDYEAR = [data].RELATEDYEAR
   AND [original].id <> [data].id
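
Option 2 could be sketched roughly as follows. This is only an outline under loud assumptions: SQL Server syntax, a hypothetical temp table named #deduped, and a table whose only data columns are the four named in this thread. Any other columns would have to be added to each column list, along with a rule for which duplicate's value survives (MIN(PreviousValue) here is an arbitrary choice).

```sql
-- Assumptions: SQL Server; #deduped is a hypothetical temp table name;
-- the table's data columns are CODE, ALPHA3CODE, RELATEDYEAR, PreviousValue.
BEGIN TRANSACTION;

-- 1. Build the de-duped set: one row per (CODE, ALPHA3CODE, RELATEDYEAR).
--    MIN(PreviousValue) is an arbitrary tie-breaker between duplicates.
SELECT   CODE, ALPHA3CODE, RELATEDYEAR, MIN(PreviousValue) AS PreviousValue
INTO     #deduped
FROM     MCS_ImportedData_GenericData
GROUP BY CODE, ALPHA3CODE, RELATEDYEAR;

-- 2. Delete everything from the original table.
DELETE FROM MCS_ImportedData_GenericData;

-- 3. Insert the de-duped rows back into the original table.
INSERT INTO MCS_ImportedData_GenericData (CODE, ALPHA3CODE, RELATEDYEAR, PreviousValue)
SELECT CODE, ALPHA3CODE, RELATEDYEAR, PreviousValue
FROM   #deduped;

COMMIT TRANSACTION;

DROP TABLE #deduped;
```

Wrapping the delete and re-insert in one transaction means a failure part-way through does not leave the original table empty.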

Dems 2009-04-09 18:12:31

Answer 3

A:

I don't understand the syntax used well enough to post an exact answer, but here's an approach.

Identify the rows you want to preserve (e.g. select value, ... from ... where ...)

Apply the update logic while identifying them (e.g. select value + 1 ... from ... where ...)

Do insert select to a new table.

Drop the original, rename new to original, recreate all grants/synonyms/triggers/indexes/FKs/... (or truncate the original and insert select from the new)

Obviously this has a pretty big overhead, but if you want to update/clear millions of rows, it will be the fastest way.
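
In SQL Server terms, the steps above might look roughly like the sketch below. The new table name MCS_ImportedData_GenericData_new, the WHERE filter, and the four-column list are all assumptions made for illustration; the real "rows to preserve" condition and update expressions depend on the actual requirement.

```sql
-- Assumptions: SQL Server; MCS_ImportedData_GenericData_new is a hypothetical name;
-- the column list and the WHERE filter are placeholders for the real logic.

-- Steps 1-3: select the rows to keep, apply any update logic in the
-- SELECT list, and write the result to a new table in a single pass.
SELECT CODE, ALPHA3CODE, RELATEDYEAR, PreviousValue
INTO   MCS_ImportedData_GenericData_new
FROM   MCS_ImportedData_GenericData
WHERE  PreviousValue <> 'INDEFINITO';   -- hypothetical "rows to preserve" filter

-- Step 4: swap the new table in for the original.
DROP TABLE MCS_ImportedData_GenericData;
EXEC sp_rename 'MCS_ImportedData_GenericData_new', 'MCS_ImportedData_GenericData';

-- Recreate indexes, grants, synonyms, triggers and foreign keys here;
-- SELECT ... INTO does not copy them to the new table.
```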

Michal Pravda 2009-07-07 10:17:30
