views:

411

answers:

4

I have a table that holds only two columns - a ListID and PersonID. When a person is merged with another in the system, I was to update all references from the "source" person to be references to the "destination" person.

Ideally, I would like to call something simple like

UPDATE MailingListSubscription
SET PersonID = @DestPerson
WHERE PersonID = @SourcePerson

However, if the destination person already exists in this table with the same ListID as the source person, a duplicate entry will be made. How can I perform this action without creating duplicated entries? (ListID, PersonID is the primary key)

EDIT: Multiple ListIDs are used. If SourcePerson is assigned to ListIDs 1, 2, and 3, and DestinationPerson is assigned to ListIDs 3 and 4, then the end result needs to have four rows - DestinationPerson assigned to ListID 1, 2, 3, and 4.

+3  A: 
--out with the bad
DELETE
FROM MailingListSubscription
WHERE PersonId = @SourcePerson
  and ListID in (SELECT ListID FROM MailingListSubscription WHERE PersonID = @DestPerson)

--update the rest (good)
UPDATE MailingListSubscription
SET PersonId = @DestPerson
WHERE PersonId = @SourcePerson
David B
better performance with exists instead of in
Eduardo Campañó
Wrote both in query analyzer. Looked at execution plans. Showed identical plans. "Exists" vs "in" does not matter. Index on PersonId matters greatly.
David B
A: 

I have to agree with David B here. Remove all the older stuff that shouldn't be there and then do your update.

Kaius
A: 

Actually I think you should go back and reconsider your database design as you really shouldn't be in circumstances where you're changing the the primary key for a record as you're proposing to do - it implies that the PersonID column is not actually a suitable primary key in the first place.

My guess is your PersonID is exposed to your users, they've renumbered their database for some reason and you're syncing the change back in. This is generally a poor idea as it breaks audit trails and temporal consistency. In these circumstances it's generally better to use you're own non-changing primary key - usually an identity - and set up the PersonID that the users see as an attribute of that. It's extra work but will give you additional consistency and robustness in the long run.

A good rule of thumb is the primary key of a record should not be exposed to the users where possible and you should only do so after careful consideration. OK I confess to breaking this myself on numerous occasions but it's worth striving for where you can :-)

Cruachan
I don't think this is the case. It could be as simple as the data entry clerk not finding Bill Smith in the system (it's there as William Smith), so a new entry was created. The error is found later, so Bill and William Smith are merged. In both cases, the DB autogenerated the IDs.
JeremyDWill
A: 

First you should subscribe destperson to all lists that SourcePerson is subscribed to that Destperson isn't already subscibed. Then delete all the SourcePersons subscriptions. This will work with multiple ListIDs.

Insert into MailingListSubscription
(
   ListID,
   PersonID
)
Select
   ListID,
   @DestPerson
From
   MailingListSubscription as t1
Where
   PersonID = @SourcePerson and
   Not Exists
   (
      Select *
      From MailingListSubscription as t2
      Where
         PersonID = @DestPerson and
         t1.ListID = t2.ListID
   )



Delete From MailingListSubscription
Where
   PersonID = @SourcePerson
Aheho