I am creating a script that for "merging" and deleting duplicate rows from a table. The table contains address information, and uses an integer field for storing information about the email as bit flags (column name lngValue). For example, lngValue & 1 == 1 means its the primary address.
There are instances of the same email being entered twice, but sometimes with different lngValues. To resolve this, I need to take the lngValue from all duplicates and assign them to one surviving record and delete the rest.
My biggest headache so far as been with the "merging" of the records. What I want to do is bitwise or all lngValues of duplicate records together. Here is what I have so far, which only finds the value of all lngValues bitwise or'ed together.
Warning: messy code ahead
declare @duplicates table
(
lngInternetPK int,
lngContactFK int,
lngValue int
)
insert into @duplicates (lngInternetPK, lngContactFK, lngValue)
(
select tblminternet.lngInternetPK, tblminternet.lngContactFK, tblminternet.lngValue from tblminternet inner join
(select strAddress, lngcontactfk, count(*) as count from tblminternet where lngValue & 256 <> 256 group by strAddress, lngcontactfk) secondemail
On tblminternet.strAddress = secondemail.strAddress and
tblminternet.lngcontactfk = secondemail.lngcontactfk
where count > 1 and tblminternet.strAddress is not null and tblminternet.lngValue & 256 <> 256 --order by lngContactFK, strAddress
)
update @duplicates set lngValue = t.val
from
(select (sum(dupes.lngValue) & 65535) as val from
(select here.lngInternetPK, here.lngContactFK, here.lngValue from tblminternet here inner join
(select strAddress, lngcontactfk, count(*) as count from tblminternet where lngValue & 256 <> 256 group by strAddress, lngcontactfk) secondemail
On here.strAddress = secondemail.strAddress and
here.lngcontactfk = secondemail.lngcontactfk
where count > 1 and here.strAddress is not null and here.lngValue & 256 <> 256) dupes, tblminternet this
where this.lngContactFK = dupes.lngContactFK
) t
where lngInternetPK in (select lngInternetPK from @duplicates)
Edit:
As requested here is some sample data:
Table Name: tblminternet
Column Names:
lngInternetPK
lngContactFK
lngValue
strAddress
Example row 1:
lngInternetPK: 1
lngContactFK: 1
lngValue: 33
strAddress: "[email protected]"
Example row 2:
lngInternetPK: 2
lngContactFK: 1
lngValue: 40
strAddress: "[email protected]"
If these two were merged here is the desired result:
lngInternetPK: 1
lngContactFK: 1
lngValue: 41
strAddress: "[email protected]"
Other necessary rules:
Each contact can have multiple emails, but each email row must be distinct ( each email can only appear as one row).