ansaurus

Question

Update column to be different aggregate values

Answer 1

A:

I believe the following query gets you what you want. It assumes at most two duplicate addresses per contact; if a contact can have more than two duplicates of the same address, the query will have to be modified. I hope this helps.

Declare @tblminternet 
Table 
( lngInternetPK int,   
  lngContactFK int,  
  lngValue int, 
  strAddress varchar(255)
)

Insert Into @tblminternet 
select 1, 1, 33, '[email protected]' 
union
select 2, 1, 40, '[email protected]'
union 
select 3, 2, 33, '[email protected]'
union 
select 4, 2, 40, '[email protected]'
union 
select 5, 3, 2, '[email protected]'

--Select * from @tblminternet

Select  Distinct   
    A.lngContactFK , 
    A.lngValue | B.lngValue as 'Bitwise OR', 
    A.strAddress
From @tblminternet A, @tblminternet B
Where A.lngContactFK = B.lngContactFK
And A.strAddress = B.strAddress
And A.lngInternetPK != B.lngInternetPK
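
To see why the self-join above is limited to two duplicates, here is a minimal Python sketch (not T-SQL) that mimics the join on in-memory values. The helper name `pairwise_or` and the sample values `1, 2, 4` are assumptions made up for illustration; only `33` and `40` come from the sample data.

```python
from itertools import permutations

def pairwise_or(values):
    """Mimic the self-join: OR every ordered pair of distinct rows, then DISTINCT."""
    return sorted({a | b for a, b in permutations(values, 2)})

# Two duplicates: the join yields a single combined value.
two = pairwise_or([33, 40])

# Three duplicates (hypothetical values): the join yields several partial
# results, and none of them is the full OR of all three rows (7).
three = pairwise_or([1, 2, 4])

print(two)    # [41]
print(three)  # [3, 5, 6]
```

With three or more duplicates, each output row only combines two of the source rows, so a true aggregate over the whole group is needed.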

Ben Griswold 2009-07-13 16:48:46

This gives me a good idea of where to start, but the problem is that I am dealing with a few hundred duplicates. The temp table I made holds all the duplicates that exist in tblminternet.

Corey Sunwold 2009-07-13 18:00:45

Answer 2

+1 A:

SQL Server lacks native bitwise aggregates, so we need to emulate them.

The main idea is to generate the set of bit positions from 0 to 15, apply the bitmask for each position to every value in the group and take the MAX (which gives the OR for that bit), then take the SUM of the results (which merges the per-bit masks into one value).

Then we just update the first lngInternetPK for any given (lngContactFK, strAddress) with the new value of lngValue, and delete all duplicates.

;WITH   bits AS
        (
        SELECT  0 AS b
        UNION ALL
        SELECT  b + 1
        FROM    bits
        WHERE   b < 15
        ),
        v AS
        (
        SELECT  i.*,
                (
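                SELECT  SUM(value)  -- merge the per-bit results into one value
                FROM    (
                        -- MAX of the masked values is the OR of bit b within the group
                        SELECT  MAX(lngValue & POWER(2, b)) AS value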
                        FROM    tblmInternet ii
                        CROSS JOIN
                                bits
                        WHERE   ii.lngContactFK = i.lngContactFK
                                AND ii.strAddress = i.strAddress
                        GROUP BY
                                b
                        ) q
                ) AS lngNewValue
        FROM    (
                SELECT  ii.*, ROW_NUMBER() OVER (PARTITION BY lngContactFK, strAddress ORDER BY lngInternetPK) AS rn
                FROM    tblmInternet ii
                ) i
        WHERE   rn = 1
        )
UPDATE  v
SET     lngValue = lngNewValue;

;WITH    v AS
        (
        SELECT  ii.*, ROW_NUMBER() OVER (PARTITION BY lngContactFK, strAddress ORDER BY lngInternetPK) AS rn
        FROM    tblmInternet ii
        )
DELETE  v
WHERE   rn > 1
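
The per-bit MAX followed by SUM used in the query above can be checked outside the database. The following is a minimal Python sketch of the same arithmetic, not a translation of the full statement; the function name `bitwise_or_aggregate` is an assumption for illustration, and like the query it only covers bits 0 through 15, so it assumes non-negative values below 65536.

```python
def bitwise_or_aggregate(values, bits=16):
    """OR a group of non-negative ints using only AND, MAX and SUM."""
    total = 0
    for b in range(bits):                       # b = 0..15, like the recursive bits CTE
        mask = 1 << b                           # same as POWER(2, b)
        total += max(v & mask for v in values)  # MAX per bit is the OR of that bit
    return total                                # summing the per-bit results merges the masks

# 33 and 40 are the lngValue pair from the sample data in the first answer.
merged = bitwise_or_aggregate([33, 40])
print(merged)  # 41
```

Because each masked value is either 0 or the mask itself, the MAX is the mask whenever any row has that bit set, and the per-bit results never overlap, so adding them is equivalent to OR-ing them.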

See this article in my blog for more detailed explanations:

SQL Server: aggregate bitwise OR

Quassnoi 2009-07-13 17:13:34

This is good. The only problem is that each contact can have multiple distinct emails, so we don't want to delete all the other emails associated with a contact.

Corey Sunwold 2009-07-13 18:07:28

@csunwold: so you want to keep one instance of each `(contact, email)` pair?

Quassnoi 2009-07-13 19:32:17

@Quassnoi: correct. There should be one instance of an email for every contact, but multiple distinct emails are allowed for each contact.

Corey Sunwold 2009-07-13 19:34:33

@Quassnoi: I also get the error: "Incorrect syntax near the keyword 'as'." on the first line.

Corey Sunwold 2009-07-13 19:46:55

@csunwold: see post update

Quassnoi 2009-07-13 20:14:34

@Quassnoi: I am going to do some more testing tomorrow to verify that all the data is as expected, but it appears to work. Thank you. Can you explain why adding the semicolon in front of the `WITH` statements made a difference?

Corey Sunwold 2009-07-13 20:48:18

@csunwold: Originally, Transact-SQL had no statement separators, i.e. `SELECT 1 SELECT 2` is a valid statement block with `2` statements in `Transact-SQL` but not in `ANSI`. Later `Microsoft` decided to require statement separators to be `ANSI` conformant. Requiring them for legacy statements like the one above would break backward compatibility, but it turned out to be OK for new statements like `WITH` (which was introduced in `SQL Server 2005`), so whatever precedes a `WITH` has to be terminated with a semicolon.

Quassnoi 2009-07-13 21:04:52

@Quassnoi: Awesome work. Thank you, I learned a lot.

Corey Sunwold 2009-07-13 21:08:59
