views:

420

answers:

5

Hi,

I have a query that returns a result set similar to the one below (in reality it is far bigger, thousands of rows):

     A    | B  | C  |    D
     -----|----|----|-----
1    NULL | d0 | d0 | NULL
2    NULL | d0 | d1 | NULL
3    NULL | d0 | d2 |   a0
4      d0 | d1 | d1 | NULL
5      d0 | d2 | d2 |   a0

Two of the rows are considered duplicates, 1 and 2, because A, B and D are the same. To eliminate this, I could use SELECT DISTINCT A, B, D but then I do not get column C in my result set. Column C is necessary information for rows 3, 4 and 5.

So how do I come from the result set above to this one (the result appearing in C4 can also be NULL instead of d1):

     A    | B  | C    | D
     -----|----|------|-----
1    NULL | d0 | NULL | NULL
3    NULL | d0 | d2   |   a0
4      d0 | d1 | d1   | NULL
5      d0 | d2 | d2   |   a0

Kind regards,

Ronald

+5  A: 
DECLARE @YourTable TABLE (
  A VARCHAR(2)
  , B VARCHAR(2)
  , C VARCHAR(2)
  , D VARCHAR(2))

INSERT INTO @YourTable VALUES (NULL, 'd0', 'd0', NULL)
INSERT INTO @YourTable VALUES (NULL, 'd0', 'd1', NULL)
INSERT INTO @YourTable VALUES (NULL, 'd0', 'd2', 'a0')
INSERT INTO @YourTable VALUES ('d0', 'd1', 'd1', NULL)
INSERT INTO @YourTable VALUES ('d0', 'd2', 'd2', 'a0')


SELECT A, B, C = MIN(C), D
FROM @YourTable
GROUP BY A, B, D


SELECT A, B, CASE WHEN MIN(C) = MAX(C) THEN MIN(C) ELSE NULL END, D
FROM @YourTable
GROUP BY A, B, D


SELECT A, B, CASE WHEN MIN(COALESCE(C, 'dx')) = MAX(COALESCE(C, 'dx')) THEN MIN(C) ELSE NULL END, D
FROM @YourTable
GROUP BY A, B, D
Lieven
+1 - GROUP BY exists to combine "similar" rows that share the same value of a subset of columns, and thus is exactly what the OP is asking for.
Andrzej Doyle
If column C is the same for rows 1 and 2, or one value is NULL, then this does not give NULL. It works if the 2 values of column C are different and both NOT NULL
gbn
...which is undefined by OP
gbn
@gbn: the answer is updated to adress your case
Lieven
Clever solution. Thanks. I made my example a little too simple so I had a hard time applying it to the real situation but it works now.
Ronald Wildenberg
A: 

A subquery perhaps?

SELECT A,B,C,D FROM table1 WHERE EXISTS ( SELECT DISTINCT A,B,D FROM table1 );

Speedy
Tested this? Just gives 5 rows: the exists will either give all rows or no rows)
gbn
A: 

if you have an unique id in the table, then i would go for something like this:

SELECT A,B,C,D FROM table WHERE id IN (SELECT DISTINCT A,B,D)

The problem is that you would always get the first value of C, not the frist one with an value.

Gushiken
A: 

The fact you have NULLs in A and D compicates matters for any EXISTS.

Any MIN/MAX solution on C may not give you NULL as I think you want. Otherwise, use MIN(C) and a simple group by.

You have to extract the unique keys first (A, B, D), then use that to determine extract the rows again and work out what to do with C

DECLARE @TheTable TABLE (
  A varchar(2) NULL,
  B varchar(2) NULL,
  C varchar(2) NULL,
  D varchar(2) NULL
)

INSERT INTO @TheTable VALUES (NULL, 'd0', 'd0', NULL)
INSERT INTO @TheTable VALUES (NULL, 'd0', 'd1', NULL)
INSERT INTO @TheTable VALUES (NULL, 'd0', 'd2', 'a0')
INSERT INTO @TheTable VALUES ('d0', 'd1', 'd1', NULL)
INSERT INTO @TheTable VALUES ('d0', 'd2', 'd2', 'a0')

SELECT DISTINCT
    T.A,
    T.B,
    CASE Number WHEN 1 THEN T.C ELSE NULL END,
    T.D
FROM
    (SELECT
        COUNT(*) AS Number,
        A, B, D
    FROM
        @TheTable
    GROUP BY
        A, B, D
    ) UQ
    JOIN
    @TheTable T ON ISNULL(T.A, '') = ISNULL(UQ.A, '') AND ISNULL(T.B, '') = ISNULL(UQ.B, '') AND ISNULL(T.D, '') = ISNULL(UQ.D, '')
gbn
+1  A: 
Sung Meister