views:

2668

answers:

3

I'm using SQL Server 2005. With the query below (simplified from my real query):

select a,count(distinct b),sum(a) from 
(select 1 a,1 b union all
select 2,2 union all
select 2,null union all
select 3,3 union all
select 3,null union all
select 3,null) a
group by a

Is there any way to do a count distinct without getting

"Warning: Null value is eliminated by an aggregate or other SET operation."

Here are the alternatives I can think of:

  1. Turning ANSI_WARNINGS off
  2. Separating into two queries, one with count distinct and a where clause to eliminate nulls, one with the sum:

    select t1.a, t1.countdistinctb, t2.suma from
    (
     select a,count(distinct b) countdistinctb from 
     (
      select 1 a,1 b union all
      select 2,2 union all
      select 2,null union all
      select 3,3 union all
      select 3,null union all
      select 3,null
     ) a
     where a.b is not null
     group by a
    ) t1
    left join
    (
     select a,sum(a) suma from 
     (
      select 1 a,1 b union all
      select 2,2 union all
      select 2,null union all
      select 3,3 union all
      select 3,null union all
      select 3,null
     ) a
     group by a
    ) t2 on t1.a=t2.a
    
  3. Ignore the warning in the client

Is there a better way to do this? I'll probably go down route 2, but don't like the code duplication.

A: 

Anywhere you have a null possibly returned, use

CASE WHEN Column IS NULL THEN -1 ELSE Column END AS Column

That will sub out all your Null Values for -1 for the duration of the query and they'll be counted/aggregated as such, then you can just do the reverse in your fine wrapping query...

SELECT  
    CASE WHEN t1.a = -1 THEN NULL ELSE t1.a END as a
    , t1.countdistinctb
    , t2.suma
Eoin Campbell
I wanted to avoid the splitting into two queries and combining. Thanks to your idea I've worked it out though, I'll post an answer.
Simon D
Don't like this idea at all! What if the data is late arriving? Or 'unknown' is a perfectly valid state?
adolf garlic
I'm specifically designing for a case where I don't want to include NULL values in the count.
Simon D
+1  A: 
select a,count(distinct isnull(b,-1))-sum(distinct case when b is null then 1 else 0 end),sum(a) from 
 (select 1 a,1 b union all
 select 2,2 union all
 select 2,null union all
 select 3,3 union all
 select 3,null union all
 select 3,null) a
 group by a

Thanks to Eoin I worked out a way to do this. You can count distinct the values including the nulls and then remove the count due to nulls if there were any using a sum distinct.

Simon D
A: 

If you don't like the code duplication then why not use a common table expression? e.g.

WITH x(a, b) AS 
        (
                select 1 a,1 b union all
                select 2,2 union all
                select 2,null union all
                select 3,3 union all
                select 3,null union all
                select 3,null
        ) 
select t1.a, t1.countdistinctb, t2.suma from
(
        select a,count(distinct b) countdistinctb from 
        x a
        where a.b is not null
        group by a
) t1
left join
(
        select a,sum(a) suma from 
        x a
        group by a
) t2 on t1.a=t2.a
onedaywhen
That's a good idea I hadn't thought of. But the code duplication I really didn't like was the group by and joins, which can't be got rid of like this. Thanks though.
Simon D