I have this SQL query that counts IDs from a table with 3 columns (ID, Country and Age):

SELECT Country, 
(CASE 
 WHEN AGE BETWEEN 0 AND 9 THEN '0-9'
 WHEN AGE BETWEEN 10 AND 19 THEN '10-19'
 WHEN AGE BETWEEN 20 AND 29 THEN '20-29'
 WHEN AGE BETWEEN 30 AND 39 THEN '30-39'
 WHEN AGE BETWEEN 40 AND 49 THEN '40-49'
 ELSE '50+'
END) Age_Bins, COUNT (DISTINCT ID)
FROM MYTABLE
GROUP BY Country, Age_Bins;

The results I get are something like:

UK '0-9' 7; 
UK '20-29' 14; 
etc... 

But I'd also like to get UK '10-19' 0 (where there are no IDs in that age bin). How can the SQL be modified to also output rows with zero counts? Thanks.

+7  A: 

Ideally you want a table of "age bins" and a table of countries to use like this:

select c.Country, b.age_bin, count(distinct m.id)
from countries c
cross join age_bins b
left outer join mytable m on m.country = c.country
                          and m.age between b.min_age and b.max_age
group by c.Country, b.age_bin

If necessary you can fake the tables like this:

WITH countries as (select distinct country from mytable),
     age_bins as (select '0-9' age_bin, 0 min_age, 9 max_age from dual
                  union all
                  select '10-19' age_bin, 10 min_age, 19 max_age from dual
                  union all
                  ...
                 )
select c.Country, b.age_bin, count(distinct m.id)
from countries c
cross join age_bins b
left outer join mytable m on m.country = c.country
                          and m.age between b.min_age and b.max_age
group by c.Country, b.age_bin
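
To see the zero-count rows appear, here is a minimal sketch that runs the same cross join / left outer join idea against an in-memory SQLite database from Python. The sample rows, the three bins and the names `conn`, `query` and `rows` are made up for illustration; SQLite has no `dual` table, so the bins are selected without a FROM clause, and other databases may need small syntax changes.

```python
import sqlite3

# Hypothetical sample data: UK has nobody aged 10-19, FR has only one person.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER, country TEXT, age INTEGER);
    INSERT INTO mytable VALUES (1, 'UK', 5), (2, 'UK', 7), (3, 'UK', 25), (4, 'FR', 12);
""")

query = """
WITH countries AS (SELECT DISTINCT country FROM mytable),
     age_bins AS (SELECT '0-9' AS age_bin, 0 AS min_age, 9 AS max_age
                  UNION ALL
                  SELECT '10-19', 10, 19
                  UNION ALL
                  SELECT '20-29', 20, 29)
SELECT c.country, b.age_bin, COUNT(DISTINCT m.id) AS n
FROM countries c
CROSS JOIN age_bins b                 -- every country paired with every bin
LEFT OUTER JOIN mytable m             -- unmatched pairs keep a NULL m.id
       ON m.country = c.country
      AND m.age BETWEEN b.min_age AND b.max_age
GROUP BY c.country, b.age_bin, b.min_age
ORDER BY c.country, b.min_age
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

COUNT(DISTINCT m.id) ignores the NULLs produced by the outer join, which is why empty country/bin pairs come back with 0 instead of disappearing.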
Tony Andrews
+1 I was half way through typing the same answer :)
matja
Damn, exactly the idea I had. Except I got the `count(distinct)` part wrong.
Constantin
+3  A: 

You could create each age bin as a case-based column returning either 0 or 1, and use SUM() instead of COUNT():

 select V.country, sum(V.Zero2Nine) as [0-9], sum(V.Ten2Nineteen) as [10-19] ...
 from
 (
   select country,
     (case when age between 0 and 9 then 1 else 0 end) as Zero2Nine,
     (case when age between 10 and 19 then 1 else 0 end) as Ten2Nineteen
   from ...
 ) as V
 group by V.country
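
As a rough illustration of this approach, the sketch below runs a three-bin version in an in-memory SQLite database from Python. The sample rows and the names `pivot_conn`, `pivot_query` and `pivot_rows` are invented for the example, and the bracketed column aliases are written with double quotes instead. Two things to keep in mind: the result has one row per country with one column per bin, and SUM() counts rows, so it only matches COUNT(DISTINCT ID) when each ID appears once in the table.

```python
import sqlite3

# Hypothetical sample data, one row per ID.
pivot_conn = sqlite3.connect(":memory:")
pivot_conn.executescript("""
    CREATE TABLE mytable (id INTEGER, country TEXT, age INTEGER);
    INSERT INTO mytable VALUES (1, 'UK', 5), (2, 'UK', 7), (3, 'UK', 25), (4, 'FR', 12);
""")

pivot_query = """
SELECT v.country,
       SUM(v.zero2nine)     AS "0-9",
       SUM(v.ten2nineteen)  AS "10-19",
       SUM(v.twenty2twenty9) AS "20-29"
FROM (
    -- one 0/1 flag column per age bin
    SELECT country,
           CASE WHEN age BETWEEN 0  AND 9  THEN 1 ELSE 0 END AS zero2nine,
           CASE WHEN age BETWEEN 10 AND 19 THEN 1 ELSE 0 END AS ten2nineteen,
           CASE WHEN age BETWEEN 20 AND 29 THEN 1 ELSE 0 END AS twenty2twenty9
    FROM mytable
) AS v
GROUP BY v.country
ORDER BY v.country
"""

pivot_rows = pivot_conn.execute(pivot_query).fetchall()
for row in pivot_rows:
    print(row)
```

Because every country row contributes a 0 or 1 to every bin column, empty bins show up as 0 without needing a separate table of bins.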
Tim
I like this solution, because it seems simple, although it doesn't directly use "COUNT". However I don't know if it has any drawbacks compared to the first solution presented above.
francogrex
I use this virtual-column approach all the time in real-world applications that require "bins" or "buckets". It performs quickly and the results are accurate. The query is simple to understand and maintain. I know of no drawbacks. How about an up-vote? :-)
Tim
Sure. I voted up for both solutions and I'm giving this the best answer :)
francogrex