I'm trying to count the number of people by age range, and I can almost do it, but I have 2 problems:

  1. If there are no people in a given age range (NULL), then that age range does not appear in the results. For example, my data has no entries for "Over 80", so that age range does not appear. With age ranges missing, the output looks like a programming mistake.

  2. I'd like to order the results in a specific way. In the query below, because the ORDER BY is by age_range, the results for '20 - 29' come before the results for 'Under 20'.

Here's a sample of the db table "inquiries":

inquiry_id  birth_date
1           1960-02-01
2           1962-03-04
3           1970-03-08
4           1980-03-02
5           1990-02-08

Here's the query:

SELECT
    CASE
        WHEN age < 20 THEN 'Under 20'
        WHEN age BETWEEN 20 and 29 THEN '20 - 29'
        WHEN age BETWEEN 30 and 39 THEN '30 - 39'
        WHEN age BETWEEN 40 and 49 THEN '40 - 49'
        WHEN age BETWEEN 50 and 59 THEN '50 - 59'
        WHEN age BETWEEN 60 and 69 THEN '60 - 69'
        WHEN age BETWEEN 70 and 79 THEN '70 - 79'
        WHEN age >= 80 THEN 'Over 80'
        WHEN age IS NULL THEN 'Not Filled In (NULL)'
    END as age_range,
    COUNT(*) AS count

    FROM (SELECT TIMESTAMPDIFF(YEAR, birth_date, CURDATE()) AS age FROM inquiries) as derived

    GROUP BY age_range

    ORDER BY age_range

Here's a simple solution based on the suggestion by Wrikken:

SELECT
    SUM(IF(age < 20,1,0)) as 'Under 20',
    SUM(IF(age BETWEEN 20 and 29,1,0)) as '20 - 29',
    SUM(IF(age BETWEEN 30 and 39,1,0)) as '30 - 39',
    SUM(IF(age BETWEEN 40 and 49,1,0)) as '40 - 49',
    SUM(IF(age BETWEEN 50 and 59,1,0)) as '50 - 59',
    SUM(IF(age BETWEEN 60 and 69,1,0)) as '60 - 69',
    SUM(IF(age BETWEEN 70 and 79,1,0)) as '70 - 79',
    SUM(IF(age >=80, 1, 0)) as 'Over 80',
    SUM(IF(age IS NULL, 1, 0)) as 'Not Filled In (NULL)'

FROM (SELECT TIMESTAMPDIFF(YEAR, birth_date, CURDATE()) AS age FROM inquiries) as derived
A: 
  1. Create a table that contains all ranges and use outer join.
  2. Order by numeric value in another column of that table

    SELECT range, .... FROM ranges LEFT JOIN (Your subquery) ON (ranges.range = your_range) ... ORDER BY ranges.year ASC

Naktibalda
This seems to be a solution that will take care of both my initial problems, however I'm having trouble figuring out the JOIN clause. In the solution above, what would "your_range" be?
Mitchell
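In the sketch above, "your_range" would presumably be the age_range label produced by the CASE expression in the question's query, matched against a label column in the ranges table. A variant that avoids repeating the CASE is to store the bounds in the lookup table and join on them. The following is only a minimal sketch under assumptions: the table age_ranges and its columns (sort_order, label, min_age, max_age) are hypothetical names not taken from the answer, inquiry_id is assumed to be non-NULL, and birth dates are assumed not to lie in the future.

```sql
-- Hypothetical lookup table: one row per bucket, plus a numeric sort key.
-- The name `range` is avoided because RANGE is a reserved word in MySQL 8.0.
CREATE TABLE age_ranges (
    sort_order INT PRIMARY KEY,      -- controls the display order
    label      VARCHAR(30) NOT NULL, -- text shown in the result
    min_age    INT NULL,             -- inclusive lower bound; NULL marks the "not filled in" bucket
    max_age    INT NULL              -- inclusive upper bound
);

INSERT INTO age_ranges (sort_order, label, min_age, max_age) VALUES
    (1, 'Under 20',              0,    19),   -- assumes no negative ages
    (2, '20 - 29',               20,   29),
    (3, '30 - 39',               30,   39),
    (4, '40 - 49',               40,   49),
    (5, '50 - 59',               50,   59),
    (6, '60 - 69',               60,   69),
    (7, '70 - 79',               70,   79),
    (8, 'Over 80',               80,   999),  -- 999 is an arbitrary upper limit
    (9, 'Not Filled In (NULL)',  NULL, NULL);

-- Every bucket appears because age_ranges is the left side of the join.
-- COUNT(d.inquiry_id) ignores the NULLs produced for unmatched buckets,
-- so an empty bucket is reported as 0 instead of disappearing.
SELECT r.label             AS age_range,
       COUNT(d.inquiry_id) AS people
FROM age_ranges AS r
LEFT JOIN (
    SELECT inquiry_id,
           TIMESTAMPDIFF(YEAR, birth_date, CURDATE()) AS age
    FROM inquiries
) AS d
    ON d.age BETWEEN r.min_age AND r.max_age
    OR (d.age IS NULL AND r.min_age IS NULL)  -- rows without a birth_date
GROUP BY r.sort_order, r.label
ORDER BY r.sort_order;
```

Sorting on sort_order rather than on the label is what puts 'Under 20' ahead of '20 - 29', and the LEFT JOIN is what keeps 'Over 80' in the output when nobody falls into it.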
A: 

An alternative to the range table (which has my preference), a single-row answer could be:

SELECT
    SUM(IF(age < 20,1,0)) as 'Under 20',
    SUM(IF(age BETWEEN 20 and 29,1,0)) as '20 - 29',
    SUM(IF(age BETWEEN 30 and 39,1,0)) as '30 - 39',
...etc.
FROM (SELECT TIMESTAMPDIFF(YEAR, birth_date, CURDATE()) AS age FROM inquiries) as derived;
Wrikken
I just tried the SUM approach, and it's simple and works perfectly. It returns the SUMs in the order specified. I've put the final solution in the original question in case anyone wants to see it.
Mitchell