I have a table of students:

id | age
--------
0  | 25
1  | 25
2  | 23

I want to query for all students, and an additional column that counts how many students are of the same age:

id | age | count
----------------
0  | 25  | 2
1  | 25  | 2
2  | 23  | 1

What's the most efficient way of doing this? I fear that a sub-query will be slow, and I'm wondering if there's a better way. Is there?

+2  A: 

This should work:

SELECT age, count(age) 
  FROM Students 
 GROUP BY age

If you need the id as well you could include the above as a sub query like so:

SELECT S.id, S.age, C.cnt
  FROM Students  S
       INNER JOIN (SELECT age, count(age) as cnt
                     FROM Students 
                    GROUP BY age) C ON S.age = C.age
Miky Dinescu
For the second query, the outer select should use C.cnt because there is no S.cnt; otherwise you get an error: Invalid column name 'cnt'
KM
A: 
select s.id, s.age, c.count
from students s
inner join (
    select age, count(*) as count
    from students
    group by age
) c on s.age = c.age
order by id
RedFilter
update: fixed a typo
RedFilter
+1  A: 

I would do something like:

select
 A.id, A.age, B.count 
from 
 students A, 
 (select age, count(*) as count from students group by age) B
where A.age=B.age;
quosoo
+2  A: 

If you're using Oracle, then a feature called analytic functions will do the trick. It looks like this:

select id, age, count(*) over (partition by age) from students;

If you aren't using Oracle, then you'll need to join back to the counts:

select a.id, a.age, b.age_count
  from students a
  join (select age, count(*) as age_count
          from students
         group by age) b
    on a.age = b.age
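
If you want to convince yourself that the two queries return the same rows, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only an assumption standing in for whatever server you actually run, the sample rows are the ones from the question, and the `age_count` alias on the first query is added here for readability. It checks the results only and says nothing about relative speed on Oracle or SQL Server.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (an assumption;
# the thread itself is about Oracle and SQL Server).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, age INTEGER)")
conn.executemany(
    "INSERT INTO students (id, age) VALUES (?, ?)",
    [(0, 25), (1, 25), (2, 23)],  # sample data from the question
)

# Approach 1: analytic (window) function.
WINDOW_SQL = """
    SELECT id, age, COUNT(*) OVER (PARTITION BY age) AS age_count
      FROM students
     ORDER BY id
"""

# Approach 2: join back to a grouped subquery.
JOIN_SQL = """
    SELECT a.id, a.age, b.age_count
      FROM students a
      JOIN (SELECT age, COUNT(*) AS age_count
              FROM students
             GROUP BY age) b
        ON a.age = b.age
     ORDER BY a.id
"""

join_rows = conn.execute(JOIN_SQL).fetchall()

# SQLite only gained window functions in version 3.25, so guard that query.
if sqlite3.sqlite_version_info >= (3, 25, 0):
    window_rows = conn.execute(WINDOW_SQL).fetchall()
else:
    window_rows = None  # window functions unavailable in this SQLite build

for row in join_rows:
    print(row)
```

To compare performance, run both statements against your own database and look at its execution plans, since the optimizer's choices differ between products and versions.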
jbourque
+1, first query works for SQL Server 2005 and up too
KM
FYI, on SQL Server 2005, the second query runs at almost half the execution cost of the first (measured using _SET SHOWPLAN_ALL ON_). I thought the first would have been better, but the old-school join beat it.
KM