tags:

views:

2198

answers:

4

How would I be able to get N results for several groups in an oracle query.

For example, given the following table:

|--------+------------+------------|
| emp_id | name       | occupation |
|--------+------------+------------|
|      1 | John Smith | Accountant |
|      2 | Jane Doe   | Engineer   |
|      3 | Jack Black | Funnyman   |
|--------+------------+------------|

There are many more rows with more occupations. I would like to get three employees (lets say) from each occupation.

Is there a way to do this without using a subquery?

A: 

I'm not sure this is very efficient, but maybe a starting place?

select * from people p1 join people p2 on p1.occupation = p2.occupation join people p3 on p1.occupation = p3.occupation and p2.occupation = p3.occupation where p1.emp_id != p2.emp_id and p1.emp_id != p3.emp_id

This should give you rows that contain 3 distinct employees all in the same occupation. Unfortunately, it will give you ALL combinations of those.

Can anyone pare this down please?

Bill James
+5  A: 

I don't have an oracle instance handy right now so I have not tested this:

select *
from (select emp_id, name, occupation
      rank() over ( partition by occupation order by emp_id) rank
      from employee)
where rank <= 3

Here is a link on how rank works: http://www.psoug.org/reference/rank.html

jop
Didnt he specify without a subquery...?
AviD
Yes, but he may well have meant "without using a subquery that selects from the same table again". This solution uses a subquery but only accesses the table once.
Tony Andrews
A: 

tested this in SQL Server (and it uses subquery)

select emp_id, name, occupation from employees t1 where emp_id IN (select top 3 emp_id from employees t2 where t2.occupation = t1.occupation)

just do an ORDER by in the subquery to suit your needs

Leon Tayson
+1  A: 

This produces what you want, and it uses no vendor-specific SQL features like TOP N or RANK().

SELECT e.name, e.occupation 
FROM emp AS e 
  LEFT OUTER JOIN emp AS e2 
    ON (e.occupation = e2.occupation AND e.emp_id <= e2.emp_id) 
GROUP BY e.emp_id 
HAVING COUNT(*) <= 3 
ORDER BY e.occupation;

In this example it gives the three employees with the lowest emp_id values per occupation. You can change the attribute used in the inequality comparison, to make it give the top employees by name, or whatever.

Bill Karwin
I think this will fail on the GROUP BY clause. There are no aggregate functions in your select list.
jop
I did test this query before posting it. SQL doesn't require aggregate functions to use GROUP BY.
Bill Karwin