views:

1284

answers:

6

Consider the Oracle "emp" table. I'd like to get the employees with the top salary with department = 20 and job = clerk. Also assume that there is no "empno" column, and that the primary key involves a number of columns. You can do this with:

select * from scott.emp
where 
    deptno = 20 and job = 'CLERK'
    and sal =  (
        select max(sal) from scott.emp
        where deptno = 20 and job = 'CLERK')

This works, but I have to duplicate the test deptno = 20 and job = 'CLERK', which I would like to avoid. Is there a more elegant way to write this, maybe using a group by? BTW, if this matters, I am using Oracle.

+2  A: 

The following is slightly over-engineered, but is a good SQL pattern for "top x" queries.

SELECT 
 * 
FROM 
 scott.emp
WHERE 
 (deptno,job,sal) IN
 (SELECT 
   deptno,
   job,
   max(sal) 
  FROM 
   scott.emp
  WHERE 
   deptno = 20 
   and job = 'CLERK'
  GROUP BY 
   deptno,
   job
  )

Also note that this will work in Oracle and Postgress (i think) but not MS SQL. For something similar in MS SQL see question SQL Query to get latest price

Mark Nold
A: 

That's great! I didn't know you could do a comparison of (x, y, z) with the result of a SELECT statement. This works great with Oracle.

As a side-note for other readers, the above query is missing a "=" after "(deptno,job,sal)". Maybe the Stack Overflow formatter ate it (?).

Again, thanks Mark.

Alessandro Vernet
No problems, i've edited the answer :) it should be an IN
Mark Nold
+1  A: 

If I was certain of the targeted database I'd go with Mark Nold's solution, but if you ever want some dialect agnostic SQL*, try

SELECT * 
FROM scott.emp e
WHERE e.deptno = 20 
AND e.job = 'CLERK'
AND e.sal = (
  SELECT MAX(e2.sal) 
  FROM scott.emp e2
  WHERE e.deptno = e2.deptno 
  AND e.job = e2.job
)

*I believe this should work everywhere, but I don't have the environments to test it.

Steve Bosman
A: 

In Oracle you can also use the EXISTS statement, which in some cases is faster.

For example... SELECT name, number FROM cust WHERE cust IN ( SELECT cust_id FROM big_table ) AND entered > SYSDATE -1 would be slow.

but SELECT name, number FROM cust c WHERE EXISTS ( SELECT cust_id FROM big_table WHERE cust_id=c.cust_id ) AND entered > SYSDATE -1 would be very fast with proper indexing. You can also use this with multiple parameters.

Greg Ogle
A: 

There are many solutions. You could also keep your original query layout by simply adding table aliases and joining on the column names, you would still only have DEPTNO = 20 and JOB = 'CLERK' in the query once.

SELECT 
  * 
FROM 
  scott.emp emptbl
WHERE
  emptbl.DEPTNO = 20 
  AND emptbl.JOB = 'CLERK'
  AND emptbl.SAL =  
    (
      select 
        max(salmax.SAL) 
      from 
        scott.emp salmax
      where 
        salmax.DEPTNO = emptbl.DEPTNO
        AND salmax.JOB = emptbl.JOB
    )

It could also be noted that the key word "ALL" can be used for these types of queries which would allow you to remove the "MAX" function.

SELECT 
  * 
FROM 
  scott.emp emptbl
WHERE
  emptbl.DEPTNO = 20 
  AND emptbl.JOB = 'CLERK'
  AND emptbl.SAL >= ALL  
    (
      select 
        salmax.SAL
      from 
        scott.emp salmax
      where 
        salmax.DEPTNO = emptbl.DEPTNO
        AND salmax.JOB = emptbl.JOB
    )

I hope that helps and makes sense.

NateSchneider
A: 

In Oracle I'd do it with an analytical function, so you'd only query the emp table once :

SELECT *
  FROM (SELECT e.*, MAX (sal) OVER () AS max_sal
          FROM scott.emp e
         WHERE deptno = 20 
           AND job = 'CLERK')
 WHERE sal = max_sal

It's simpler, easier to read and more efficient.

If you want to modify it to list list this information for all departments, then you'll need to use the "PARTITION BY" clause in OVER:

SELECT *
  FROM (SELECT e.*, MAX (sal) OVER (PARTITION BY deptno) AS max_sal
          FROM scott.emp e
         WHERE job = 'CLERK')
 WHERE sal = max_sal
ORDER BY deptno
Gabor Kecskemeti