I'm new to working with analytic functions.

DEPT EMP   SALARY
---- ----- ------
  10 MARY  100000
  10 JOHN  200000
  10 SCOTT 300000
  20 BOB   100000
  20 BETTY 200000
  30 ALAN  100000
  30 TOM   200000
  30 JEFF  300000

I want the department and employee with minimum salary.

Results should look like:

DEPT EMP   SALARY
---- ----- ------
  10 MARY  100000
  20 BOB   100000
  30 ALAN  100000

EDIT: Here's the SQL I have (but of course, it doesn't work, as it wants emp in the GROUP BY clause as well):

SELECT dept, 
  emp,
  MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY salary)
FROM mytable
GROUP BY dept
+3  A: 

You can use the RANK() analytic function. For example, this query will tell you where an employee ranks within their department by salary, with rank 1 going to the lowest salary:

SELECT
  dept,
  emp,
  salary,
  (RANK() OVER (PARTITION BY dept ORDER BY salary)) salary_rank_within_dept
FROM EMPLOYEES

You could then query from this where salary_rank_within_dept = 1:

SELECT * FROM
  (
    SELECT
      dept,
      emp,
      salary,
      (RANK() OVER (PARTITION BY dept ORDER BY salary)) salary_rank_within_dept
    FROM EMPLOYEES
  )
WHERE salary_rank_within_dept = 1
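
To try the query above without creating a table, here is a minimal sketch that inlines the sample rows from the question as a subquery factoring clause. The inlined name employees and the final ORDER BY are assumptions made for illustration only; everything else follows the query above.

```sql
-- Sample rows copied from the question, inlined so no table is needed.
WITH employees AS (
  SELECT 10 AS dept, 'MARY' AS emp, 100000 AS salary FROM dual UNION ALL
  SELECT 10, 'JOHN',  200000 FROM dual UNION ALL
  SELECT 10, 'SCOTT', 300000 FROM dual UNION ALL
  SELECT 20, 'BOB',   100000 FROM dual UNION ALL
  SELECT 20, 'BETTY', 200000 FROM dual UNION ALL
  SELECT 30, 'ALAN',  100000 FROM dual UNION ALL
  SELECT 30, 'TOM',   200000 FROM dual UNION ALL
  SELECT 30, 'JEFF',  300000 FROM dual
)
SELECT dept, emp, salary
FROM (
  SELECT dept,
         emp,
         salary,
         -- Rank 1 is the lowest salary in each department; ties share a rank.
         RANK() OVER (PARTITION BY dept ORDER BY salary) AS salary_rank_within_dept
  FROM employees
)
WHERE salary_rank_within_dept = 1
ORDER BY dept;
```

With the sample data this should return MARY, BOB and ALAN, one row per department, because no department has a tie on its lowest salary.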
Adam Paynter
Perfect! I didn't know about RANK() yet. Thanks.
Travis Heseman
I didn't even know about RANK() until yesterday! :)
Adam Paynter
I'm downvoting this for the reasons I outlined in my own answer: I think it's probably inefficient, and I think that the query is not a good match to the exact question being asked. I'm not saying that it won't give the correct answer, just that it doesn't express the logic of the question very well.
David Aldridge
@David: Pretty obvious from the answer timestamps that you lifted my answer while voting me down.
OMG Ponies
Your answer, rexem, doesn't use analytic functions. How did I "lift" it?
David Aldridge
Wow. You deleted my answer rexem? My completely different answer to yours? The one that used analytic functions that yours did not? The one that I put the effort into explaining why an analytic min() function was probably better than a Rank() function? Are you 15 years old, or too incompetent to see the difference?
David Aldridge
Reverted my answer.
David Aldridge
For the record, rexem edited my answer to say something like "I will not lift competitors' ideas and downvote them". Competitors? Grow up.
David Aldridge
MIN is an **aggregate** function, **requiring a GROUP BY statement unless you use PARTITION BY**.
OMG Ponies
@David: I'm not the one who lifts an answer **after** the question has already been marked, and then proceeds to mark down everyone else because the comment *in* the answer isn't enough. **Utterly pathetic**.
OMG Ponies
David Aldridge
@David: So you're saying that AskTom is wrong: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:2165134263446 ;) In the end, there's only one MIN function.
OMG Ponies
Which part of that AskTom posting do you think says that Min() is not both an aggregate function _and_ an analytic function depending on syntax? This question was tagged "Oracle" and "analytic-functions", and since you don't know either (your use of single quotes on a table alias in your answer, which would raise ORA-00923, is plain evidence of your Oracle inexperience) then maybe you should have stayed clear of it. If your contention is that Min() is not an analytic function then read the manual and be enlightened. On a personal note, drop the professional ego and think next time.
David Aldridge
Oh, hilarious: rexem edited the tags on the original question so it says "aggregate-functions" instead of Travis' "analytic-functions". The question itself says "Analytic"! So to support your bruised ego rexem, you're willing to make people look like they can't put the correct tags on the question? Are you going to edit the actual question next so that it fits your answer? http://stackoverflow.com/revisions/1533240/list
David Aldridge
@rexem: I removed the plsql tag that you added (again) also. I think that Travis knows what plsql is, and that's why he didn't add it originally and that's why he took it off after you added it the first time. I think that you don't know what PL/SQL is, rexem.
David Aldridge
Dudes, if you want to have a flame war, get a blog. I thank everyone for helping me with my query. After reviewing David's solution, I've refactored to it and marked it as the answer because I believe it to be a better answer. I now urge everyone to move on.
Travis Heseman
A: 
select e2.dept, e2.emp, e2.salary
from employee e2
where e2.salary = (select min(e1.salary) from employee e1)
Chris R
That will give you one record - the minimum for the entire table. You need to group by the department in your subselect.
OMG Ponies
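
A minimal sketch of the fix the comment above describes, assuming the same employee table: correlating the subquery on dept makes the minimum per department rather than per table. Note that this variant uses no analytic functions.

```sql
SELECT e2.dept, e2.emp, e2.salary
FROM   employee e2
WHERE  e2.salary = (
         -- Correlated on dept: the minimum is computed per department.
         SELECT MIN(e1.salary)
         FROM   employee e1
         WHERE  e1.dept = e2.dept
       );
```

Like the Min()-based analytic query, this returns every employee who ties for the lowest salary in a department.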
+2  A: 

I think that the Rank() function is not the way to go with this, for two reasons.

Firstly, it is probably less efficient than a Min()-based method.

The reason for this is that the query has to maintain an ordered list of all salaries per department as it scans the data, and the rank will then be assigned later by re-reading this list. Obviously in the absence of indexes that can be leveraged for this, you cannot assign a rank until the last data item has been read, and maintenance of the list is expensive.

So the performance of the Rank() function is dependent on the total number of elements to be scanned, and if the number is sufficient that the sort spills to disk then performance will collapse.

This is probably more efficient:

select dept,
       emp,
       salary
from
       (
       SELECT dept, 
              emp,
              salary,
              Min(salary) Over (Partition By dept) min_salary
       FROM   mytable
       )
where salary = min_salary
/

This method only requires that the query maintain a single value per department of the minimum value encountered so far. If a new minimum is encountered then the existing value is modified, otherwise the new value is discarded. The total number of elements that have to be held in memory is related to the number of departments, not the number of rows scanned.

It could be that Oracle has a code path to recognise that the Rank does not really need to be computed in this case, but I wouldn't bet on it.
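
Rather than guess, the efficiency claim can be checked on your own data by comparing execution plans. The following is only a sketch: it assumes a PLAN_TABLE is available in your schema (the default in recent Oracle versions) and makes no claim about which operations the optimizer will actually choose.

```sql
-- Plan for the Min()-based query.
EXPLAIN PLAN FOR
SELECT dept, emp, salary
FROM   (SELECT dept, emp, salary,
               MIN(salary) OVER (PARTITION BY dept) AS min_salary
        FROM   mytable)
WHERE  salary = min_salary;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Plan for the Rank()-based query, for comparison.
EXPLAIN PLAN FOR
SELECT dept, emp, salary
FROM   (SELECT dept, emp, salary,
               RANK() OVER (PARTITION BY dept ORDER BY salary) AS salary_rank
        FROM   mytable)
WHERE  salary_rank = 1;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Comparing the two plans, and timing both queries on realistic data volumes, would show whether the difference matters in practice.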

The second reason for disliking Rank() is that it just answers the wrong question. The question is not "Which records have the salary that ranks first when the salaries per department are ordered ascending", it is "Which records have the salary that is the minimum per department". That makes a big difference to me, at least.

David Aldridge
Thank you David. After considering its benefits, I refactored to your solution.
Travis Heseman
+1  A: 

I think you were pretty close with your original query. The following would run and does match your test case:

SELECT dept, 
  MIN(emp) KEEP(DENSE_RANK FIRST ORDER BY salary, ROWID) AS emp,
  MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY salary, ROWID) AS salary
FROM mytable
GROUP BY dept

In contrast to the RANK() solutions, this one guarantees at most one row per department. But that hints at a problem: what happens in a department where there are two employees on the lowest salary? The RANK() solutions will return both employees -- more than one row for the department. This answer will pick one arbitrarily and make sure there's only one for the department.

William Rose
Yeah, that's a good point on the multiple records. The Min() method(s) will retrieve all the duplicates ... and it would be trickier to get a single record back for those if one were needed.
David Aldridge
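
If a single row per department is needed from an analytic query, one possible sketch is ROW_NUMBER(), which never assigns the same number twice within a partition. Breaking ties by emp is an assumption made here for illustration; any column that makes the ordering deterministic would do.

```sql
SELECT dept, emp, salary
FROM (
  SELECT dept,
         emp,
         salary,
         -- Unlike RANK(), ROW_NUMBER() gives tied salaries distinct numbers;
         -- the secondary sort key (emp, an assumption) decides which row wins.
         ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary, emp) AS rn
  FROM mytable
)
WHERE rn = 1;
```

Without the secondary sort key, which of the tied rows gets rn = 1 is not guaranteed to be the same from one run to the next.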