views:

10503

answers:

3

Can someone please explain what the "partition by" keyword does and give a simple example of it in action, as well as why one would want to use it? I have a SQL query written by someone else and I'm trying to figure out what it does.

An example of partition by:

SELECT empno, deptno,
       COUNT(*) OVER (PARTITION BY deptno) DEPT_COUNT
FROM emp

The examples I've seen online seem a bit too in-depth.

Thanks in advance!

+17  A: 

The PARTITION BY clause sets the range of records that will be used for each "GROUP" within the OVER clause.

In your example SQL, DEPT_COUNT returns the number of employees in that department, repeated on every employee record. (It is as if you're denormalising the emp table; you still return every record in the emp table.)

emp_no, dept_no, DEPT_COUNT
1, 10, 3
2, 10, 3
3, 10, 3 <- three because there are three "dept_no = 10" records.
4, 20, 2
5, 20, 2 <- two because there are two "dept_no = 20" records.
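
To try this out without an Oracle instance, here is a minimal sketch that runs the query from the question against an in-memory SQLite database through Python's sqlite3 module. SQLite (3.25 or later) stands in for Oracle here as an assumption, and the five rows are hypothetical data chosen to match the table above.

```python
import sqlite3

# Hypothetical sample data matching the five-row table above
# (not the real Oracle emp table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, deptno INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 10), (4, 20), (5, 20)],
)

# The query from the question: every emp row is kept, and each row
# carries the number of rows that share its deptno.
dept_counts = conn.execute(
    """
    SELECT empno, deptno,
           COUNT(*) OVER (PARTITION BY deptno) AS dept_count
    FROM emp
    ORDER BY empno
    """
).fetchall()

for row in dept_counts:
    print(row)
```

Each printed tuple is (empno, deptno, dept_count), one per employee.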

If there were another column (e.g., state), you could partition by it in the same way, for example to count how many departments are in each state.

It is like getting the results of a GROUP BY (SUM, AVG, etc.) without collapsing the result set to one row per group.
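
A small side-by-side sketch of that difference, again using SQLite through Python's sqlite3 module as a stand-in for Oracle (an assumption) and the same hypothetical five rows: GROUP BY returns one row per department, while PARTITION BY returns one row per employee.

```python
import sqlite3

# Hypothetical sample data: three employees in dept 10, two in dept 20.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, deptno INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 10), (4, 20), (5, 20)],
)

# GROUP BY collapses the rows: one result row per department.
grouped = conn.execute(
    "SELECT deptno, COUNT(*) FROM emp GROUP BY deptno ORDER BY deptno"
).fetchall()

# PARTITION BY keeps every row and attaches the per-department count to it.
windowed = conn.execute(
    """
    SELECT empno, deptno, COUNT(*) OVER (PARTITION BY deptno)
    FROM emp
    ORDER BY empno
    """
).fetchall()

print(len(grouped), "rows from GROUP BY")
print(len(windowed), "rows from PARTITION BY")
```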

It is useful with analytic functions such as MIN(...) OVER and MAX(...) OVER to get, for example, the lowest and highest salary in the department and then use them in a calculation against the current record's salary WITHOUT A SUB SELECT. This is often much faster.
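
The salary comparison described above can be sketched as follows. The sal column and its values are hypothetical, and SQLite via Python's sqlite3 module stands in for Oracle (an assumption). The sketch only shows that the analytic form and the sub-select form return the same rows; it does not measure speed.

```python
import sqlite3

# Hypothetical salaries; the sal column is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, deptno INTEGER, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, 10, 1000), (2, 10, 1500), (3, 10, 3000), (4, 20, 800), (5, 20, 1200)],
)

# Analytic version: department MIN and MAX are available on every row.
analytic = conn.execute(
    """
    SELECT empno, deptno, sal,
           sal - MIN(sal) OVER (PARTITION BY deptno) AS above_min,
           MAX(sal) OVER (PARTITION BY deptno) - sal AS below_max
    FROM emp
    ORDER BY empno
    """
).fetchall()

# Equivalent version with correlated sub-selects, for comparison.
subselect = conn.execute(
    """
    SELECT e.empno, e.deptno, e.sal,
           e.sal - (SELECT MIN(d.sal) FROM emp d WHERE d.deptno = e.deptno),
           (SELECT MAX(d.sal) FROM emp d WHERE d.deptno = e.deptno) - e.sal
    FROM emp e
    ORDER BY e.empno
    """
).fetchall()

for row in analytic:
    print(row)
```

Both queries produce the same result set; the analytic one simply avoids repeating a sub-select for each value.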

Read the linked AskTom article for further details.

Hope this helps.

Guy
Shouldn't the second explanation line in the code say "two because there are two "dept_no = 20" records"? Can anyone edit? :)
Camilo Díaz
Camilo, I've corrected the example. Thanks.
Guy
+4  A: 

It is the SQL extension called analytics. The "over" in the select statement tells Oracle that the function is an analytic function, not a group by function. The advantage of using analytics is that you can collect sums, counts, and a lot more with just one pass through the data instead of looping through the data with sub selects or, worse, PL/SQL.

It does look confusing at first, but it becomes second nature quickly. No one explains it better than Tom Kyte, so the link above is great.

Of course, reading the documentation is a must.

+2  A: 

EMPNO     DEPTNO DEPT_COUNT

 7839         10          4
 5555         10          4
 7934         10          4
 7782         10          4 --- 4 records in table for dept 10
 7902         20          4
 7566         20          4
 7876         20          4
 7369         20          4 --- 4 records in table for dept 20
 7900         30          6
 7844         30          6
 7654         30          6
 7521         30          6
 7499         30          6
 7698         30          6 --- 6 records in table for dept 30

Here we get the count for each deptno. Deptno 10 has 4 records in the emp table, so every row for deptno 10 shows 4; likewise, deptno 20 shows 4 and deptno 30 shows 6.