Partition By Clause in Oracle

Swadhin Ray
CERTIFIED EXPERT
Published:
Updated:
Edited by: Andrew Leniart
An introductory discussion about how to use Partition by Clause in Oracle.

What is a "partition by" clause in Oracle?


It is used to break the data into small partitions and is been separated by a boundary or in simple dividing the input into logical groups. The analytical functions are performed within this partitions.  So when the boundaries are crossed then the function get restarted to segregate the data. The "partition by" clause is similar to the "GROUP BY" clause that is used in aggregate functions. They are also known as query partition clause in Oracle. 


The syntax of PARTITION BY CLAUSE:


# ANALYTIC FUNCTION or FUNCTION NAME () OVER ( PARTITION BY COLUMN NAME )


Any arguments that need to be passed to the functions then it has to be passed within the parenthesis and after which we have to write the keyword OVER where the partition by will be used.   The column name is used where we want to use clause. 


Example of PARTITION BY CLAUSE:


Let us try to identify the total or cumulative cost of each department when it comes to expenditure on salary.  To show this let us take the example from the Employee table which stores the data of the employee, department ID and salary etc.. 


select * from EMPLOYEES;

On above execution, we will see the below result on SQL developer. 



If we take the example for department number 30 then we should be getting  the total as "24900":


DEPARTMENT
SALARY
30
11000
30
2500
30
3100
30
2900
30
2800
30
2600
TOTAL/CUMULATIVE
24900


Below SQL will get the above result :


SQL> SELECT
department_id,
first_name,
last_name,
salary,
SUM(salary) OVER(
PARTITION BY department_id
) total_cost_of_department
FROM
employees
WHERE
department_id = 30
ORDER BY
department_id;


DEPARTMENT_ID FIRST_NAME           LAST_NAME                     SALARY TOTAL_COST_OF_DEPARTMENT
------------- -------------------- ------------------------- ---------- ------------------------
30 Den                  Raphaely                       11000                    24900
30 Alexander            Khoo                            3100                    24900
30 Shelli               Baida                           2900                    24900
30 Sigal                Tobias                          2800                    24900
30 Guy                  Himuro                          2600                    24900
30 Karen                Colmenares                      2500                    24900

6 rows selected. 


To understand what the boundaries are we have to run the same SQL without the where clause:


SQL> SELECT
department_id,
first_name,
last_name,
salary,
SUM(salary) OVER(
PARTITION BY department_id
) total_cost_of_department
FROM
employees
ORDER BY
department_id;

Output :

DEPARTMENT_ID FIRST_NAME           LAST_NAME                     SALARY TOTAL_COST_OF_DEPARTMENT
------------- -------------------- ------------------------- ---------- ------------------------
10 Jennifer             Whalen                          4400                     4400
20 Michael              Hartstein                      13000                    19000
20 Pat                  Fay                             6000                    19000
30 Den                  Raphaely                       11000                    24900
30 Karen                Colmenares                      2500                    24900
30 Alexander            Khoo                            3100                    24900
30 Shelli               Baida                           2900                    24900
30 Sigal                Tobias                          2800                    24900
30 Guy                  Himuro                          2600                    24900
40 Susan                Mavris                          6500                     6500
50 Donald               OConnell                        2600                   156400


From the above output we can see that the break happening for each department, so each department the functions starts again as the boundary is within each department we have to get the total salaries. The marked bold letters are the starting and ending boundaries for department 30 and then the function starts again as new boundary started for the next department i.e. 40. 


So now we understand how the input data are been broken into chunks using partition by a clause in Oracle, I would suggest having to look to another article that was published on Introduction to Analytic Functions as the starting point for how to do data analysis in Oracle database. 



Thank you for reading my article, please feel free to leave me some feedback or to suggest any future topics.

I'll be looking forward to hearing from you – Swadhin Ray (Sloba)


For more information about me, please check out my Experts Exchange Profile page.





0
71,148 Views
Swadhin Ray
CERTIFIED EXPERT

Comments (0)

Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.