What is a "partition by" clause in Oracle?
It is used to break the data into small partitions and is been separated by a boundary or in simple dividing the input into logical groups. The analytical functions are performed within this partitions. So when the boundaries are crossed then the function get restarted to segregate the data. The "partition by" clause is similar to the "GROUP BY" clause that is used in aggregate functions. They are also known as query partition clause in Oracle.
The syntax of PARTITION BY CLAUSE:
# ANALYTIC FUNCTION or FUNCTION NAME () OVER ( PARTITION BY COLUMN NAME )
Any arguments that need to be passed to the functions then it has to be passed within the parenthesis and after which we have to write the keyword OVER where the partition by will be used. The column name is used where we want to use clause.
Example of PARTITION BY CLAUSE:
Let us try to identify the total or cumulative cost of each department when it comes to expenditure on salary. To show this let us take the example from the Employee table which stores the data of the employee, department ID and salary etc..
select * from EMPLOYEES;
On above execution, we will see the below result on SQL developer.
If we take the example for department number 30 then we should be getting the total as "24900":
DEPARTMENT |
SALARY |
30 |
11000 |
30 |
2500 |
30 |
3100 |
30 |
2900 |
30 |
2800 |
30 |
2600 |
TOTAL/CUMULATIVE |
24900 |
Below SQL will get the above result :
SQL> SELECT
department_id,
first_name,
last_name,
salary,
SUM(salary) OVER(
PARTITION BY department_id
) total_cost_of_department
FROM
employees
WHERE
department_id = 30
ORDER BY
department_id;
DEPARTMENT_ID FIRST_NAME LAST_NAME SALARY TOTAL_COST_OF_DEPARTMENT
------------- -------------------- ------------------------- ---------- ------------------------
30 Den Raphaely 11000 24900
30 Alexander Khoo 3100 24900
30 Shelli Baida 2900 24900
30 Sigal Tobias 2800 24900
30 Guy Himuro 2600 24900
30 Karen Colmenares 2500 24900
6 rows selected.
To understand what the boundaries are we have to run the same SQL without the where clause:
SQL> SELECT
department_id,
first_name,
last_name,
salary,
SUM(salary) OVER(
PARTITION BY department_id
) total_cost_of_department
FROM
employees
ORDER BY
department_id;
Output :
DEPARTMENT_ID FIRST_NAME LAST_NAME SALARY TOTAL_COST_OF_DEPARTMENT
------------- -------------------- ------------------------- ---------- ------------------------
10 Jennifer Whalen 4400 4400
20 Michael Hartstein 13000 19000
20 Pat Fay 6000 19000
30 Den Raphaely 11000 24900
30 Karen Colmenares 2500 24900
30 Alexander Khoo 3100 24900
30 Shelli Baida 2900 24900
30 Sigal Tobias 2800 24900
30 Guy Himuro 2600 24900
40 Susan Mavris 6500 6500
50 Donald OConnell 2600 156400
From the above output we can see that the break happening for each department, so each department the functions starts again as the boundary is within each department we have to get the total salaries. The marked bold letters are the starting and ending boundaries for department 30 and then the function starts again as new boundary started for the next department i.e. 40.
So now we understand how the input data are been broken into chunks using partition by a clause in Oracle, I would suggest having to look to another article that was published on Introduction to Analytic Functions as the starting point for how to do data analysis in Oracle database.
Thank you for reading my article, please feel free to leave me some feedback or to suggest any future topics.
I'll be looking forward to hearing from you – Swadhin Ray (Sloba)
For more information about me, please check out my Experts Exchange Profile page.
Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.
Comments (0)