ORACLE SQL - please explain OVER PARTITION BY

Posted on 2006-03-29
Last Modified: 2011-08-18
What does the OVER PARTITION BY do? I don't understand this SELECT statement and the OVER PARTITION BY.


SELECT                          
    Field1, field2,
    100 * field3/
        SUM(field3)
OVER ( PARTITION BY NULL) field4
FROM
Question by:joekeri

Expert Comment

by:paquicuba
ID: 16321847
In the query above, PARTITION BY NULL is redundant: SUM(field3) OVER () gives the same result. Either form returns the grand total of field3 on every row, so field4 ends up as each row's field3 expressed as a percentage of that total. See the following example:

SCOTT@PROD > select empno, ename, sum(sal) over ( partition by null) as total_sal from emp;

     EMPNO ENAME       TOTAL_SAL
---------- ---------- ----------
      7369 SMITH           29025
      7499 ALLEN           29025
      7521 WARD            29025
      7566 JONES           29025
      7654 MARTIN          29025
      7698 BLAKE           29025
      7934 MILLER          29025
      7788 SCOTT           29025
      7839 KING            29025
      7844 TURNER          29025
      7876 ADAMS           29025
      7900 JAMES           29025
      7902 FORD            29025
      7782 CLARK           29025

14 rows selected.

Elapsed: 00:00:00.04
SCOTT@PROD > select empno, ename, sum(sal) over () as total_sal from emp;          

     EMPNO ENAME       TOTAL_SAL
---------- ---------- ----------
      7369 SMITH           29025
      7499 ALLEN           29025
      7521 WARD            29025
      7566 JONES           29025
      7654 MARTIN          29025
      7698 BLAKE           29025
      7782 CLARK           29025
      7788 SCOTT           29025
      7839 KING            29025
      7844 TURNER          29025
      7876 ADAMS           29025
      7900 JAMES           29025
      7902 FORD            29025
      7934 MILLER          29025

14 rows selected.
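
Applied to the query in the question, the same idea gives each row's share of the grand total. A minimal sketch against the SCOTT.EMP demo table used above; the aliases pct_of_total and pct_via_ratio are made up for illustration, and RATIO_TO_REPORT is Oracle's built-in analytic shortcut for the same division:

```sql
-- Each employee's salary as a percentage of the total of all salaries.
select empno,
       ename,
       sal,
       100 * sal / sum(sal) over ()       as pct_of_total,  -- explicit division by the grand total
       100 * ratio_to_report(sal) over () as pct_via_ratio  -- same value via the built-in
  from emp;
```

Because the window is the whole result set, the percentages across all 14 rows add up to 100.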

Expert Comment

by:paquicuba
ID: 16321868
Now, if I want a separate window, and therefore a separate total, for each job, I PARTITION BY job:

SCOTT@PROD > select job, empno, ename, sum(sal) over (partition by job) as total_sal from emp;

JOB            EMPNO ENAME       TOTAL_SAL
--------- ---------- ---------- ----------
ANALYST         7788 SCOTT            6000
ANALYST         7902 FORD             6000
CLERK           7934 MILLER           4150
CLERK           7900 JAMES            4150
CLERK           7369 SMITH            4150
CLERK           7876 ADAMS            4150
MANAGER         7698 BLAKE            8275
MANAGER         7566 JONES            8275
MANAGER         7782 CLARK            8275
PRESIDENT       7839 KING             5000
SALESMAN        7844 TURNER           5600
SALESMAN        7654 MARTIN           5600
SALESMAN        7521 WARD             5600
SALESMAN        7499 ALLEN            5600

14 rows selected.

Elapsed: 00:00:00.03

Author Comment

by:joekeri
ID: 16321906
So, from what you are saying, OVER (PARTITION BY ...) is similar to GROUP BY. Is that correct?

Expert Comment

by:paquicuba
ID: 16322054
Kind of, but no.

In the example below, when you GROUP BY job, the select list is limited to the grouped column and aggregates, and the result collapses to one row per job, in order to obtain the TOTAL SAL for the different jobs:

SCOTT@PROD > select job, sum(sal) as total_sal from emp group by job order by 1;

JOB        TOTAL_SAL
--------- ----------
ANALYST         6000
CLERK           4150
MANAGER         8275
PRESIDENT       5000
SALESMAN        5600


On the other hand, using partition allows me to display all columns and rows and still obtain a TOTAL SAL:

SCOTT@PROD > select job, empno, ename, sum(sal) over (partition by job) as total_sal from emp;

JOB            EMPNO ENAME       TOTAL_SAL
--------- ---------- ---------- ----------
ANALYST         7788 SCOTT            6000
ANALYST         7902 FORD             6000
CLERK           7934 MILLER           4150
CLERK           7900 JAMES            4150
CLERK           7369 SMITH            4150
CLERK           7876 ADAMS            4150
MANAGER         7698 BLAKE            8275
MANAGER         7566 JONES            8275
MANAGER         7782 CLARK            8275
PRESIDENT       7839 KING             5000
SALESMAN        7844 TURNER           5600
SALESMAN        7654 MARTIN           5600
SALESMAN        7521 WARD             5600
SALESMAN        7499 ALLEN            5600


I can add as many columns as I want without affecting the result:

SCOTT@PROD > select job, empno, ename, sum(sal) over (partition by job) as total_sal, deptno from emp;

JOB            EMPNO ENAME       TOTAL_SAL     DEPTNO
--------- ---------- ---------- ---------- ----------
ANALYST         7788 SCOTT            6000         20
ANALYST         7902 FORD             6000         20
CLERK           7934 MILLER           4150         10
CLERK           7900 JAMES            4150         30
CLERK           7369 SMITH            4150         20
CLERK           7876 ADAMS            4150         20
MANAGER         7698 BLAKE            8275         30
MANAGER         7566 JONES            8275         20
MANAGER         7782 CLARK            8275         10
PRESIDENT       7839 KING             5000         10
SALESMAN        7844 TURNER           5600         30
SALESMAN        7654 MARTIN           5600         30
SALESMAN        7521 WARD             5600         30
SALESMAN        7499 ALLEN            5600         30

14 rows selected.

Elapsed: 00:00:00.00
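
For comparison, getting the same per-row output with GROUP BY alone means computing the aggregate in a subquery and joining it back to the detail rows. A sketch, assuming the same SCOTT.EMP table; the aliases e and t are illustrative only:

```sql
-- Per-job totals computed with GROUP BY, then joined back to every employee row.
select e.job, e.empno, e.ename, t.total_sal, e.deptno
  from emp e
  join (select job, sum(sal) as total_sal
          from emp
         group by job) t
    on t.job = e.job;
```

This should return the same 14 rows as the PARTITION BY version, though not necessarily in the same order. One difference to keep in mind: a row with a NULL job would be dropped by the join but kept by the analytic version.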

Accepted Solution

by:RCorfman
ID: 16322090
This is the syntax for Oracle analytic functions.

Basically, the normal query is run and its results are retrieved internally by the database engine; then the analytic functions are applied to that result set and the analytic function columns are computed.
There are several functions: SUM, MIN, MAX, RANK, DENSE_RANK, COUNT, etc.

They sound similar to the aggregate functions, but aggregate functions are applied either to every record in the result (without a GROUP BY clause) or to groups of records. In either case, a normal aggregate function reduces the number of rows returned.

With the analytic functions, the number of rows returned is not reduced.  You can tell it is an analytic function, not a group-by function, by the OVER keyword.  OVER is followed by the analytic clause, which is what appears inside the ( ).  The PARTITION BY portion of that clause works similarly to GROUP BY: it determines which portion of the result set each analytic function is applied to.  For some analytic functions, such as RANK, the clause must also include an ORDER BY.

Here are a couple examples:
SQL> select * from udttest;

IP           DEST             LINENO
------------ ------------ ----------
AAA          BBB                   1
AAA          CCC                   2
AAA          DDD                   3
AAA          EEE                   4
AAA          DDD                  -1
AAA          DDD                  -3
AAA          HHH

7 rows selected.

SQL> -- normal aggregate function
SQL> select DEST,count(*) from udttest group by DEST;

DEST           COUNT(*)
------------ ----------
BBB                   1
CCC                   1
DDD                   3
EEE                   1
HHH                   1

SQL> --- analytic count function - notice all rows are returned still
SQL> select DEST,count(*) over (partition by DEST) from udttest;

DEST         COUNT(*)OVER(PARTITIONBYDEST)
------------ -----------------------------
BBB                                      1
CCC                                      1
DDD                                      3
DDD                                      3
DDD                                      3
EEE                                      1
HHH                                      1

7 rows selected.

SQL> -- analytic sum function
SQL> select ip,dest,lineno,sum(lineno) over (partition by dest) sum_line,
  2                        sum(lineno) over (partition by ip) sum_ip
  3    from udttest;

IP           DEST             LINENO   SUM_LINE     SUM_IP
------------ ------------ ---------- ---------- ----------
AAA          BBB                   1          1          6
AAA          CCC                   2          2          6
AAA          DDD                   3         -1          6
AAA          DDD                  -3         -1          6
AAA          DDD                  -1         -1          6
AAA          EEE                   4          4          6
AAA          HHH                                         6

7 rows selected.

SQL> -- another example is using rank
SQL> select ip, dest,lineno,
  2     rank() over (partition by dest order by lineno) dest_rank_line
  3*   from udttest;

IP           DEST             LINENO DEST_RANK_LINE
------------ ------------ ---------- --------------
AAA          BBB                   1              1
AAA          CCC                   2              1
AAA          DDD                  -3              1
AAA          DDD                  -1              2
AAA          DDD                   3              3
AAA          EEE                   4              1
AAA          HHH                                  1

7 rows selected.

SQL> -- and the same, but we will order by lineno...
SQL> ---   notice the column values don't change, just the order as expected
SQL> select ip, dest,lineno,
  2     rank() over (partition by dest order by lineno) dest_rank_line
  3    from udttest order by lineno;

IP           DEST             LINENO DEST_RANK_LINE
------------ ------------ ---------- --------------
AAA          DDD                  -3              1
AAA          DDD                  -1              2
AAA          BBB                   1              1
AAA          CCC                   2              1
AAA          DDD                   3              3
AAA          EEE                   4              1
AAA          HHH                                  1

7 rows selected.

SQL>
SQL> -- this can be good for 'top N' queries
SQL> -- For instance, to get the top 3 records by lineno, we use a nested query
SQL> select * from (
  2    select ip,dest,lineno,
  3       rank() over (partition by ip order by lineno desc nulls last) rank
  4      from udttest
  5*  ) where rank <= 3;

IP           DEST             LINENO       RANK
------------ ------------ ---------- ----------
AAA          EEE                   4          1
AAA          DDD                   3          2
AAA          CCC                   2          3

SQL>
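
Two follow-on sketches against the same udttest table, offered as illustrations rather than as part of the session above. Adding ORDER BY to an aggregate such as SUM turns the per-partition total into a running total, because the default window then runs from the start of the partition to the current row. And since RANK gives tied rows the same rank, a top-N filter on it can return more than N rows; ROW_NUMBER numbers rows uniquely if exactly N are wanted. The aliases running_sum and rn are made up here.

```sql
-- Running total of lineno within each ip, accumulating in lineno order.
-- With the default window, rows that tie on lineno share the same running value.
select ip, dest, lineno,
       sum(lineno) over (partition by ip order by lineno) running_sum
  from udttest;

-- Top 3 rows per ip with no ties: row_number() is unique within the partition.
select *
  from (select ip, dest, lineno,
               row_number() over (partition by ip
                                  order by lineno desc nulls last) rn
          from udttest)
 where rn <= 3;
```

When rows tie on the ORDER BY columns, ROW_NUMBER breaks the tie arbitrarily, so add further columns to the ORDER BY if the choice needs to be repeatable.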

Author Comment

by:joekeri
ID: 16322137
thanks for the information. it clarified it for me...

Expert Comment

by:RCorfman
ID: 16323721
paquicuba, sorry, we cross-posted to some extent. I had typed my explanation and was running scripts to show the example. I didn't see that you'd already covered some of what I did by the time I actually posted...