Community Pick: Many members of our community have endorsed this article.

Subquery Factoring WITH Oracle

Published:
Updated:
WITH Clause OR Subquery Factoring WITH Oracle

by Ivo Stoykov


Digging information from a database sometimes requires creating complicated queries. Often a simple select * from . . . is nested into another one and both into another one and so forth until the result becomes a very complicated query containing scalar queries (subqueries in the select clause) and in-line views (subqueries in the from clause). Such queries are difficult to read, hard to maintain, and a nightmare to optimize.

Fortunately there is a WITH clause defined in SQL-99 and implemented in Oracle9 release 2 and later. In Oracle database the WITH clause is used for materializing subqueries to avoid recomputing them multiple times without using temporary tables. The WITH clause allows factor out a sub-query, name it, and then reference it by name multiple times within the original complex query.

Additionally this technique lets the optimizer choose how to deal with the sub-query results -- whether to create a temporary table or inline it as a view.


The Syntax

WITH
                       alias_name         -- Alias to use in the main query
                      AS
                       (insert a query here)
                      SELECT...            -- Beginning of the query main body
                      Multiple aliases can be defined in the WITH clause:
                      WITH
                       alias_name1
                      AS
                       (query1)
                       aleas_name2
                      AS
                       (query2)
                      ...
                      SELECT...

Open in new window

Let's look inside by taking a simple query:

SQL> col dummy format a10;
                      SQL> select dummy from dual;
                       
                      DUMMY
                      ----------
                      X
                      SQL>

Open in new window

Same re-written as WITH clause will be:

SQL> with d as (select * from dual)
                        2  select * from d;
                       
                      DUMMY
                      ----------
                      X
                      SQL>

Open in new window

Even if it seems nonsense at first glance, using WITH clause in complicated queries could be very worthy.

Let’s consider usage of WITH clause as creating a temporary table called “d” and then selecting from it. (Oracle actually does not create table but merge SQL before execution.)  In other words WITH clause allows us to name a predefined Select statement in the context of a bigger Select and referenced in later by the given name.

Consider next few samples (Please scroll when necessary):

SQL> with temp_t1 as ( select dummy c1 from dual 
                        2  select * from temp_t1 a,temp_t1 b
                        3  /
                       
                      C1 C1
                      -- --
                      X  X
                      
                      SQL> with temp_t1 as (select dummy c1 from dual)
                        2    ,temp_t2 as (select dummy c1 from dual)
                        3  select * from temp_t1 a,temp_t2 b
                        4  /
                       
                      C1 C1
                      -- --
                      X  X
                      
                      SQL> with temp_t1 as ( select dummy c1 from dual )
                        2      ,temp_t2 as ( select dummy c1 from dual a,temp_t1 b where b.c1 = a.dummy )
                        3  select * from temp_t2 a
                        4  /
                       
                      C1
                      --
                      X
                      
                      SQL> select * from temp_t2 a;
                       
                      select * from temp_t2 a
                       
                      ORA-00942: table or view does not exist

Open in new window

So nevertheless we can select the recordset as a table it exists actually only during the execution.


What Can We Do WITH

Samples above shows that we can do following WITH:
1.      Reference a named query is allowed any number of times.
2.      Any number of named queries are allowed.
3.      Named queries can reference other named queries that came before them and even correlate to previous named queries.
4.      Named queries has a local scope to the SELECT in which they are defined.


Benefits

The obvious is the ability to create reusable construction inside a SELECT, name it and reuse it whenever necessary.


Referencing the Same Sub-query Multiple Times

The following query joins two tables and computes the aggregate SUM(SAL) more than once. The bold text represents the parts of the query that are repeated.

SELECT dname, SUM(sal) AS dept_total FROM emp, dept
                       WHERE emp.deptno = dept.deptno
                       GROUP BY dname HAVING
                             SUM(sal) > ( SELECT SUM(sal) * 1/3 FROM emp, dept
                                           WHERE emp.deptno = dept.deptno)
                       ORDER BY SUM(sal) DESC;

Open in new window


You can improve the query by doing the subquery once, and referencing it at the appropriate points in the main query. The bold text represents the common parts of the subquery, and the places where the subquery is referenced.

WITH summary AS
                      (
                       SELECT dname, SUM(sal) AS dept_total FROM emp, dept
                       WHERE emp.deptno = dept.deptno
                       GROUP BY dname
                      )
                      SELECT dname, dept_total FROM summary 
                       WHERE dept_total > (SELECT SUM(dept_total) * 1/3 FROM summary)
                       ORDER BY dept_total DESC;

Open in new window



What Stats Say

When a question comes to optimization using WITH clause, the answer is – depends. This is because many queries will be treated by Oracle as an in-line view. Sometimes recordset will be “materialized” into “on-the-fly” temporary table for use in later parts of the only this SQL statement. (This means that following execution of the same cursor will result in a new temp table each time.)

Let’s see an example. First we’ll create a test table with more than 1 million rows:

CREATE TABLE TestTbl
                      NOLOGGING
                      AS 
                      SELECT A1.OWNER       AS usr
                            ,A1.OBJECT_TYPE AS PROD
                            ,A1.OBJECT_ID   AS ITEMS
                        FROM ALL_OBJECTS A1, EMP A2, DEPT A3;
                      
                      SQL> select count(*) as recs from testtbl;  
                       
                            RECS
                      ----------
                         3140928
                      
                      SQL> exec DBMS_STATS.GATHER_TABLE_STATS(USER,'TESTTBL');
                       
                      PL/SQL procedure successfully completed
                       
                      SQL>

Open in new window


Next we’ll use a standard approach to show up some reports.

set serveroutput on;
                      set autotrace on;
                      set timing on;
                      
                      SQL> SELECT prod
                        2        ,total_items
                        3    FROM  (
                        4           SELECT prod
                        5           ,      NVL(SUM(items),0) AS total_items
                        6           FROM   testtbl
                        7           GROUP  BY
                        8                  prod
                        9          ) ilv
                       10    WHERE total_items > (SELECT SUM(items)/3 AS one_third_items
                       11*                        FROM   testtbl)
                      SQL> /
                      
                      PROD                TOTAL_ITEMS            
                      ------------------- ---------------------- 
                      SYNONYM             50068277560            
                      JAVA CLASS          43610562520            
                      
                      
                      Elapsed: 00:00:03.07
                      
                      Execution Plan
                      ----------------------------------------------------------
                      Plan hash value: 2449639557
                      
                      -------------------------------------------------------------------------------
                      | Id  | Operation           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
                      -------------------------------------------------------------------------------
                      |   0 | SELECT STATEMENT    |         |     2 |    28 |  3154   (8)| 00:00:38 |
                      |*  1 |  FILTER             |         |       |       |            |          |
                      |   2 |   HASH GROUP BY     |         |     2 |    28 |  3154   (8)| 00:00:38 |
                      |   3 |    TABLE ACCESS FULL| TESTTBL |  3140K|    41M|  2967   (2)| 00:00:36 |
                      |   4 |   SORT AGGREGATE    |         |     1 |     5 |            |          |
                      |   5 |    TABLE ACCESS FULL| TESTTBL |  3140K|    14M|  2967   (2)| 00:00:36 |
                      -------------------------------------------------------------------------------
                      
                      Predicate Information (identified by operation id):
                      ---------------------------------------------------
                      
                         1 - filter(NVL(SUM("ITEMS"),0)> (SELECT SUM("ITEMS")/3 FROM
                                    "TESTTBL" "TESTTBL"))
                      
                      
                      Statistics
                      ----------------------------------------------------------
                                1  recursive calls
                                0  db block gets
                            21319  consistent gets
                            21213  physical reads
                                0  redo size
                              536  bytes sent via SQL*Net to client
                              381  bytes received via SQL*Net from client
                                2  SQL*Net roundtrips to/from client
                                0  sorts (memory)
                                0  sorts (disk)
                                2  rows processed
                      
                      SQL>

Open in new window


As visible, resources are moderate and execution time is about 3 sec. The problem is that to get the result, table is accessed twice. To avoid this we could refactor the query using WITH clause. As was said above there will be a single access due to materialization of the recordset.

SQL> WITH ITEMS_REP AS (
                        2       SELECT PROD, NVL(SUM(ITEMS),0) AS TOTAL_ITMES
                        3         FROM TESTTBL
                        4        GROUP BY PROD)
                        5    SELECT PROD, TOTAL_ITMES
                        6      FROM ITEMS_REP
                        7     WHERE TOTAL_ITMES > (SELECT SUM(TOTAL_ITMES)/3 AS ONE_THIRD_ITEMS
                        8                            FROM   ITEMS_REP)
                        9/
                      SQL>
                      
                      PROD                TOTAL_ITMES
                      ------------------- -----------
                      SYNONYM             50068277560            
                      JAVA CLASS          43610562520            
                      
                      Elapsed: 00:00:02.15
                      
                      Execution Plan
                      ----------------------------------------------------------
                      Plan hash value: 2967744285
                      
                      -------------------------------------------------------------------------------------------------------
                      | Id  | Operation                  | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
                      -------------------------------------------------------------------------------------------------------
                      |   0 | SELECT STATEMENT           |                          |    22 |   528 |  3158   (8)| 00:00:38 |
                      |   1 |  TEMP TABLE TRANSFORMATION |                          |       |       |            |          |
                      |   2 |   LOAD AS SELECT           |                          |       |       |            |          |
                      |   3 |    HASH GROUP BY           |                          |    22 |   308 |  3154   (8)| 00:00:38 |
                      |   4 |     TABLE ACCESS FULL      | TESTTBL                  |  3140K|    41M|  2967   (2)| 00:00:36 |
                      |*  5 |   VIEW                     |                          |    22 |   528 |     2   (0)| 00:00:01 |
                      |   6 |    TABLE ACCESS FULL       | SYS_TEMP_0FD9D6633_E00F1 |    22 |   308 |     2   (0)| 00:00:01 |
                      |   7 |    SORT AGGREGATE          |                          |     1 |    13 |            |          |
                      |   8 |     VIEW                   |                          |    22 |   286 |     2   (0)| 00:00:01 |
                      |   9 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D6633_E00F1 |    22 |   308 |     2   (0)| 00:00:01 |
                      -------------------------------------------------------------------------------------------------------
                      
                      
                      Predicate Information (identified by operation id):
                      ---------------------------------------------------
                      
                         5 - filter("TOTAL_ITMES"> (SELECT SUM("TOTAL_ITMES")/3 FROM  (SELECT /*+ CACHE_TEMP_TABLE
                                    ("T1") */ "C0" "PROD","C1" "TOTAL_ITMES" FROM "SYS"."SYS_TEMP_0FD9D6633_E00F1" "T1")
                                    "ITEMS_REP"))
                      
                      
                      Statistics
                      ----------------------------------------------------------
                                2  recursive calls
                                9  db block gets
                            10672  consistent gets
                            10613  physical reads
                              644  redo size
                              536  bytes sent via SQL*Net to client
                              381  bytes received via SQL*Net from client
                                2  SQL*Net roundtrips to/from client
                                0  sorts (memory)
                                0  sorts (disk)
                                2  rows processed

Open in new window


Now the query goes faster. Time is reduced by 1 sec. We have hard-parse in addition to the cost of setting up a temp table. From statistics we can see that the physical reads are significantly reduced.

SQL> EXPLAIN PLAN SET STATEMENT_ID = 'SQ_FCT' for
                        2    WITH ITEMS_REP AS (
                        3      SELECT PROD, NVL(SUM(ITEMS),0) AS TOTAL_ITMES
                        4        FROM TESTTBL
                        5       GROUP BY PROD)
                        6    SELECT PROD, TOTAL_ITMES
                        7      FROM ITEMS_REP
                        8     WHERE TOTAL_ITMES > (SELECT SUM(TOTAL_ITMES)/3 AS ONE_THIRD_ITEMS
                        9                            FROM   ITEMS_REP);
                      SQL>
                      SQL> @?RDBMS\ADMIN\utlxplp.sql
                      
                      PLAN_TABLE_OUTPUT                                                                                                                                                                                                                                                                                            
                      ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 
                      Plan hash value: 395517267                                                                                                                                                                                                                                                                                   
                                                                                                                                                                                                                                                                                                                                   
                      -------------------------------------------------------------------------------------------------------                                                                                                                                                                                                      
                      | Id  | Operation                  | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |                                                                                                                                                                                                      
                      -------------------------------------------------------------------------------------------------------                                                                                                                                                                                                      
                      |   0 | SELECT STATEMENT           |                          |    22 |   528 |  3158   (8)| 00:00:38 |                                                                                                                                                                                                      
                      |   1 |  TEMP TABLE TRANSFORMATION |                          |       |       |            |          |                                                                                                                                                                                                      
                      |   2 |   LOAD AS SELECT           |                          |       |       |            |          |                                                                                                                                                                                                      
                      |   3 |    HASH GROUP BY           |                          |    22 |   308 |  3154   (8)| 00:00:38 |                                                                                                                                                                                                      
                      |   4 |     TABLE ACCESS FULL      | TESTTBL                  |  3140K|    41M|  2967   (2)| 00:00:36 |                                                                                                                                                                                                      
                      |*  5 |   VIEW                     |                          |    22 |   528 |     2   (0)| 00:00:01 |                                                                                                                                                                                                      
                      |   6 |    TABLE ACCESS FULL       | SYS_TEMP_0FD9D6637_E00F1 |    22 |   308 |     2   (0)| 00:00:01 |                                                                                                                                                                                                      
                      |   7 |    SORT AGGREGATE          |                          |     1 |    13 |            |          |                                                                                                                                                                                                      
                      |   8 |     VIEW                   |                          |    22 |   286 |     2   (0)| 00:00:01 |                                                                                                                                                                                                      
                      |   9 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D6637_E00F1 |    22 |   308 |     2   (0)| 00:00:01 |                                                                                                                                                                                                      
                      -------------------------------------------------------------------------------------------------------

Open in new window


The execution plan above shows the new TEMP TABLE TRANSFORMATION that Oracle has introduced (Step 2). There is a global temporary table Oracle created and loaded from the first scan of TESTTBL. This temporary dataset is then used to answer the overall question of which product items are more than one-third (Step 7).

Let’s repeat that Oracle will not always materialise subqueries in this way. Sometimes it will either merge the subquery into the main query or treat it as a simple in-line view.  In cases when CBO chooses not to materialise subquery, but we do want to, Oracle supports the MATERIALIZE hint.

Without hint we see:

SQL> set autotrace traceonly explain
                      SQL>
                      SQL> WITH INLINE_VIEW AS (
                        2       SELECT PROD
                        3       ,      SUM(ITEMS) AS TOTAL_ITEMS
                        4       FROM   testtbl
                        5    GROUP BY PROD
                        6       )
                        7    SELECT *
                        8    FROM   inline_view;
                      
                      Execution Plan
                      ----------------------------------------------------------
                      Plan hash value: 1165785812
                      
                      ------------------------------------------------------------------------------
                      | Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
                      ------------------------------------------------------------------------------
                      |   0 | SELECT STATEMENT   |         |    22 |   308 |  3154   (8)| 00:00:38 |
                      |   1 |  HASH GROUP BY     |         |    22 |   308 |  3154   (8)| 00:00:38 |
                      |   2 |   TABLE ACCESS FULL| TESTTBL |  3140K|    41M|  2967   (2)| 00:00:36 |
                      ------------------------------------------------------------------------------
                      
                      SQL>

Open in new window


Plan shows us that Oracle has chosen not to materialise this subquery because it is used once and temporary table would be a waste of resources.
Adding the MATERIALIZE hint as follows shows following:

SQL> WITH INLINE_VIEW AS (
                        2         SELECT /*+ materialize */
                        3                PROD, SUM(ITEMS) AS TOTAL_ITEMS
                        4         FROM   testtbl
                        5      GROUP BY PROD
                        6         )
                        7      SELECT *
                        8      FROM   inline_view;
                      
                      Execution Plan
                      ----------------------------------------------------------
                      Plan hash value: 844976587
                      
                      -------------------------------------------------------------------------------------------------------
                      | Id  | Operation                  | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
                      -------------------------------------------------------------------------------------------------------
                      |   0 | SELECT STATEMENT           |                          |    22 |   528 |  3156   (8)| 00:00:38 |
                      |   1 |  TEMP TABLE TRANSFORMATION |                          |       |       |            |          |
                      |   2 |   LOAD AS SELECT           |                          |       |       |            |          |
                      |   3 |    HASH GROUP BY           |                          |    22 |   308 |  3154   (8)| 00:00:38 |
                      |   4 |     TABLE ACCESS FULL      | TESTTBL                  |  3140K|    41M|  2967   (2)| 00:00:36 | 
                      |   5 |   VIEW                     |                          |    22 |   528 |     2   (0)| 00:00:01 |
                      |   6 |    TABLE ACCESS FULL       | SYS_TEMP_0FD9D6639_E00F1 |    22 |   308 |     2   (0)| 00:00:01 |
                      -------------------------------------------------------------------------------------------------------
                      
                      SQL>

Open in new window


The temporary table is present in the plan.

On large datasets, however, the savings from such optimisation can be quite. On tiny datasets, the time involved in the temporary table setup can take longer than the original query itself so is not a particularly useful mechanism.


Conclusion

As WITH clause is refactoring of a SELECT statement, it could be used anywhere where SELECT is acceptable, including in DML operations as INSERT, UPDATE and DELETE.

INSERT INTO table_name
                      WITH subquery_name AS (SELECT . . .)
                      SELECT ... FROM   subquery_name;
                      
                      UPDATE table_name
                      SET    column_name = ( WITH subquery_name AS (SELECT ...)
                                             SELECT ... FROM   subquery_name );
                      
                      DELETE FROM table_name
                      WHERE  column_name IN ( WITH subquery_name AS (SELECT ...)
                                              SELECT ... FROM   subquery_name );

Open in new window



Restrictions on Subquery Factoring:

•      WITH clause cannot be nested. That is, WITH cannot exists in another WITH. However, the name defined in WITH clause can be used in the subquery of any subsequent subquery.
•      In a query with set operators, the set operator subquery cannot contain the WITH clause, but the FROM subquery can.


Difference between 11g and previous versions

Named query must be used in the SQL statement prior to Oracle 11g. It must appear in another named query (like above) or in the main SELECT statement (like the first example). Otherwise an error is raised.

If we fire following script against version less than 11g

SELECT * FROM v$version;
                      WITH T1 AS (SELECT * FROM DUAL),
                           T2 AS (SELECT * FROM DUAL)
                      select * from t1;

Open in new window


Result will be an error:

BANNER                                                           
                      ---------------------------------------------------------------- 
                      Oracle9i Enterprise Edition Release 9.2.0.7.0 - Production       
                      PL/SQL Release 9.2.0.7.0 - Production                            
                      CORE	9.2.0.7.0	Production                                          
                      TNS for 32-bit Windows: Version 9.2.0.7.0 - Production           
                      NLSRTL Version 9.2.0.7.0 - Production                            
                      
                      
                      Error starting at line 2 in command:
                      WITH T1 AS (SELECT * FROM DUAL),
                           T2 AS (SELECT * FROM DUAL)
                      select * from t1
                      Error at Command Line:4 Column:14
                      Error report:
                      SQL Error: ORA-32035: unreferenced query name defined in WITH clause
                      32035. 00000 -  "unreferenced query name defined in WITH clause"
                      *Cause:    There is at least one WITH clause query name that is not
                                 referenced in any place.
                      *Action:   remove the unreferenced query name and retry

Open in new window


In Oracle 11g this is no more a problem

SELECT * FROM v$version;
                      WITH T1 AS (SELECT * FROM DUAL),
                           T2 AS (SELECT * FROM DUAL)
                      select * from t1;

Open in new window


Result is

BANNER                                                                           
                      --------------------------------------------------------------------------- 
                      Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production           
                      PL/SQL Release 11.1.0.7.0 - Production                                           
                      CORE	11.1.0.7.0	Production                                                         
                      TNS for 32-bit Windows: Version 11.1.0.7.0 - Production                          
                      NLSRTL Version 11.1.0.7.0 - Production                                           
                      
                      DUMMY 
                      ----- 
                      X     

Open in new window



Additional reading

A good discussion about WITH clause at AskTom: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:4423923392083
4
25,028 Views

Comments (1)

Devinder Singh VirdiLead Oracle DBA Team

Commented:
Good Article, I agree everybody is using nested inline view instead of WITH. (Even I do sometime).

Suggestion: Can you include Flushing of Shared_Pool and buffer_cache in order to reflect more acurate result for Consistent and Physical IO also?

Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.