Community Pick: Many members of our community have endorsed this article.

Connect by prior

by Ivo Stoykov

One of challenges in data digging is hierarchical data representation. Usually manipulating trees in relational database raises lots of questions and confusions. In following lines I’ll try to scatter the fog around this topic. Fortunately Oracle offer  tree extensions (CONNECT BY ... PRIOR).

So let start with canonical example of emp table in Scott schema. (sample schemas in 11g are locked by default.
In this case you must unlock it fist by executing ALTER USER "SCOTT" IDENTIFIED BY "tiger" ACCOUNT UNLOCK.
Be aware that in 11g password is case sensitive).

SQL> select e.empno, e.ename, e.mgr, level from emp e
                        2> start with e.mgr is null
                        3> connect by prior e.empno = e.mgr;
                      EMPNO ENAME        MGR      LEVEL
                      ----- ---------- ----- ----------
                       7839 KING                      1
                       7566 JONES       7839          2
                       7788 SCOTT       7566          3
                       7876 ADAMS       7788          4
                       7902 FORD        7566          3
                       7369 SMITH       7902          4
                       7698 BLAKE       7839          2
                       7499 ALLEN       7698          3
                       7521 WARD        7698          3
                       7654 MARTIN      7698          3
                       7844 TURNER      7698          3
                       7900 JAMES       7698          3
                       7782 CLARK       7839          2
                       7934 MILLER      7782          3
                      14 rows selected

Open in new window

First to note here is a pseudo column named "level" (similar to rownum) available with connect by clause showing
where in the hierarchy is current row. We could use it to show the hierarchy in list more visible.
SQL> SELECT LEVEL, lpad(' ', LEVEL*2) || ename ename
                        2    FROM emp
                        3   START WITH mgr IS NULL
                        4  CONNECT BY PRIOR empno = mgr;
                           LEVEL ENAME
                      ---------- -------------------------
                               1   KING
                               2     JONES
                               3       SCOTT
                               4         ADAMS
                               3       FORD
                               4         SMITH
                               2     BLAKE
                               3       ALLEN
                               3       WARD
                               3       MARTIN
                               3       TURNER
                               3       JAMES
                               2     CLARK
                               3       MILLER
                      14 rows selected

Open in new window

Next is START WITH which specifies the root row(s) of the hierarchy (we defined it as e.mgr is null – i.e. the Big Boss does not have a Boss) and CONNECT BY specifying the relationship between parent and child rows.
CONNECT BY requires a simple or compound condition, which must be qualified with the PRIOR operator referring to the parent row. Syntax of condition is:
... PRIOR child_expression = parent_expression or
                      ... parent_expression = PRIOR child_expression

Open in new window

PRIOR is a unary operator like arithmetic operators. Multiple PRIOR conditions are acceptable.
But if a condition is compound then only one PRIOR operator is required. In general it is used with equal sign (=) although other operators could be used (‘theoretically’ as Oracle documentation states). Be aware though that non equal operator might cause infinite loop. If this is the case Oracle will return an error in runtime.

Since version 9i, Oracle introduced the SYS_CONNECT_BY_PATH function allowing  listing hierarchical elements starting from current point. It accepts two parameters: 1st is the column name and 2nd is delimiter char.
Let's change the previous query, replacing column name "ename" with this function and using slash ("/") we get:
SQL> SELECT level,sys_connect_by_path(ename,'/') path
                        2    FROM emp
                        3   START WITH mgr IS NULL
                        4  CONNECT BY PRIOR empno = mgr;
                           LEVEL PATH
                      ---------- --------------------------------------------------------------------------------
                               1 /KING
                               2 /KING/JONES
                               3 /KING/JONES/SCOTT
                               4 /KING/JONES/SCOTT/ADAMS
                               3 /KING/JONES/FORD
                               4 /KING/JONES/FORD/SMITH
                               2 /KING/BLAKE
                               3 /KING/BLAKE/ALLEN
                               3 /KING/BLAKE/WARD
                               3 /KING/BLAKE/MARTIN
                               3 /KING/BLAKE/TURNER
                               3 /KING/BLAKE/JAMES
                               2 /KING/CLARK
                               3 /KING/CLARK/MILLER
                      14 rows selected

Open in new window

Now we could directly see who is supervised by whom (from right to left – lowest to highest position).

Next Oracle version - 10g - adds new features in hierarchy: CONNECT_BY_ISLEAF and CONNECT_BY_ROOT.

CONNECT_BY_ISLEAF is a pseudo column and determines whether the current row is a leaf.
Values are 1 for a leaf and 0 for a branch (parent).
SQL> SELECT CONNECT_BY_ISLEAF , level,sys_connect_by_path(ename,'/') path
                        2    FROM emp
                        3   START WITH mgr IS NULL
                        4  CONNECT BY PRIOR empno = mgr;
                      CONNECT_BY_ISLEAF      LEVEL PATH
                      ----------------- ---------- ----------------------------------------------
                                      0          1 /KING
                                      0          2 /KING/JONES
                                      0          3 /KING/JONES/SCOTT
                                      1          4 /KING/JONES/SCOTT/ADAMS
                                      0          3 /KING/JONES/FORD
                                      1          4 /KING/JONES/FORD/SMITH
                                      0          2 /KING/BLAKE
                                      1          3 /KING/BLAKE/ALLEN
                                      1          3 /KING/BLAKE/WARD
                                      1          3 /KING/BLAKE/MARTIN
                                      1          3 /KING/BLAKE/TURNER
                                      1          3 /KING/BLAKE/JAMES
                                      0          2 /KING/CLARK
                                      1          3 /KING/CLARK/MILLER
                      14 rows selected

Open in new window

CONNECT_BY_ROOT is another unary operator that returns the name of the root node.
It precedes the column name which root we'd like to select.
SQL> SELECT CONNECT_BY_ROOT ename as root_name, level,sys_connect_by_path(ename,'/') path
                        2    FROM emp
                        3   START WITH mgr IS NULL
                        4  CONNECT BY PRIOR empno = mgr;
                      ROOT_NAME       LEVEL PATH
                      ---------- ---------- -----------------------------------------------------KING                1 /KING
                      KING                2 /KING/JONES
                      KING                3 /KING/JONES/SCOTT
                      KING                4 /KING/JONES/SCOTT/ADAMS
                      KING                3 /KING/JONES/FORD
                      KING                4 /KING/JONES/FORD/SMITH
                      KING                2 /KING/BLAKE
                      KING                3 /KING/BLAKE/ALLEN
                      KING                3 /KING/BLAKE/WARD
                      KING                3 /KING/BLAKE/MARTIN
                      KING                3 /KING/BLAKE/TURNER
                      KING                3 /KING/BLAKE/JAMES
                      KING                2 /KING/CLARK
                      KING                3 /KING/CLARK/MILLER
                      14 rows selected

Open in new window

CONNECT_BY_ROOT is valid only in hierarchical queries. When placed in a query a root node value is shown in this column.
Be aware that you cannot specify this operator in the START WITH condition or the CONNECT BY condition.

As said above it is possible to cause circular loops when condition is not correct. Oracle 10 specified "NOCYCLE"
allowing querying anyway. In conjunction there is another pseudo column, - CONNECT_BY_ISCYCLE - which will evaluate to "1" if the current row references a parent and would create a loop in the tree.

create table test
                          parent  number,
                          child   number
                      insert into test values(null,1);
                      insert into test values(1,2);
                      insert into test values(2,3);
                      insert into test values(3,1);
                      SQL> select connect_by_iscycle,sys_connect_by_path(child,'/') path
                        2    from test
                        3   start with parent is null
                        4   connect by nocycle prior child = parent;
                      CONNECT_BY_ISCYCLE PATH
                      ------------------ --------------------------------------------------------
                                       0 /1
                                       0 /1/2
                                       1 /1/2/3

Open in new window

"Connect by" is not a SQL-92 ANSI standard. Other SQL versions use different constructs for the same function.

Read more

Comments (3)

Kevin CrossChief Technology Officer
Most Valuable Expert 2011

Ivo, nice Article!
You have my Yes vote above by the way.

Mark WillsTopic Advisor
Distinguished Expert 2018

Yep, agree...

Also gets my Yes vote :)



Appreciate if some one provides detailed examples like above on all possible ANALYTIC FUNCTIONS.


Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.