# Need help with a recursive type of query

Hello...

I have a table TEST with 5 numeric columns namely, SRC, F, L, R, U

Each row in this table represents a direct link between SRC and the other columns (F, L, R, U). A value of 0 means no link exists. The values of F, L, R, U may (or may not) be the SRC in subsequent rows. In case F/L/R/U does not appear as a value of SRC, it indicates last node.

Here are some sample values from this table (I have padded them with leading 0 for the sake of alignment):

SRC  F     L     R    U
---   ---   ---   ---   ---
01   04   00   02   20
02   00   00   00   03
03   04   00   00   00
04   05   00   00   00
05   08   00   06   00
08   00   00   09   00
09   00   10   00   00
10   00   00   11   00
11   00   00   00   12
12   00   00   13   00
13   00   00   14   00

Now the way one would read this table is...

SRC 1 is connected to F:4, R:2 and U:20
SRC 2 is connected to U:3
SRC 3 is connected to F:4

...and so on.

Now the query I need is for the following:

Given a value for SRC and any other value that would be present in either F, L, R or U, I need to find the possible paths.

For example, if SRC = 1 and we need to reach to 11 then we would have the following traversal paths:

1 - 4 - 5 - 8 - 9 - 10 - 11
1 - 2 - 3 - 4 - 5 - 8 - 9 - 10 - 11

I don't think it would be possible with a single query and hence we may use a table to hold the results, and this table would look something like:

For path 1 - 4 - 5 - 8 - 9 - 10 - 11

SRC  PATH  DEST
----- -------- -------
1       1        4
1       1        5
1       1        8
1       1        9
1       1        10
1       1        11

For path 1 - 2 - 3 - 4 - 5 - 8 - 9 - 10 - 11

SRC  PATH  DEST
----- -------- -------
1       2        2
1       2        3
1       2        4
1       2        5
1       2        8
1       2        9
1       2        10
1       2        11

Thanks,
Abhijit
LVL 1
###### Who is Participating?

Commented:
This turns out to be a member of an enormously tough class of problems because the size grows exponentially with the allowed path length.

In order to find all the paths between any two nodes, you have to build ALL possible paths until you are sure they do not arrive at the destination.  You also have to find a way to avlid loops.

You examples above seems to imply the assumption that the paths can be limited to nodes in sorted order.  If that is the case, you have caught a break.  That constraint would imply a flat graph with relationships that only need be navigated one direction.  earthman2 is correct that "connect by prior" extension can be useful, however, be careful for it is a highly non-relational language element.  The order of the results is part of the data which means that it cannot behave like a table.  (For reference on this point, you can look up Codd and Date's 12 rules of relational databases.  "Order is irrelevant")

If I had to solve this, I would approach it the following way:
1)  Build a normalized table containing the simplifed relationships, e.g. CREATE TABLE Relate AS (SRC number, DEST number) and populate it with the valid SRC/DEST pairs.  You can do this with a four select statements using a UNION ALL between them.
2) Make sure you have a unique index on the table with both columns (SRC, DEST).
3) Use the connect-by-prior and start-with syntax as well as the SYS_CONNECT_BY_PATH function in a SELECT statement

SELECT 1, SYS_CONNECT_BY_PATH (DEST, '-')
FROM NormalizeTable
AND CONNECT BY PRIOR DEST = SRC
AND (CONNECT_BY_ISLEAF = 0 OR DEST = 11)
AND DEST <= 11

I am not sure how complete the implementation of this is for PostgreSQL beyond the CONNECT BY PRIOR and the START WITH clauses.  Oracle supports the CONNECT_BY_ISLEAF function that returns a value of one if you are at a leaf node.  It also supports a SYS_CONNECT_BY_PATH function that can return the path in the form of "1-4-5-8-9-10-11" which, I suspect, is the only way you can get the specific paths you want back as paths.

If you choose not to (or the patch does not implement it) the hierarchies are returned in top-to-bottom, left-to-right order.  That means you will get the root node, then the direct children as siblings, then the grand children, etc.

Another approach that might prune the problem down in size and run faster is to work the paths from the destination to the source.  To do this, you need to:
- reverse the order of the columns in the index on the normalized table
- reverse the terms of the CONNECT BY PRIOR clause to be "CONNECT BY PRIOR SRC = DEST"
- change the condition ORed with the CONNECT_BY_ISLEAF = 0 to be "OR SRC = 1"

Of course, through all of this, you can drop variables in for constants 1 and 11 for SRC and DEST.

Get the extension and see how much functionality it supports.

Best of luck,
Bill

0

Commented:
Try the hier patch.

This enables an Oracle like connect by prior hierarchical query syntax

http://gppl.moonbone.ru/hier-Pg8.1.2-0.5.5.diff.gz
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.