Link to home
Start Free TrialLog in
Avatar of gvamsimba
gvamsimbaFlag for United Kingdom of Great Britain and Northern Ireland

asked on

SQL query for parent child relationship

Table A

ID  Parentkey   childkey
1    101                55
2    55                 102
3    102              104
4    104              106
5    201                    214      

Table B

Parentchildkey
101
55
104
201
214

Hi I have Table A and Table B.
From Table B I need to select only those parentchildkeys from table B where ALL its family members in Table A are also in Table B.

For example, if you take value 101 in table B, it has that value in Parentkey column in Table A and it also has a child 55 and since 55 is also a parent which got another child 102 and since 102 does not exist in Table B, nothing from this FAMILY in Table A should be selected in my query. But if you take value 201 from table b,you
can see that this exists in Table A a parent but its child 214 also exists in Table B. so my query can have 201 and 214 .

can anyone please give me that SQL query ?
Avatar of Bill Prew
Bill Prew

What database are you working with? (Oracle, MySQL, SQL Server, ...)

The SQL syntax for dealing with these sort of parent child tree data relationships tends to vary from DBMS platform to platform.


»bp
Avatar of gvamsimba

ASKER

HI Bill, I am using DB2

Thanks
Hi gvamsimba,

How far do you want the look for matches.  The top level, top 2 levels, until all possible items are resolved?

If you want a fixed number of levels, it can be solved with straight SQL.  If you want all possible levels, recursive SQL will do the trick, but it might be a bit inefficient if you have a huge table.

I'll be glad to help with either SQL,
Kent
it would be great if I can have both kent. on the fixed number lets assume the maximum levels is 5.
Thank you
Another complication to be resolved is duplication.  That is if any item is a valid parent and child, multiple chains will terminate with the same node.

A -> B -> C

When looking at the objects to see if they are valid parent nodes, both A and B will be found.  Even though B -> C is a valid chain, it is also a sub-chain of A -> B -> C.

And what about circular links where A -> B -> C exists, as well as B -> A -> C .  Of course for both to exist there are infinite paths.  SQL will attempts to construct A -> B -> A -> B -> ... -> C.
Ok Kent assuming there are no circular links can you please provide the sql ?

Thanks
ASKER CERTIFIED SOLUTION
Avatar of Kent Olsen
Kent Olsen
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
The SQL syntax for dealing with these sort of parent child tree data relationships tends to vary from DBMS platform to platform.
If you're using ANSI SQL, this statement is just WRONG!
@Alex: I suppose it depends on what version of ANSI SQL, what database you are using, and what version of the database you are using.

I can write code using ANSI SQL 2008 that would NEVER run under SQL Server 2000. So which database and what version are relevant.
In the end, we're talking here of more or less simple ANSI joins, which should work on virtually any DB ;-)
But I'd rather use Kent's solution than have a query with 5 self joins (and what if he wanted to go 10 levels deep, or 20. Your query would be a nightmare.
A statement like
If you're using ANSI SQL, this statement is just WRONG!
may be "technically" correct but, without backing it up with reasoning, helps nobody. Yes, I could use 1992, or even earlier ANSI SQL standards to write this, but why not use some of the possibly even non-ANSI standard code available in my particular flavor of SQL (e.g. T-SQL, PL-SQL, or even DB2's flavor of SQL, whatever it is called).
but why not use some of the possibly even non-ANSI standard code available in my particular flavor of SQL (e.g. T-SQL, PL-SQL, or even DB2's flavor of SQL, whatever it is called).
I partially agree with you, but one has to keep in mind if doing so, that you put yourself in a position, where you cannot migrate database stuff (SQL statements etc.) from one vendor to another ;-)
Hi Kent, what is x in your query below ? which table is that ?

UNION ALL
  SELECT x.ParentKey, x.ChildKey, a.Childkey, Depth+1, case when a.childkey in (SELECT key FROM b) then 1 else 0 end
  FROM x, a
  WHERE x.NewChild = a.parentkey
    AND Depth < 5

Thanks
thank you so much Kent. your query is working like a charm.

Thanks
The query is IBM's version of recursive SQL using a Common Table Expression (CTE).  Other DBMS also support CTEs and their recursive SQL looks very similar, but there are some subtle differences.

The expression begins with "WITH x".  This identifies an in-line (temporary) table called x.  The items in parentheses are the columns to be returned by the query in the CTE and match the columns in the contained query.  The query calls itself (x), much like you would do in C, C++, Java, PASCAL, or any other recursive language.

It's also got a small bug in it.  :(  As I look back at it, I realize that the success flag is being set based on the last child item in the chain, when it should be set to fail if any child is not contained in table B.

Here's the corrected code.  The case statement has an extra condition, but the rest of the query is unchanged.

WITH x (ParentKey, ChildKey, NewChild, Depth, Success)
AS
(
  SELECT ParentKey, ChildKey, ChildKey, 1, 1
  FROM A 
  WHERE Parentkey in (SELECT Key FROM b)
    AND ChildKey in (SELECT Key FROM b)
  UNION ALL
  SELECT x.ParentKey, x.ChildKey, a.Childkey, Depth+1, 
    case when x.success = 0 then 0 
         when a.childkey in (SELECT key FROM b) then 1 
         else 0 
    end
  FROM x, a 
  WHERE x.NewChild = a.parentkey
    AND Depth < 5
)
SELECT * FROM x 
WHERE x.Depth = (SELECT max(depth) FROM x x1 WHERE x.ParentKey = x1.ParentKey)
  AND Success = 1;

Open in new window

Thank you so much Kent. Much Appreciated.