SQL query for parent child relationship

gvamsimba
gvamsimba used Ask the Experts™
on
Table A

ID  Parentkey   childkey
1    101                55
2    55                 102
3    102              104
4    104              106
5    201                    214      

Table B

Parentchildkey
101
55
104
201
214

Hi I have Table A and Table B.
From Table B I need to select only those parentchildkeys from table B where ALL its family members in Table A are also in Table B.

For example, if you take value 101 in table B, it has that value in Parentkey column in Table A and it also has a child 55 and since 55 is also a parent which got another child 102 and since 102 does not exist in Table B, nothing from this FAMILY in Table A should be selected in my query. But if you take value 201 from table b,you
can see that this exists in Table A a parent but its child 214 also exists in Table B. so my query can have 201 and 214 .

can anyone please give me that SQL query ?
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Bill PrewIT / Software Engineering Consultant
Top Expert 2016

Commented:
What database are you working with? (Oracle, MySQL, SQL Server, ...)

The SQL syntax for dealing with these sort of parent child tree data relationships tends to vary from DBMS platform to platform.


»bp
gvamsimbaIT Consultant

Author

Commented:
HI Bill, I am using DB2

Thanks
Hi gvamsimba,

How far do you want the look for matches.  The top level, top 2 levels, until all possible items are resolved?

If you want a fixed number of levels, it can be solved with straight SQL.  If you want all possible levels, recursive SQL will do the trick, but it might be a bit inefficient if you have a huge table.

I'll be glad to help with either SQL,
Kent
PMI ACP® Project Management

Prepare for the PMI Agile Certified Practitioner (PMI-ACP)® exam, which formally recognizes your knowledge of agile principles and your skill with agile techniques.

gvamsimbaIT Consultant

Author

Commented:
it would be great if I can have both kent. on the fixed number lets assume the maximum levels is 5.
Thank you
Another complication to be resolved is duplication.  That is if any item is a valid parent and child, multiple chains will terminate with the same node.

A -> B -> C

When looking at the objects to see if they are valid parent nodes, both A and B will be found.  Even though B -> C is a valid chain, it is also a sub-chain of A -> B -> C.

And what about circular links where A -> B -> C exists, as well as B -> A -> C .  Of course for both to exist there are infinite paths.  SQL will attempts to construct A -> B -> A -> B -> ... -> C.
gvamsimbaIT Consultant

Author

Commented:
Ok Kent assuming there are no circular links can you please provide the sql ?

Thanks
Here you go....

This recursive SQL will find all of the items when the parent, children, and children in table A exist in table B.

There's a hard stop at 5 levels to keep bad or unexpected data from producing an infinite loop.

WITH x (ParentKey, ChildKey, NewChild, Depth, Success)
AS
(
  SELECT ParentKey, ChildKey, ChildKey, 1, 1
  FROM A 
  WHERE Parentkey in (SELECT Key FROM b)
    AND ChildKey in (SELECT Key FROM b)
  UNION ALL
  SELECT x.ParentKey, x.ChildKey, a.Childkey, Depth+1, case when a.childkey in (SELECT key FROM b) then 1 else 0 end
  FROM x, a 
  WHERE x.NewChild = a.parentkey
    AND Depth < 5
)
SELECT * FROM x 
WHERE x.Depth = (SELECT max(depth) FROM x x1 WHERE x.ParentKey = x1.ParentKey)
  AND Success = 1;

Open in new window

The SQL syntax for dealing with these sort of parent child tree data relationships tends to vary from DBMS platform to platform.
If you're using ANSI SQL, this statement is just WRONG!
Doug BishopDatabase Developer

Commented:
@Alex: I suppose it depends on what version of ANSI SQL, what database you are using, and what version of the database you are using.

I can write code using ANSI SQL 2008 that would NEVER run under SQL Server 2000. So which database and what version are relevant.
In the end, we're talking here of more or less simple ANSI joins, which should work on virtually any DB ;-)
Doug BishopDatabase Developer

Commented:
But I'd rather use Kent's solution than have a query with 5 self joins (and what if he wanted to go 10 levels deep, or 20. Your query would be a nightmare.
A statement like
If you're using ANSI SQL, this statement is just WRONG!
may be "technically" correct but, without backing it up with reasoning, helps nobody. Yes, I could use 1992, or even earlier ANSI SQL standards to write this, but why not use some of the possibly even non-ANSI standard code available in my particular flavor of SQL (e.g. T-SQL, PL-SQL, or even DB2's flavor of SQL, whatever it is called).
but why not use some of the possibly even non-ANSI standard code available in my particular flavor of SQL (e.g. T-SQL, PL-SQL, or even DB2's flavor of SQL, whatever it is called).
I partially agree with you, but one has to keep in mind if doing so, that you put yourself in a position, where you cannot migrate database stuff (SQL statements etc.) from one vendor to another ;-)
gvamsimbaIT Consultant

Author

Commented:
Hi Kent, what is x in your query below ? which table is that ?

UNION ALL
  SELECT x.ParentKey, x.ChildKey, a.Childkey, Depth+1, case when a.childkey in (SELECT key FROM b) then 1 else 0 end
  FROM x, a
  WHERE x.NewChild = a.parentkey
    AND Depth < 5

Thanks
gvamsimbaIT Consultant

Author

Commented:
thank you so much Kent. your query is working like a charm.

Thanks
The query is IBM's version of recursive SQL using a Common Table Expression (CTE).  Other DBMS also support CTEs and their recursive SQL looks very similar, but there are some subtle differences.

The expression begins with "WITH x".  This identifies an in-line (temporary) table called x.  The items in parentheses are the columns to be returned by the query in the CTE and match the columns in the contained query.  The query calls itself (x), much like you would do in C, C++, Java, PASCAL, or any other recursive language.

It's also got a small bug in it.  :(  As I look back at it, I realize that the success flag is being set based on the last child item in the chain, when it should be set to fail if any child is not contained in table B.

Here's the corrected code.  The case statement has an extra condition, but the rest of the query is unchanged.

WITH x (ParentKey, ChildKey, NewChild, Depth, Success)
AS
(
  SELECT ParentKey, ChildKey, ChildKey, 1, 1
  FROM A 
  WHERE Parentkey in (SELECT Key FROM b)
    AND ChildKey in (SELECT Key FROM b)
  UNION ALL
  SELECT x.ParentKey, x.ChildKey, a.Childkey, Depth+1, 
    case when x.success = 0 then 0 
         when a.childkey in (SELECT key FROM b) then 1 
         else 0 
    end
  FROM x, a 
  WHERE x.NewChild = a.parentkey
    AND Depth < 5
)
SELECT * FROM x 
WHERE x.Depth = (SELECT max(depth) FROM x x1 WHERE x.ParentKey = x1.ParentKey)
  AND Success = 1;

Open in new window

gvamsimbaIT Consultant

Author

Commented:
Thank you so much Kent. Much Appreciated.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial