MariaHalt
asked on
Identify a family of records
Let's say I have a bunch of records where a child record may have one or more parent records. I don't have a familyid, I will never have a familyid, but I need to identify the families based on the child-parent relationships.
How can I create families based solely on the relationships? See the attached image, Families.jpg, for better explanation.
CREATE TABLE tempdb.dbo.mh_children_par ents (ChildId INT NOT NULL, ParentId INT NOT NULL)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (1, 1)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (2, 2)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (2, 3)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (3, 3)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (4, 1)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (4, 3)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (5, 4)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (6, 5)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (7, 6)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (8, 4)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (8, 5)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (8, 6)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (9, 7)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (9, 8)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (10, 7)
INSERT INTO tempdb.dbo.mh_children_par ents VALUES (10, 8)
SELECT * FROM tempdb.dbo.mh_children_par ents
How can I create families based solely on the relationships? See the attached image, Families.jpg, for better explanation.
CREATE TABLE tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
INSERT INTO tempdb.dbo.mh_children_par
SELECT * FROM tempdb.dbo.mh_children_par
Looking at the image, what's the logic for determining that P1-3 are related to Family A, P4-6 is related to Family B, and P7-8 is related to Family C?
>>"I don't have a familyid, I will never have a familyid"
and you never will have those records arranged into families.
If you can't provide that family relationship in the data somehow no amount of SQL can make up for that data deficiency.
---------
by the way, it is usually preferred to "Embed" images so readers don't have to flip between screens. There are 2 ways of doing this, but the easiest is the "Inset Image" button (on the right of) the toolbar
and you never will have those records arranged into families.
If you can't provide that family relationship in the data somehow no amount of SQL can make up for that data deficiency.
---------
by the way, it is usually preferred to "Embed" images so readers don't have to flip between screens. There are 2 ways of doing this, but the easiest is the "Inset Image" button (on the right of) the toolbar
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
I am not dismissing it. All I am trying to indicate is that sql cannot entirely compensate for data deficiencies.
Do you have the results of your choice? As I'm only on a phone I can't trial your code.
Do you have the results of your choice? As I'm only on a phone I can't trial your code.
Hi Paul.
The result of the query that I posted looks like this:
The result of the query that I posted looks like this:
FamilyType FamilyId ParentId ChildId
Multi-parent 1 1 1
Multi-parent 1 2 2
Multi-parent 1 3 2
Multi-parent 1 3 3
Multi-parent 1 1 4
Multi-parent 1 3 4
Multi-parent 2 4 5
Multi-parent 2 5 6
Multi-parent 2 6 7
Multi-parent 2 4 8
Multi-parent 2 5 8
Multi-parent 2 6 8
Multi-parent 3 7 9
Multi-parent 3 8 9
Multi-parent 3 7 10
Multi-parent 3 8 10
@bcnagel
I am eating my humble pie. Very nicely done! Where having you been hiding those great skills?
Paul
------
While this is NOT a criticism, the cost of the query (without indexing) involves 13 Table Scans of [mh_children_parents].
Perhaps using bcnagel's solution to set a family_id value would be worth considering.
You may see an execution plan here: http://sqlfiddle.com/#!6/a00a3/6/0
I am eating my humble pie. Very nicely done! Where having you been hiding those great skills?
Paul
------
While this is NOT a criticism, the cost of the query (without indexing) involves 13 Table Scans of [mh_children_parents].
Perhaps using bcnagel's solution to set a family_id value would be worth considering.
You may see an execution plan here: http://sqlfiddle.com/#!6/a00a3/6/0
Thanks for the props, Paul. I'm betting even Jim Horn recognizes the collaborative spirit of your table scan comment. :-) You're right about the inefficiency--I was trying to code mostly for readability.
I think MariaHalt's question is a fun one. I'd like to see other strategies coded up.
I think MariaHalt's question is a fun one. I'd like to see other strategies coded up.
{chuckle} Jim's concerned about possible non-question related side issues appearing, and there isn't much comment history by yourself to work with (hint). He's a good guy.
Maybe others will take up your challenge, I am a tad busy right now so I'll probably pass on that.
I do think that persisting the result should be considered because I presume this grouping by family will be needed more than once.
Maybe others will take up your challenge, I am a tad busy right now so I'll probably pass on that.
I do think that persisting the result should be considered because I presume this grouping by family will be needed more than once.
ASKER
Just ran the cte, you rock! Thank you!
ASKER
@bcnagel,
You rock! I was so excited I clicked the wrong solution. Sorry, I'll get it fixed.
You rock! I was so excited I clicked the wrong solution. Sorry, I'll get it fixed.
Glad to help!
I'd love to get a sense of the final code you use, if that's possible. If you can post an extract once you have things finalized, that would be great. I'm guessing it would be helpful for anyone else who runs into this kind of need, too.
I'd love to get a sense of the final code you use, if that's possible. If you can post an extract once you have things finalized, that would be great. I'm guessing it would be helpful for anyone else who runs into this kind of need, too.
ASKER