Link to home
Start Free TrialLog in
Avatar of MariaHalt
MariaHaltFlag for United States of America

asked on

Identify a family of records

Let's say I have a bunch of records where a child record may have one or more parent records.  I don't have a familyid, I will never have a familyid, but I need to identify the families based on the child-parent relationships.  

How can I create families based solely on the relationships?  See the attached image, Families.jpg, for better explanation.

CREATE TABLE tempdb.dbo.mh_children_parents (ChildId INT NOT NULL, ParentId INT NOT NULL)

INSERT INTO tempdb.dbo.mh_children_parents VALUES (1, 1)
INSERT INTO tempdb.dbo.mh_children_parents VALUES (2, 2)
INSERT INTO tempdb.dbo.mh_children_parents VALUES (2, 3)
INSERT INTO tempdb.dbo.mh_children_parents VALUES (3, 3)
INSERT INTO tempdb.dbo.mh_children_parents VALUES (4, 1)
INSERT INTO tempdb.dbo.mh_children_parents VALUES (4, 3)
INSERT INTO tempdb.dbo.mh_children_parents VALUES (5, 4)
INSERT INTO tempdb.dbo.mh_children_parents VALUES (6, 5)
INSERT INTO tempdb.dbo.mh_children_parents VALUES (7, 6)
INSERT INTO tempdb.dbo.mh_children_parents VALUES (8, 4)
INSERT INTO tempdb.dbo.mh_children_parents VALUES (8, 5)
INSERT INTO tempdb.dbo.mh_children_parents VALUES (8, 6)
INSERT INTO tempdb.dbo.mh_children_parents VALUES (9, 7)
INSERT INTO tempdb.dbo.mh_children_parents VALUES (9, 8)
INSERT INTO tempdb.dbo.mh_children_parents VALUES (10, 7)
INSERT INTO tempdb.dbo.mh_children_parents VALUES (10, 8)

SELECT * FROM tempdb.dbo.mh_children_parents
Avatar of MariaHalt
MariaHalt
Flag of United States of America image

ASKER

User generated image
Avatar of Jim Horn
Looking at the image, what's the logic for determining that P1-3 are related to Family A, P4-6 is related to Family B, and P7-8 is related to Family C?
>>"I don't have a familyid, I will never have a familyid"

and you never will have those records arranged into families.

If you can't provide that family relationship in the data somehow no amount of SQL can make up for that data deficiency.

---------
by the way, it is usually preferred to "Embed" images so readers don't have to flip between screens. There are 2 ways of doing this, but the easiest is the "Inset Image" button (on the right of) the toolbar
ASKER CERTIFIED SOLUTION
Avatar of bcnagel
bcnagel
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I am not dismissing it. All I am trying to indicate is that sql cannot entirely compensate for data deficiencies.

Do you have the results of your choice? As I'm only on a phone I can't trial your code.
Hi Paul.

The result of the query that I posted looks like this:

FamilyType	FamilyId		ParentId		ChildId
Multi-parent	1			1			1
Multi-parent	1			2			2
Multi-parent	1			3			2
Multi-parent	1			3			3
Multi-parent	1			1			4
Multi-parent	1			3			4
Multi-parent	2			4			5
Multi-parent	2			5			6
Multi-parent	2			6			7
Multi-parent	2			4			8
Multi-parent	2			5			8
Multi-parent	2			6			8
Multi-parent	3			7			9
Multi-parent	3			8			9
Multi-parent	3			7			10
Multi-parent	3			8			10

Open in new window

@bcnagel

I am eating my humble pie. Very nicely done! Where having you been hiding those great skills?

Paul

------
While this is NOT a  criticism, the cost of the query (without indexing) involves 13 Table Scans of [mh_children_parents].
Perhaps using bcnagel's solution to set a family_id value would be worth considering.

You may see an execution plan here: http://sqlfiddle.com/#!6/a00a3/6/0
Thanks for the props, Paul. I'm betting even Jim Horn recognizes the collaborative spirit of your table scan comment. :-) You're right about the inefficiency--I was trying to code mostly for readability.

I think MariaHalt's question is a fun one. I'd like to see other strategies coded up.
{chuckle} Jim's concerned about possible non-question related side issues appearing, and there isn't much comment history by yourself to work with (hint). He's a good guy.

Maybe others will take up your challenge, I am a tad busy right now so I'll probably pass on that.

I do think that persisting the result should be considered because I presume this grouping by family will be needed more than once.
Just ran the cte, you rock!  Thank you!
@bcnagel,

You rock!  I was so excited I clicked the wrong solution.  Sorry, I'll get it fixed.
Glad to help!

I'd love to get a sense of the final code you use, if that's possible. If you can post an extract once you have things finalized, that would be great. I'm guessing it would be helpful for anyone else who runs into this kind of need, too.