• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 509
  • Last Modified:

how to fill in a family tree

I work on an Oracle database for a school where we store extensive cross-reference information between constituents. We record parent, child, sibling, grandparent, grandchild, aunt/uncle, niece/nephew, cousin, and many other relationships.

I've been asked to help our data entry personnel fill in the blanks that inevitably form when building some of our constituents' family trees.

Does anyone have an algorithm that can analyze existing, known relationships and discover missing links? I *don't* need to graphically display anything; I only need a simple list as output of the relations and corresponding relationships that are missing.

Our data is stored as very simple records with the IDs of each person and the relationship between them. For readability, I'll use names instead of numbers in an example:

Nancy - parent - John ("Nancy is the parent of John")
John - child - Nancy
John - spouse - Sue
Sue - spouse - John
John - parent - Mary
Mary  - child - John
Sue - parent - Joe
Joe - child - Sue
Bill - sibling - Nancy
Nancy - sibling - Bill

If the above is all that was entered into the database, I am looking for a programmatic way to list all the implied relationships:

Sue - parent - Mary
Mary - child - Sue
John - parent - Joe
Joe - child - John
Nancy - grandparent - Joe
Joe - grandchild - Nancy
Nancy - grandparent - Mary
Mary - grandchild - Nancy
Joe - sibling - Mary
Mary - sibling - Joe
Bill - aunt/uncle - John
John - niece/nephew - Bill
Nancy - parent-in-law - Sue
Sue - child-in-law - Nancy

Any code or guidance would be a big help!
1 Solution
Ken ButtersCommented:
Just looking at your first line item as an example... I think you are making an incorrect assumption.  -- (based on a lot of genealogy of I've worked on)

you had as implied : Sue - Parent - Mary.

existing :

John - spouse - Sue
John - parent - Mary

If John had Mary by a previous marriage, then Sue would not necessarily be a parent of Mary.  Or is that possibility something you would want to ignore?  or do you consider that  Mother=StepMother?
prinprogAuthor Commented:
Thanks for the observation. I'm choosing to ignore that aspect, and is why we probably don't want to automate the creation of these relationships, but rather just report them. They will probably require some research to see exactly what the situation is. But at least to get them started, I want to be able to present a best guess. Perhaps version 2 can attempt the complexity of half- and step-siblings.
"Perhaps version 2 can attempt the complexity of half- and step-siblings. " 

Don't forget about adoptions in V2.   :)
Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

Ken ButtersCommented:
What you are asking for is not trivial.

You would need a relationship calculator to determine all the possible relationships.

In order to do that, you could (1) write your own or (2) use existing genealogy software.

Here is an example of a relationship calculator:

Here is a page that has 4 different relationship calculators:

If you use off the shelf genealogy software, most of those can build relationship reports.
prinprogAuthor Commented:
Thanks pony10us.  :-)  In our system, for our purposes, we would probably code that as a full child and not make the distinction. For that matter, I'm not actually sure how we track half-siblings, though I know we do code step- relationships.

Ken, thank you for the links -- and for confirming that this is not trivial! I've been trying to tell my clients that while it's easy to think through, it's not easy to code. I'll have a look at those links.

I'm still open to more suggestions and observations if anyone thinks of other complexities or ideas.

Thanks for the great feedback so far!
I think that I would look at things differently.  Instead of building a result based on what you have, build a result based on what is missing.
1.  Missing reciprocal relationship (John-Parent-Mary requires Mary-Child-John)
2.  Missing relationship:  (Mary-Parent-Missing)

Then, if you want to go further you can attempt to match Missing based on assumed relationships (Mary-Sibling-Parent -> Mary-PotentialParent, Mary-Parent-Spouse -> Mary-PotentialParent)  Recognizing that this step is speculative.

Beware of False transitivity, My cousin's cousin is not necessarily my cousin.
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Free Tool: SSL Checker

Scans your site and returns information about your SSL implementation and certificate. Helpful for debugging and validating your SSL configuration.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now