

How to fill in a family tree

Posted on 2014-01-08
Medium Priority
Last Modified: 2014-01-31
I work on an Oracle database for a school where we store extensive cross-reference information between constituents. We record parent, child, sibling, grandparent, grandchild, aunt/uncle, niece/nephew, cousin, and many other relationships.

I've been asked to help our data entry personnel fill in the blanks that inevitably form when building some of our constituents' family trees.

Does anyone have an algorithm that can analyze existing, known relationships and discover missing links? I *don't* need to graphically display anything; I only need a simple list as output of the relations and corresponding relationships that are missing.

Our data is stored as very simple records with the IDs of each person and the relationship between them. For readability, I'll use names instead of numbers in an example:

Nancy - parent - John ("Nancy is the parent of John")
John - child - Nancy
John - spouse - Sue
Sue - spouse - John
John - parent - Mary
Mary - child - John
Sue - parent - Joe
Joe - child - Sue
Bill - sibling - Nancy
Nancy - sibling - Bill

If the above is all that was entered into the database, I am looking for a programmatic way to list all the implied relationships:

Sue - parent - Mary
Mary - child - Sue
John - parent - Joe
Joe - child - John
Nancy - grandparent - Joe
Joe - grandchild - Nancy
Nancy - grandparent - Mary
Mary - grandchild - Nancy
Joe - sibling - Mary
Mary - sibling - Joe
Bill - aunt/uncle - John
John - niece/nephew - Bill
Nancy - parent-in-law - Sue
Sue - child-in-law - Nancy

Any code or guidance would be a big help!
Question by:prinprog
LVL 19

Expert Comment

by:Ken Butters
ID: 39765770
Just looking at your first line item as an example, I think you are making an incorrect assumption (based on a lot of genealogy I've worked on).

You had as implied: Sue - parent - Mary.

Existing:

John - spouse - Sue
John - parent - Mary

If John had Mary by a previous marriage, then Sue would not necessarily be a parent of Mary. Is that a possibility you would want to ignore? Or do you consider a stepmother to be the same as a mother?

Author Comment

ID: 39765817
Thanks for the observation. I'm choosing to ignore that aspect, which is why we probably don't want to automate the creation of these relationships, but rather just report them. They will probably require some research to see exactly what the situation is. But at least to get the data entry staff started, I want to be able to present a best guess. Perhaps version 2 can attempt the complexity of half- and step-siblings.
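A report-only pass along these lines could be sketched in Python as below. This is only an illustration, not anyone's production code: the `INVERSE` and `CHAIN_RULES` tables, the function name `infer_missing`, and the tuple layout `(person, relationship, other)` are all assumptions made for the example. It deliberately adopts the simplification stated above (a parent's spouse is reported as a parent), so every line it prints is a best guess that still needs research.

```python
# Each record reads "a is the <relationship> of b", as in the question.
KNOWN = [
    ("Nancy", "parent", "John"), ("John", "child", "Nancy"),
    ("John", "spouse", "Sue"), ("Sue", "spouse", "John"),
    ("John", "parent", "Mary"), ("Mary", "child", "John"),
    ("Sue", "parent", "Joe"), ("Joe", "child", "Sue"),
    ("Bill", "sibling", "Nancy"), ("Nancy", "sibling", "Bill"),
]

# Reciprocal of every relationship the rules below can produce.
INVERSE = {
    "parent": "child", "child": "parent",
    "spouse": "spouse", "sibling": "sibling",
    "grandparent": "grandchild", "grandchild": "grandparent",
    "aunt/uncle": "niece/nephew", "niece/nephew": "aunt/uncle",
    "parent-in-law": "child-in-law", "child-in-law": "parent-in-law",
}

# If A is <first> of B and B is <second> of C, then A is <implied> of C.
CHAIN_RULES = {
    ("spouse", "parent"): "parent",         # assumption: no step-parents
    ("parent", "parent"): "grandparent",
    ("child", "parent"): "sibling",         # assumption: no half-siblings
    ("sibling", "parent"): "aunt/uncle",
    ("parent", "spouse"): "parent-in-law",
}


def infer_missing(known):
    """Return implied records that are absent from `known`, sorted."""
    facts = set(known)
    changed = True
    while changed:  # repeat until a pass adds nothing new
        new = {(b, INVERSE[rel], a) for (a, rel, b) in facts}
        for (a, r1, b) in facts:
            for (b2, r2, c) in facts:
                if b == b2 and a != c and (r1, r2) in CHAIN_RULES:
                    new.add((a, CHAIN_RULES[(r1, r2)], c))
        changed = not new <= facts
        facts |= new
    return sorted(facts - set(known))


for person, rel, other in infer_missing(KNOWN):
    print(f"{person} - {rel} - {other}")
```

The nested loop compares every pair of records, which is fine for a sketch but would need indexing by person (or a self-join in the database) on a real table. Note that the rule table contains no cousin-of-cousin or sibling-of-sibling rule; only chains that hold under the stated simplification are listed.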
LVL 26

Expert Comment

ID: 39765844
"Perhaps version 2 can attempt the complexity of half- and step-siblings."

Don't forget about adoptions in V2.   :)

LVL 19

Accepted Solution

Ken Butters earned 2000 total points
ID: 39765937
What you are asking for is not trivial.

You would need a relationship calculator to determine all the possible relationships.

In order to do that, you could (1) write your own or (2) use existing genealogy software.

Here is an example of a relationship calculator:

Here is a page that has 4 different relationship calculators:

If you use off-the-shelf genealogy software, most packages can build relationship reports.
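For option (1), one common way to write a blood-relationship calculator is to walk up the parent links from both people, find their nearest common ancestor, and name the relationship from the two generation counts. The Python sketch below illustrates that idea only; it is not taken from any of the calculators mentioned above. The names `PARENTS_OF`, `ancestors`, and `relationship` are hypothetical, and so is the placeholder person `unknown_parent_1`, which stands in for the unrecorded shared parent behind the Bill/Nancy sibling record. Spouse and in-law relationships are outside what this approach covers.

```python
from collections import deque

# child -> set of recorded parents (example data from the question, plus a
# hypothetical placeholder parent so the Bill/Nancy sibling link is reachable)
PARENTS_OF = {
    "John": {"Nancy"},
    "Mary": {"John", "Sue"},
    "Joe": {"John", "Sue"},
    "Nancy": {"unknown_parent_1"},
    "Bill": {"unknown_parent_1"},
}

# (generations from a up to the common ancestor,
#  generations from b up to the common ancestor) -> what a is to b
NAMES = {
    (0, 1): "parent", (1, 0): "child",
    (0, 2): "grandparent", (2, 0): "grandchild",
    (1, 1): "sibling",
    (1, 2): "aunt/uncle", (2, 1): "niece/nephew",
    (2, 2): "cousin",
}


def ancestors(person, parents_of):
    """Map person and each of their ancestors to the generations between them."""
    depth = {person: 0}
    queue = deque([person])
    while queue:
        current = queue.popleft()
        for parent in parents_of.get(current, ()):
            if parent not in depth:
                depth[parent] = depth[current] + 1
                queue.append(parent)
    return depth


def relationship(a, b, parents_of):
    """Return what a is to b, or None if unrelated or not in NAMES."""
    if a == b:
        return None
    up_a = ancestors(a, parents_of)
    up_b = ancestors(b, parents_of)
    common = set(up_a) & set(up_b)
    if not common:
        return None
    # Nearest common ancestor = smallest total number of generations.
    nearest = min(((up_a[p], up_b[p]) for p in common),
                  key=lambda d: (d[0] + d[1], d))
    return NAMES.get(nearest)


print(relationship("Bill", "John", PARENTS_OF))
print(relationship("Nancy", "Mary", PARENTS_OF))
```

Running `relationship` over every pair of constituents and comparing the answers with the stored records would yield the list of missing relationships. More distant cases (great-grandparents, removed cousins) would need more entries in `NAMES` or a naming formula.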

Author Comment

ID: 39766144
Thanks pony10us.  :-)  In our system, for our purposes, we would probably code that as a full child and not make the distinction. For that matter, I'm not actually sure how we track half-siblings, though I know we do code step- relationships.

Ken, thank you for the links -- and for confirming that this is not trivial! I've been trying to tell my clients that while it's easy to think through, it's not easy to code. I'll have a look at those links.

I'm still open to more suggestions and observations if anyone thinks of other complexities or ideas.

Thanks for the great feedback so far!
LVL 15

Expert Comment

ID: 39824305
I think that I would look at things differently.  Instead of building a result based on what you have, build a result based on what is missing.
1.  Missing reciprocal relationship (John-Parent-Mary requires Mary-Child-John)
2.  Missing relationship:  (Mary-Parent-Missing)

Then, if you want to go further, you can attempt to fill in Missing from assumed relationships (Mary-Sibling-Parent -> Mary-PotentialParent, Mary-Parent-Spouse -> Mary-PotentialParent), recognizing that this step is speculative.

Beware of false transitivity: my cousin's cousin is not necessarily my cousin.
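The two "what is missing" checks above can be sketched in a few lines of Python. This is a minimal illustration under assumptions: records are `(person, relationship, other)` tuples read as "person is the relationship of other", and the names `INVERSE`, `missing_reciprocals`, and `people_without_parent` are made up for the example.

```python
# Reciprocal of each relationship type used in the example records.
INVERSE = {
    "parent": "child", "child": "parent",
    "spouse": "spouse", "sibling": "sibling",
}


def missing_reciprocals(records):
    """Check 1: records whose mirror-image record was never entered."""
    present = set(records)
    return sorted(
        (b, INVERSE[rel], a)
        for (a, rel, b) in present
        if (b, INVERSE[rel], a) not in present
    )


def people_without_parent(records):
    """Check 2: people for whom no parent is recorded at all."""
    people = {p for (a, _rel, b) in records for p in (a, b)}
    has_parent = {b for (a, rel, b) in records if rel == "parent"}
    has_parent |= {a for (a, rel, b) in records if rel == "child"}
    return sorted(people - has_parent)


records = [
    ("John", "parent", "Mary"),   # Mary - child - John was never entered
    ("Nancy", "parent", "John"),
    ("John", "child", "Nancy"),
]
print(missing_reciprocals(records))
print(people_without_parent(records))
```

Both checks only report gaps; neither one guesses who the missing person is, so they avoid the false-transitivity trap entirely.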


Question has a verified solution.
