How to fill in a family tree

I work on an Oracle database for a school where we store extensive cross-reference information between constituents. We record parent, child, sibling, grandparent, grandchild, aunt/uncle, niece/nephew, cousin, and many other relationships.

I've been asked to help our data entry personnel fill in the blanks that inevitably form when building some of our constituents' family trees.

Does anyone have an algorithm that can analyze existing, known relationships and discover missing links? I *don't* need to graphically display anything; I only need a simple list of the missing relatives and their corresponding relationships.

Our data is stored as very simple records with the IDs of each person and the relationship between them. For readability, I'll use names instead of numbers in an example:

Nancy - parent - John ("Nancy is the parent of John")
John - child - Nancy
John - spouse - Sue
Sue - spouse - John
John - parent - Mary
Mary - child - John
Sue - parent - Joe
Joe - child - Sue
Bill - sibling - Nancy
Nancy - sibling - Bill

If the above is all that was entered into the database, I am looking for a programmatic way to list all the implied relationships:

Sue - parent - Mary
Mary - child - Sue
John - parent - Joe
Joe - child - John
Nancy - grandparent - Joe
Joe - grandchild - Nancy
Nancy - grandparent - Mary
Mary - grandchild - Nancy
Joe - sibling - Mary
Mary - sibling - Joe
Bill - aunt/uncle - John
John - niece/nephew - Bill
Nancy - parent-in-law - Sue
Sue - child-in-law - Nancy

Any code or guidance would be a big help!

Ken Butters Commented:
Just looking at your first line item as an example, I think you are making an incorrect assumption (based on a lot of genealogy I've worked on).

You had as implied: Sue - parent - Mary.

Existing:

John - spouse - Sue
John - parent - Mary

If John had Mary by a previous marriage, then Sue would not necessarily be a parent of Mary. Is that possibility something you would want to ignore? Or do you consider mother = stepmother?
prinprog (Author) Commented:
Thanks for the observation. I'm choosing to ignore that aspect, which is why we probably don't want to automate the creation of these relationships, but rather just report them. They will probably require some research to see exactly what the situation is. But at least to get them started, I want to be able to present a best guess. Perhaps version 2 can attempt the complexity of half- and step-siblings.
Steven Carnahan (Assistant Vice President\Network Manager) Commented:
"Perhaps version 2 can attempt the complexity of half- and step-siblings."

Don't forget about adoptions in V2.   :)

Ken Butters Commented:
What you are asking for is not trivial.

You would need a relationship calculator to determine all the possible relationships.

In order to do that, you could (1) write your own or (2) use existing genealogy software.

Here is an example of a relationship calculator:

Here is a page that has 4 different relationship calculators:

If you use off-the-shelf genealogy software, most packages can build relationship reports.
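
If you go the write-your-own route, one common way to build a relationship calculator is to walk parent links up from both people, find their nearest shared ancestor, and name the relationship from the two generation counts. The Python below is only a rough sketch of that idea, not what any particular genealogy package does. The names `NAMES`, `ancestor_depths`, `relationship`, and `parents_of` are made up for illustration, "Ruth" is a hypothetical shared parent of Nancy and Bill that is not in the original data, and it treats every recorded parent as a blood parent (no half-, step-, or adoptive cases).

```python
from collections import deque

# Key: (generations from a up to the shared ancestor,
#       generations from b up to the shared ancestor)
# Value: what a is to b. Only a few close relationships are listed.
NAMES = {
    (0, 1): "parent", (1, 0): "child",
    (0, 2): "grandparent", (2, 0): "grandchild",
    (1, 1): "sibling",
    (1, 2): "aunt/uncle", (2, 1): "niece/nephew",
    (2, 2): "cousin",
}

def ancestor_depths(person, parents_of):
    """Map person and each of their ancestors to a generation distance."""
    depths = {person: 0}
    queue = deque([person])
    while queue:
        current = queue.popleft()
        for parent in parents_of.get(current, ()):
            if parent not in depths:
                depths[parent] = depths[current] + 1
                queue.append(parent)
    return depths

def relationship(a, b, parents_of):
    """Return what a is to b, or None if nothing in NAMES applies."""
    if a == b:
        return None
    up_a = ancestor_depths(a, parents_of)
    up_b = ancestor_depths(b, parents_of)
    shared = set(up_a) & set(up_b)
    if not shared:
        return None
    # Nearest shared ancestor = smallest total number of generations.
    nearest = min(shared, key=lambda p: (up_a[p] + up_b[p], p))
    return NAMES.get((up_a[nearest], up_b[nearest]))

# child -> recorded parents. "Ruth" is hypothetical (assumption).
parents_of = {
    "John": ["Nancy"],
    "Mary": ["John"],
    "Joe": ["John"],
    "Nancy": ["Ruth"],
    "Bill": ["Ruth"],
}

for a, b in [("Bill", "John"), ("Nancy", "Mary"), ("Mary", "Joe")]:
    print(a, "-", relationship(a, b, parents_of), "-", b)
```

Running every pair of people through `relationship` and comparing the answers with what is already stored would give a list of candidate missing rows. Note that this only works where the parent links themselves are recorded; a sibling entered without any shared parent, as with Bill and Nancy in the question, would need separate handling.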

prinprog (Author) Commented:
Thanks pony10us.  :-)  In our system, for our purposes, we would probably code that as a full child and not make the distinction. For that matter, I'm not actually sure how we track half-siblings, though I know we do code step- relationships.

Ken, thank you for the links -- and for confirming that this is not trivial! I've been trying to tell my clients that while it's easy to think through, it's not easy to code. I'll have a look at those links.

I'm still open to more suggestions and observations if anyone thinks of other complexities or ideas.

Thanks for the great feedback so far!
I think that I would look at things differently.  Instead of building a result based on what you have, build a result based on what is missing.
1.  Missing reciprocal relationship (John-Parent-Mary requires Mary-Child-John)
2.  Missing relationship:  (Mary-Parent-Missing)
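
Check 1 is the easy one to automate. Here is a minimal Python sketch, assuming each record is a (person, relationship, relative) tuple read as "person is the relationship of relative", as in the question's example. The `INVERSE` table and the function name `missing_reciprocals` are illustrative assumptions, not anything from the actual schema, and the table would need a row for every relationship code the database really uses.

```python
# Hypothetical inverse table built from the relationship names in the question.
INVERSE = {
    "parent": "child", "child": "parent",
    "spouse": "spouse", "sibling": "sibling",
    "grandparent": "grandchild", "grandchild": "grandparent",
    "aunt/uncle": "niece/nephew", "niece/nephew": "aunt/uncle",
    "parent-in-law": "child-in-law", "child-in-law": "parent-in-law",
}

def missing_reciprocals(records):
    """Return the mirror rows that should exist but were never entered."""
    known = set(records)
    missing = set()
    for person, rel, relative in records:
        mirror = (relative, INVERSE[rel], person)
        if mirror not in known:
            missing.add(mirror)
    return sorted(missing)

records = [
    ("Nancy", "parent", "John"),
    ("John", "child", "Nancy"),
    ("John", "parent", "Mary"),    # Mary - child - John was never entered
    ("Bill", "sibling", "Nancy"),  # Nancy - sibling - Bill was never entered
]

for row in missing_reciprocals(records):
    print(" - ".join(row))
# Mary - child - John
# Nancy - sibling - Bill
```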

Then, if you want to go further, you can attempt to match Missing based on assumed relationships (Mary-Sibling-Parent -> Mary-PotentialParent, Mary-Parent-Spouse -> Mary-PotentialParent), recognizing that this step is speculative.
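
As a rough illustration of that speculative step, the Python sketch below lists potential parents for anyone with fewer than two recorded parents, using just the two rules mentioned: a sibling's parent, and a known parent's spouse. It assumes (person, relationship, relative) tuples read as "person is the relationship of relative", and the function name `potential_parents` is made up for this example. The output is a list of guesses for someone to research, not facts to insert.

```python
from collections import defaultdict

def potential_parents(records):
    """Suggest candidate parents for people with fewer than two on record."""
    parents = defaultdict(set)   # child -> recorded parents
    spouses = defaultdict(set)   # person -> recorded spouses
    siblings = defaultdict(set)  # person -> recorded siblings
    for a, rel, b in records:
        if rel == "parent":
            parents[b].add(a)
        elif rel == "child":
            parents[a].add(b)
        elif rel == "spouse":
            spouses[a].add(b)
            spouses[b].add(a)
        elif rel == "sibling":
            siblings[a].add(b)
            siblings[b].add(a)

    people = {p for a, _, b in records for p in (a, b)}
    suggestions = {}
    for person in sorted(people):
        known = set(parents[person])
        if len(known) >= 2:
            continue
        candidates = set()
        for parent in known:                 # rule: parent's spouse
            candidates |= spouses[parent]
        for sib in siblings[person]:         # rule: sibling's parent
            candidates |= parents[sib]
        candidates -= known
        candidates.discard(person)
        if candidates:
            suggestions[person] = sorted(candidates)
    return suggestions

# The ten records from the question.
records = [
    ("Nancy", "parent", "John"), ("John", "child", "Nancy"),
    ("John", "spouse", "Sue"), ("Sue", "spouse", "John"),
    ("John", "parent", "Mary"), ("Mary", "child", "John"),
    ("Sue", "parent", "Joe"), ("Joe", "child", "Sue"),
    ("Bill", "sibling", "Nancy"), ("Nancy", "sibling", "Bill"),
]

print(potential_parents(records))
# {'Joe': ['John'], 'Mary': ['Sue']}
```

On the question's data this reproduces the two guesses "Sue - parent - Mary" and "John - parent - Joe", which are exactly the ones that would be wrong if either child came from a previous marriage.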

Beware of false transitivity: my cousin's cousin is not necessarily my cousin.