Link to home
Create AccountLog in
Avatar of ChoobsTech
ChoobsTechFlag for Switzerland

asked on

recursive mysql query

Dear experts,

I have a table with entities. They have parent child relations, i.e. so entity "dad" would be entity 1 have parent 0, and child "firstC" would have as parent "dad" and could have a child "first_first_child" and so on...

entities        |    entities_parent_child  |    
----------       |   ---------------------------   |  
identities     |   idparent                       |  
name           |   identity                        |  
---------        |  ----------------------------  |

Selecting first child of a parent would be:
SELECT *.e FROM entities AS e, entities_parent_child AS p WHERE e.identity=p.entities AND idparent = "XYZ_id"

How can I make a query getting all child of first child without using php?
I know it has to be something like:
SELECT *.e FROM entities AS e, entities_parent_child AS p WHERE e.identity=p.entities AND idparent = "XYZ_id"  OR e.identities= ANY (SELECT * FROM entities WHERE identities= "FIRST CHILD ID")


What I would like is following in a single Mysql query:

let's say entity with id 525 has as name BOB wich has childs I wanna know the names of:

function get_childs($parent)
{
$q="SELECT identity FROM entity WHERE identity= ANY (SELECT identity FROM entities_parent_child WHERE idparent='525')";
$query=mysql($q);

 while ($row = mysql_fetch_array($query))
 {
echo "<table><tr><td>-</td></tr>";
  echo "<tr><td>entity name:".$row['name']."</td></tr>";
get_childs($row['identity']);
echo "</td></tr></table>";
 }
}

get_childs('525');

Will get me result:
BOB
-linda
--Nuria
---Brian
---Marcel
----Elvis
-susie
--Henry
-marcel

And it will get me childs of bob (linda, susie and marcel) and child of childs (linda has nuria as child) and childs 3rd genration (nuria is parent of Brian)

It will take all bob's decendance. I would like to have this into a single query to be able to use SQL features...

Doest anyone have an idea on how to achieve it with a query of reasonable duration?
Avatar of dqmq
dqmq
Flag of United States of America image

This is a followup to a question I already worked on.  I am not looking for more points, just to clarify the requirements as I understand them.  

The author has an Entities table and an Entities_Parent_Child table which is used to represent a hierarchy of Entities.  The latter table as a row for each child entity that references its parent.  In otherwords, it has two columns: EntityID (EntityID of the child) and ParentID ( EntityID of the parent)

Given a known Entity, the requirements are to produce a list of that Entity and all it's descendents using SQL.   I could not help because I do know how to do a recursive query in MySQL.



Avatar of Terry Woods
Looks like you may be able to do what you want with a stored procedure, although a google search for
  mysql stored procedure recursive
seemed to indicate it might be a bit buggy! Are you interested in doing that?

Alternatively, I think you could create a single SQL (possibly a messy one - have to think about that some more...) provided that you set a maximum limit for depth into the tree hierarchy. Are you willing to do that?
Avatar of ChoobsTech

ASKER

Yes a recursive stored procedure might be of great help :)

I made some research....

Found some interresting stuff but it's still not exactly what I want on following link:
http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
http://www.sitepoint.com/article/hierarchical-data-database

the 2 methods they advance:

1-The Adjacency List Model
 It's more or less what I am doing now and it is has many flaws, I cannot use it as I have an unlimited hierarchy.
2-preorder tree traversal algorithm
 I have tested it and is too slow on updates as I work with ajax.

My idea so far is to store the "family" names as
1.0 for first parent, 1.1 for second parent
1.0.1 for first child, 1.0.2 for second child, and so on...

Therefore to select whole genealogic tree,  would be SELECT * FROM entity where family="1.0%" but this is storing high amount of data which is not suitable...
I believe your "family" names approach is generally referred to as an "enumerated list" (though you can express the hierarchy in words just as well--like a hard drive directory structure).  

The main disadvantages besides space are that you don't get RI and many times navigating the hierarchy requires substring manipulation and tends to defeat indexing for performance.  For example, while it might be easy to list all descendents in a single SQL statement, listing all anscestors in a single SQL statement is not so easy.

The tree traversal approach is indeed costly for updates.  You can mitigate that somewhat with a modified version that permits gaps in the numbering scheme.    
Thank you for your precision :)

I already figured out some flaws of my "path" (enumerated list) approach, tried to improve it, getting rid of the obvious data, i.e. the data in between which made me come right back in a transversal approach.

At current state I am thinking transveral approach is the only satisfying solution (for what I want do, I do believe other methods are viable depending on situations).

But it's still not satisfying for me, as it is ridicoulous to update almost all your scheme just to add a single element, so if you have a "somewhat modified version" or any other idea it is more than welcome as I spent the last 24hrs on it and feel quite stuck to be honnest :(
Another flaw I forgot to mention about the transversal approach is the case of a child having two parents, which on the simple thought of it is giving me headackes...
SOLUTION
Avatar of Terry Woods
Terry Woods
Flag of New Zealand image

Link to home
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
See answer
Generally when running queries on databases, there is a substantial overhead for each query you run. The algorithm I just gave would reduce that overhead from order-n queries to order-log-n queries, thus reducing that overhead substantially where a large number of results are returned.
I definitly aggree on the fact that your recursive function is much "lighter" then mine, but the point is to have it in an SQL query in order to use the SQL functionalities such as sorting and group by on them an on their belongings...
Once you've got an array of all the identities, you can easily build queries which select filtering on those identities only, such as what I did in the recursive function. That way, you'd be able to sort and group with any query you like.
>Another flaw I forgot to mention about the transversal approach is the case of a child having two parents, which on the simple thought of it is giving me headackes...

In a hierarchy, a child cannot have two parents. In the traversal approach, the parent is the row where max(left) < child and right > child.    There can only be one.  If you need to represent a child with more than one parent, then the traversal approach is not appropriate.
An entirely different approach, the "Nested Set Model":

Managing Hierarchical Data in MySQL
http://dev.mysql.com/tech-resources/articles/hierarchical-data.html

Never took the time to actually implement in real-life data, but I had fun playing with it. It seems is beats the simple parent-child model (called the "Adjacency List Model" in the article) for querying, with some overload for editing though.

Cheers!
(°v°)
This is the method discussed in the post just ahead of you... Nice try thought ;)
SOLUTION
Link to home
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
To TerryAtOpus:

I am already doing it this way, but it's a hot fix but not a long term solution... My database hasn't been live for more than a month, and I'm already getting slow.

I need the speediest solution I can find...
To harfang:

It is about items hierachy and not about path... Final result should avoid recursivity if possible, and therefore I was thinking paths might be usefull but so far all my tests were too heavy in storing the information and too slow on updates as parents still need to be updated.

What I would need would be the "preorder tree traversal algorithm" without its  flaw but didn't have the expected success so far...
To dqmq:

Yes the approach is still valid with several parents, in a relational DB. You just need a binding table to store the lft and rgt property to have an  illimited amount of children (don't tell it to your wife, she might get some bad ideas! ).
Sorry about the humor... Some lack of sleep and some excess of cafeine ;)
>Yes the approach is still valid with several parents, in a relational DB. You just need a binding table to store the lft and rgt property to have an  illimited amount of children

Of course, hierarchies can have unlimited children--that's not the issue.  Multiple parents, on the other hand, is not allowed in a hierarchy.  If you have multiple parents, then none of the hierarchical models will work--not nested set, not adjacency list, and not enumerated list.   If you have multiple parents, then you need a network model, not a hierarchical model.

   



 
ASKER CERTIFIED SOLUTION
Link to home
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
Sounds very satisfying to my needs... I need to implement and test it to get practical impact of such a model on my query speed on updates and delete. Need some time to check it. But so far I think this was probably what I looked for.
Ok I'm still thinking of improving it. But it's the best I have seen so far :D Might bug people again if my db gets slow... But works fine for now...

Thanx a LOT!
Comment to help people searching on this topic:

1-Definive answer was the one from dgmg.

2-The reply of TerryAtOpus can be very usefull as well if you are looking for a speedy recursive php solution, mine needed to be mysql but it's definitly worth a go in other cases!