• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1297
  • Last Modified:

recursive mysql query

Dear experts,

I have a table with entities. They have parent child relations, i.e. so entity "dad" would be entity 1 have parent 0, and child "firstC" would have as parent "dad" and could have a child "first_first_child" and so on...

entities        |    entities_parent_child  |    
----------       |   ---------------------------   |  
identities     |   idparent                       |  
name           |   identity                        |  
---------        |  ----------------------------  |

Selecting first child of a parent would be:
SELECT *.e FROM entities AS e, entities_parent_child AS p WHERE e.identity=p.entities AND idparent = "XYZ_id"

How can I make a query getting all child of first child without using php?
I know it has to be something like:
SELECT *.e FROM entities AS e, entities_parent_child AS p WHERE e.identity=p.entities AND idparent = "XYZ_id"  OR e.identities= ANY (SELECT * FROM entities WHERE identities= "FIRST CHILD ID")


What I would like is following in a single Mysql query:

let's say entity with id 525 has as name BOB wich has childs I wanna know the names of:

function get_childs($parent)
{
$q="SELECT identity FROM entity WHERE identity= ANY (SELECT identity FROM entities_parent_child WHERE idparent='525')";
$query=mysql($q);

 while ($row = mysql_fetch_array($query))
 {
echo "<table><tr><td>-</td></tr>";
  echo "<tr><td>entity name:".$row['name']."</td></tr>";
get_childs($row['identity']);
echo "</td></tr></table>";
 }
}

get_childs('525');

Will get me result:
BOB
-linda
--Nuria
---Brian
---Marcel
----Elvis
-susie
--Henry
-marcel

And it will get me childs of bob (linda, susie and marcel) and child of childs (linda has nuria as child) and childs 3rd genration (nuria is parent of Brian)

It will take all bob's decendance. I would like to have this into a single query to be able to use SQL features...

Doest anyone have an idea on how to achieve it with a query of reasonable duration?
0
ChoobsTech
Asked:
ChoobsTech
  • 12
  • 5
  • 4
  • +1
3 Solutions
 
dqmqCommented:
This is a followup to a question I already worked on.  I am not looking for more points, just to clarify the requirements as I understand them.  

The author has an Entities table and an Entities_Parent_Child table which is used to represent a hierarchy of Entities.  The latter table as a row for each child entity that references its parent.  In otherwords, it has two columns: EntityID (EntityID of the child) and ParentID ( EntityID of the parent)

Given a known Entity, the requirements are to produce a list of that Entity and all it's descendents using SQL.   I could not help because I do know how to do a recursive query in MySQL.



0
 
Terry WoodsIT GuruCommented:
Looks like you may be able to do what you want with a stored procedure, although a google search for
  mysql stored procedure recursive
seemed to indicate it might be a bit buggy! Are you interested in doing that?

Alternatively, I think you could create a single SQL (possibly a messy one - have to think about that some more...) provided that you set a maximum limit for depth into the tree hierarchy. Are you willing to do that?
0
 
ChoobsTechAuthor Commented:
Yes a recursive stored procedure might be of great help :)

I made some research....

Found some interresting stuff but it's still not exactly what I want on following link:
http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
http://www.sitepoint.com/article/hierarchical-data-database

the 2 methods they advance:

1-The Adjacency List Model
 It's more or less what I am doing now and it is has many flaws, I cannot use it as I have an unlimited hierarchy.
2-preorder tree traversal algorithm
 I have tested it and is too slow on updates as I work with ajax.

My idea so far is to store the "family" names as
1.0 for first parent, 1.1 for second parent
1.0.1 for first child, 1.0.2 for second child, and so on...

Therefore to select whole genealogic tree,  would be SELECT * FROM entity where family="1.0%" but this is storing high amount of data which is not suitable...
0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
dqmqCommented:
I believe your "family" names approach is generally referred to as an "enumerated list" (though you can express the hierarchy in words just as well--like a hard drive directory structure).  

The main disadvantages besides space are that you don't get RI and many times navigating the hierarchy requires substring manipulation and tends to defeat indexing for performance.  For example, while it might be easy to list all descendents in a single SQL statement, listing all anscestors in a single SQL statement is not so easy.

The tree traversal approach is indeed costly for updates.  You can mitigate that somewhat with a modified version that permits gaps in the numbering scheme.    
0
 
ChoobsTechAuthor Commented:
Thank you for your precision :)

I already figured out some flaws of my "path" (enumerated list) approach, tried to improve it, getting rid of the obvious data, i.e. the data in between which made me come right back in a transversal approach.

At current state I am thinking transveral approach is the only satisfying solution (for what I want do, I do believe other methods are viable depending on situations).

But it's still not satisfying for me, as it is ridicoulous to update almost all your scheme just to add a single element, so if you have a "somewhat modified version" or any other idea it is more than welcome as I spent the last 24hrs on it and feel quite stuck to be honnest :(
0
 
ChoobsTechAuthor Commented:
Another flaw I forgot to mention about the transversal approach is the case of a child having two parents, which on the simple thought of it is giving me headackes...
0
 
Terry WoodsIT GuruCommented:
I think you might find a significant improvement if you put your algorithm above into a stored procedure.

Alternatively, you could vastly reduce the number of queries you run, like the following (note I haven't saved the depth into the tree or the names, but you could get that back easily enough):

function get_childs($parents) {
  $q="SELECT identity FROM entity WHERE identity= ANY (SELECT identity FROM  
          entities_parent_child WHERE idparent in ";

  $first = true;
  foreach ($parents as $parent) {
    if ($first) {
      $q = "$q '$parent'";  #no comma
      $first = false;
    } else {
      $q = "$q, '$parent'"; #extra comma
    }
  $q = "$q)"; #add the bracket to the end

  $query=mysql($q);

  $children = array();
  while ($row = mysql_fetch_array($query)) {
     #build array:
     $children[] = $row['identity'];
  }

  if (count($children)>0) {
    return array_merge($children, get_childs($children));
  } else {
    return null;
  }
 
}

echo "<table>";
foreach (get_childs(array("person")) as $person) {
  echo "<tr><td>entity id:".$person."</td></tr>";
}
echo "</table>";
0
 
Terry WoodsIT GuruCommented:
Generally when running queries on databases, there is a substantial overhead for each query you run. The algorithm I just gave would reduce that overhead from order-n queries to order-log-n queries, thus reducing that overhead substantially where a large number of results are returned.
0
 
ChoobsTechAuthor Commented:
I definitly aggree on the fact that your recursive function is much "lighter" then mine, but the point is to have it in an SQL query in order to use the SQL functionalities such as sorting and group by on them an on their belongings...
0
 
Terry WoodsIT GuruCommented:
Once you've got an array of all the identities, you can easily build queries which select filtering on those identities only, such as what I did in the recursive function. That way, you'd be able to sort and group with any query you like.
0
 
dqmqCommented:
>Another flaw I forgot to mention about the transversal approach is the case of a child having two parents, which on the simple thought of it is giving me headackes...

In a hierarchy, a child cannot have two parents. In the traversal approach, the parent is the row where max(left) < child and right > child.    There can only be one.  If you need to represent a child with more than one parent, then the traversal approach is not appropriate.
0
 
harfangCommented:
An entirely different approach, the "Nested Set Model":

Managing Hierarchical Data in MySQL
http://dev.mysql.com/tech-resources/articles/hierarchical-data.html

Never took the time to actually implement in real-life data, but I had fun playing with it. It seems is beats the simple parent-child model (called the "Adjacency List Model" in the article) for querying, with some overload for editing though.

Cheers!
(°v°)
0
 
ChoobsTechAuthor Commented:
This is the method discussed in the post just ahead of you... Nice try thought ;)
0
 
harfangCommented:
Ouch! I didn't recognize "preorder tree traversal algorithm" as being the same thing, sorry about that.

I see that most of this thread is basically about "building the path". I use two methods for that. Of course this does not apply your original table structure allowing for several parents. That would not be a hierarchy anymore, as pointed out already. Surfing a genealogy is a good example, where you build partial trees (e.g. find all uncles of a person). This becomes more an AI inference engine problem, which is cute but tricky using SQL.

Anyway, given: Items ( ID, ParentID, Tree ) a Null ParentID being a root:


Loop of update queries

    UPDATE Items SET Tree = Null;

Then,

    UPDATE Items P INNER JOIN Items C ON P.ID=C.ParentID
    SET C.Tree = P.Tree & '/' & P.ParentID
    WHERE C.Tree Is Null;

Repeat until no records are affected. This is surprisingly fast. Note that cycles will be immediately visible as records having a ParentID but no Tree.


Move upstream

I have no idea how to do that in MySQL, but the idea would be to recursively find the parent until null, building the path from right to left. This is again very fast if you can use the table's index to seek the records. I did that several times using VB and Access, to good results (tested up to 150k records).

The advantage is that the tree does not need to be actually stored for this. This does need a failsafe for cycles (set max iterations or stop upon finding the same ID again).

The disadvantage is that it's not efficient to produce a sort order.


I should really not join in a MySQL thread, but here you go...
(^v°)
0
 
ChoobsTechAuthor Commented:
To TerryAtOpus:

I am already doing it this way, but it's a hot fix but not a long term solution... My database hasn't been live for more than a month, and I'm already getting slow.

I need the speediest solution I can find...
0
 
ChoobsTechAuthor Commented:
To harfang:

It is about items hierachy and not about path... Final result should avoid recursivity if possible, and therefore I was thinking paths might be usefull but so far all my tests were too heavy in storing the information and too slow on updates as parents still need to be updated.

What I would need would be the "preorder tree traversal algorithm" without its  flaw but didn't have the expected success so far...
0
 
ChoobsTechAuthor Commented:
To dqmq:

Yes the approach is still valid with several parents, in a relational DB. You just need a binding table to store the lft and rgt property to have an  illimited amount of children (don't tell it to your wife, she might get some bad ideas! ).
0
 
ChoobsTechAuthor Commented:
Sorry about the humor... Some lack of sleep and some excess of cafeine ;)
0
 
dqmqCommented:
>Yes the approach is still valid with several parents, in a relational DB. You just need a binding table to store the lft and rgt property to have an  illimited amount of children

Of course, hierarchies can have unlimited children--that's not the issue.  Multiple parents, on the other hand, is not allowed in a hierarchy.  If you have multiple parents, then none of the hierarchical models will work--not nested set, not adjacency list, and not enumerated list.   If you have multiple parents, then you need a network model, not a hierarchical model.

   



 
0
 
dqmqCommented:
There is an interesting variation to nested set that you might find useful.  I don't know the name, but it goes like this.  Instead of left and right columns, use sequence and level columns.  The sequence column is exactly the same as the left column in nested set.  The level column is the level in the hierarchy, i.e. root level=0, children of root=1, and so forth.

The query to produce a list of all descendents is,

Select * from yourtable p, yourtable c
  where p.id=whatever
     and c.level >= p.level
     and c.sequence <=
          (select coalesce(min(sequence),c.sequence)
           where sequence > c.sequence
               and level < c.level)

If you want to indent the levels, that's easy enough since you have level number in each row.  If you want to minimize the overhead of updating, then increment the sequence number by 10.  That permits you to insert 10 children of the same parent before you need to do the mass-update thing. Deleting a row is trivial--just delete it.  






 




 
0
 
ChoobsTechAuthor Commented:
Sounds very satisfying to my needs... I need to implement and test it to get practical impact of such a model on my query speed on updates and delete. Need some time to check it. But so far I think this was probably what I looked for.
0
 
ChoobsTechAuthor Commented:
Ok I'm still thinking of improving it. But it's the best I have seen so far :D Might bug people again if my db gets slow... But works fine for now...

Thanx a LOT!
0
 
ChoobsTechAuthor Commented:
Comment to help people searching on this topic:

1-Definive answer was the one from dgmg.

2-The reply of TerryAtOpus can be very usefull as well if you are looking for a speedy recursive php solution, mine needed to be mysql but it's definitly worth a go in other cases!
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

  • 12
  • 5
  • 4
  • +1
Tackle projects and never again get stuck behind a technical roadblock.
Join Now