[Webinar] Streamline your web hosting managementRegister Today

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 535
  • Last Modified:

I need a SQL statement to select a person and their buddies and their buddies buddies

I have a table in Oracle that contains buddy list information. It contains the user's buddies. The columns are "username" and "buddy". Data looks like this (listing Mary and Sue's buddies:

USERNAME   BUDDY
Mary              Bob
Mary              Sue
Mary              Bill
Sue                Bill
Sue                Joe

 I am trying to write a query that starts with a specific username and finds all of their buddies. Then I want to take the result set and find all the buddies for each person in the result set; and possibly do it once more. I would like to do this in a single SQL statement. I thought it might be a CONNECT BY query, but this isn't really a hierarchy, since everyone could be a parent and a child of everyone else.
0
rostara
Asked:
rostara
  • 8
  • 5
  • 3
  • +3
1 Solution
 
Kevin CrossChief Technology OfficerCommented:
You can use a recursive common table expression (cte -- WITH).
http://iggyfernandez.wordpress.com/2010/02/28/i-heart-recursive-subquery-factoring/

Kdo wrote this EE article for DB2, but the explanation of CTE recursion should be portable to Oracle and aid in your understanding of the concept.
http://www.experts-exchange.com/Database/DB2/A_3618-Recursive-SQL-in-DB2-Converting-rows-to-columns.html

See below.  Note you can change ub.username to bl.username if you want the results to come reflect all your buddies as friends of whomever you selected first, but given you are selecting by a specific username you would probably just do a final select of buddy anyway...

You can add in a level column that starts at 0 or 1 for the first select and then in the recursive piece do level+1 so you can control how many levels deep you go.  I am more of a MS SQL person where there is a MAXRECURSION option that I can't remember if is available for Oracle.

Hope that makes sense.
with buddyList as (
   select username, buddy
   from user_buddies
   where username = 'Mary'

   union all 

   select ub.username, ub.buddy
   from buddyList bl
   join user_buddies ub on ub.username = bl.buddy
)
select distinct buddy
from buddyList
;

Open in new window

0
 
Kevin CrossChief Technology OfficerCommented:
And as Kdo indicates in if you think there is a possibility for an infinite loop just be careful.  For example if Mary is the buddy of one of her buddies this could end up in a loop.

At least with top level person it is easy to filter as you can add something like this:
with buddyList as (
   select username, buddy
   from user_buddies
   where username = 'Mary'

   union all 

   select ub.username, ub.buddy
   from buddyList bl
   join user_buddies ub on ub.username = bl.buddy
   and ub.buddy <> bl.username
)
select distinct buddy
from buddyList
;

Open in new window

0
 
rgeersCommented:
You mean something like:

select BUDDY from BUDDYLIST where USERNAME=(select BUDDY from BUDDYLIST where USERNAME =$USER)
0
The new generation of project management tools

With monday.com’s project management tool, you can see what everyone on your team is working in a single glance. Its intuitive dashboards are customizable, so you can create systems that work for you.

 
Devinder Singh VirdiLead Oracle DBA TeamCommented:
What about this:-

select user_name, buddy from user_buddies where user_name ='Sue'
union all
select user_name, buddy from user_buddies where user_name in
  (select buddy from user_buddies where user_name ='Sue' )

0
 
sdstuberCommented:
unfortunately, recursive WITH clauses aren't supported in Oracle until 11gR2 and this is a 9i question

since this IS a hierarchy you can do it with CONNECT BY.


SELECT buddy
  FROM buddies
  START WITH  username = 'Mary'
  CONNECT BY username  = PRIOR buddy;


however, this assumes you won't have any loops in your data
for instance,  if Bob is Mary's buddy,  is Mary  Bob's buddy?

assuming she is, or a loop might be caused later such as Joe having Mary as a buddy.

Normally I'd suggest using CONNECT BY NOCYCLE to address this issue, but NOCYCLE doesn't exist until 10g and this is a 9i question

But, this is also easily solved, simply remove your starting user from the connect by clause, and limiting the level which you wan't to
do anyway since you are only going as deep as buddies' buddies  (level 2)


and that leaves you with this...



SELECT DISTINCT buddy
      FROM buddies
START WITH username = 'Mary'
CONNECT BY buddy != 'Mary' AND username = PRIOR buddy AND LEVEL <= 2

Open in new window

0
 
Kevin CrossChief Technology OfficerCommented:
Very nice, Sean! Thanks for chiming in, I didn't spot the 9i zone there which is important as you pointed out.
0
 
sdstuberCommented:
thanks, I saw this question come across my phone a few hours ago but didn't get a chance to get back to it until now.
I remembered the asker saying connect by couldn't be used and it stuck in my head "why not?"  
So I had to check in and see if the thread was still open or not.
0
 
cyberkiwiCommented:
http:#a33705833 would be my pick for being dead simple and nothing sophisticated or complicated (requiring specific features of this or that Oracle server version or other DBMS).
Just plain Ansi SQL.

Don't know why Mary became Sue, but I would change the Union All to just Union; Union removes duplicate results, e.g Mary->Sue->Mary.

select username, buddy from tbl_user_buddy where username ='Mary'
union
select username, buddy from tbl_user_buddy where username in
  (select buddy from tbl_user_buddy where username ='Mary' )
0
 
sdstuberCommented:
The query in 33705833 requires 3 accesses to the same table,  it's an IO hog.
Indexes may help, but those could be used by the connect by too.

Since this is an Oracle question, trying to accomodate the limitations of other platforms seems counterproductive.
Plus,  the oracle syntax is hardly complicated,  4 lines vs 4 lines.  Seems pretty simple to me

However, I'll never discourage testing.  
Definitely try all suggestions,  remember to check resource consumption as well as clock time so you can get a better idea of scaling.
The connect by version does require more sorts, but those are almost always going to be less expensive than extra io.
0
 
cyberkiwiCommented:
Sean,

Since you are the resident Oracle expert on this question, could you explain why you would comment that 33705833 (and 33709460 for that matter) "accesses" the table 3 times?  I thought that is by definition what CONNECT BY.. PRIOR does.  The PRIOR means that it is working off a *PRIOR* stage of the result set.  Although it is elegantly written, yet it also accesses the table 3 times unless I am mistaken?

Table access 1 : Start by "mary" - index on username (seed)
Table access 2 : Collect Mary's buddies (level 1)
Table access 3 : Collect buddies of Mary's buddies (level 2)
0
 
cyberkiwiCommented:
I was thinking about this while driving (as you do), and I can see now why CONNECT BY would work better.
In summary, it accesses the table twice, right?

Table access 1 : Collect Mary's buddies (level 1)
Table access 2 : Collect buddies of Mary's buddies (level 2) - except Mary
DISTINCT (in case Mary's buddies share the same buddies)

Using 33709460, you get

Table access 1 : Collect Mary's buddies
Table access 2 : Collect Mary's buddies ( again - as subquery for 3 )  <<< extra
Table access 3 : Collect the buddies of the subquery in #2
UNION (removes duplicates)

The 2nd step is required without CONNECT BY, to generate a subquery for the next level.  The query didn't exclude Mary anyway. For completeness only:

select username, buddy from tbl_user_buddy where username ='Mary'
union
select username, buddy from tbl_user_buddy where username in
  (select buddy from tbl_user_buddy where username ='Mary' )
and buddy != 'Mary'
0
 
sdstuberCommented:
In     33705833  I say it acceses the table 3 times because there are 3 distinct select statements

select user_name, buddy from user_buddies where user_name ='Sue'   -----   1
union all
select user_name, buddy from user_buddies                                          ---- 2
where user_name in  
  (select buddy from user_buddies where user_name ='Sue' )                ------- 3

even those #1 and #3 are very similar they are not identical and when I do an explain plan I see 3 accesses

the connect by only accesses the table once.  the "connect by" is an internal sorting operation.  It's certainly not a "free" operation by any means.
That's why I encourage testing all the options.  In my tests, on my database,  the connect by did less work, and corroborated my expectations. It seemed fairly obviously superior.

With the asker's data volumes, memory,hash area and indexes the other version "might" be better.  But I don't think so. It's pretty clear the triple query version
will always do more IO (consistent gets);  and I "think" that will be the greatest performance/scaling inhibitor.  I could be wrong.

I might be the "resident expert" but I've had my fair share of misreads, brain farts, typos and other mistakes; and I expect I will in the future too, so by all means test everything I say.  :)



0
 
Kevin CrossChief Technology OfficerCommented:
Good points, gentlemen.  Another reason, why I had started down the path of recursive CTE (i.e., a more complex approach) is that the Author says that the levels may be increased.  For the union approach, you would have to keep growing the number of subqueries/accesses where as recursion or the connect by can handle this growth gracefully...maybe performance would be same, but efficiency of the programmer and code maintenance I believe to be better.
0
 
Kevin CrossChief Technology OfficerCommented:
"...finds all of their buddies. Then I want to take the result set and find all the buddies for each person in the result set; and possibly do it once more."

I took that as:
(1) find Mary's buddies;
(2) find Mary's buddies' buddies;
(3) find Mary's buddies' buddies' buddies!

phew! That is why recursion was first to come to mind, but as Sean said for 9i connect by has to be used and this is a valid example of hierarchy.  Yes, the above can be written with UNION, but why when later we may add (4).
0
 
rostaraAuthor Commented:
Thanks to everyone for the numerous comments and variety of approaches. I inadvertently listed this as a 9i question, when in fact it is a 10g environment, so the CONNECT BY PRIOR NOCYCLE approach may be the most appropriate option. I will test it this afternoon and post the results.
0
 
sdstuberCommented:
If you might be going beyond 2 levels then you MUST use NOCYCLE on the connect by.

If you don't, then you need to use some other strategies, in 9i code this could be very tricky to go beyond 2 levels
0
 
rostaraAuthor Commented:
The SQL in 33706485 seems to do the trick, but my table structure is a little different than the original set up. There are in fact, 2 tables. The first is an ACCOUNT table which lists an account_id and username for each user. the second is the BUDDY table, which has a username (the user) and an account_id (which is the buddy's account_id and FK). The structure of the table seems a little backward to me, but I'm stuck with it. Here is a sample:

ACCOUNT
ACCOUNT_ID    USERNAME
            1                MARY
            2                 BOB
            3                 SUE
            4                 BILL
            5                  JOE

BUDDY
ACCOUNT_ID     USERNAME
            2                   MARY
            3                   MARY
            4                   MARY
            4                   SUE
            5                   SUE

So, Mary has Bob, Sue, and Bill (2,3,4) as buddies and Sue has Bill and Joe (4,5) as buddies.

I'm having trouble appying the CONNECT BY in the join of these tables.
0
 
sdstuberCommented:
where I have "buddylist" above  put this...

(select b.username, a.username buddy
from account a, buddy b
where a.account_id = b.account_id)
SELECT DISTINCT buddy
FROM (SELECT b.username, a.username buddy
      FROM account a, buddy b
      WHERE a.account_id = b.account_id)
START WITH username = 'MARY'
CONNECT BY buddy != 'MARY' AND username = PRIOR buddy AND LEVEL <= 2

Open in new window

0
 
sdstuberCommented:
oops,  "buddies"  not "buddylist"
0
 
rostaraAuthor Commented:
I have a similar, but more complex scenario this time. I'm trying to find the door a person went through and then all the other people that went through those doors, and then the people that went through the doors that the people above them did. Here is the sample tables:

DOOR    PERSON
3       a
3       b
3       c
3       d

4       a
4       c
4       x
4       y

5       b
5       d

6       x
6       z

7       c

So we start with Person "a" going through door 3.
Who else went through door 3? "b, c, d"
Now we want to know what other doors "b, c, and d" went through. "b and d" went thorough 4 and "c" went through 7.
Continuing the iteration, we want to know what doors "x & y" went through. And so on... I hope this makes sense.

This may be another representation:

a --> b <--> d
   --> c <--> a
             --> x --> z
             --> y
   --> d
0
 
sdstuberCommented:
please click the "ask a related question" link and open a new question for this addendum
0

Featured Post

Never miss a deadline with monday.com

The revolutionary project management tool is here!   Plan visually with a single glance and make sure your projects get done.

  • 8
  • 5
  • 3
  • +3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now