[Webinar] Streamline your web hosting managementRegister Today

x
Solved

# request graph algorithmic

Posted on 2004-08-06
Medium Priority
323 Views
Do you know Friendster or Orkut, a social networking connected people together, I'm interesting some algorithm in the site:

Can anybody tell me how to calculate quickly to determined the count of people in everyone's personal networking. I think the breadth first search is very slowly.
0
Question by:walkdan
• 3
• 2

LVL 11

Expert Comment

ID: 11739304
By personal network you mean the connected component (of the social network graph) that the person is in, right? That is, the count of people that can be reached through "associate" links.

I am making sure I understand what you are asking for.

-bcl
0

Author Comment

ID: 11741061
The count of people is calucalte 4 degree. for example, My 1 degree of people is direct connected to me, 2 degree of people is connected through one people. how many people in my 4 degree networking.
0

LVL 11

Expert Comment

ID: 11742952
How often is the graph changed? I ask because I think breadth-first is the best solution (especially a depth-limited breadth-first search). If the graph changes regularly, then you can do the search at the moment of query. If the graph is MOSTLY static, that is there are lots and lots of queries you can update the count when you insert a node or add a connection (again, depends on the relative frequency of operations).

Other thoughts:
It is possible to keep a list of distance n neighbors (probably not too space intensive) though in social networks there is a large percentage of the graph at distance 4. Probably easier to just keep the count (perhaps as a count at each distance).

-bcl
0

LVL 17

Accepted Solution

davebytes earned 500 total points
ID: 11743383
Generally, each node stores the number of exit paths.  So at degree 1, you know your count, degree 2, you follow out each path and get their counts and add them, etc.

Breadth vs depth doesn't really make a difference, as in all of these programs (that I've seen), the numbers are approximate -- i.e., I don't know that they eliminate duplicate friends from counts.  And, since they only tell you to degree 4 or so anyway, the search doesn't take that long.

It is also possible that these systems actually store both degree 1 AND degree 2 counts at every node (updates to one node push to all connected), so that traversal/calc is THAT much faster.

-d
0

Author Comment

ID: 11745405
The count can precalculated and store to database, when user request the count, just pick it from database and return immediately.

But the precalculation is also unefficient if do breadth-frist travel every time to get everyone's count.
Can way get the counts only one or several travel times.
0

LVL 11

Assisted Solution

ID: 11746570
(1) Depth-first versus breadth-first traversal will vist the same number of nodes (assuming it is not depth limited, each will visit every node in the connected component where the search starts). If answers are approximate, precalculating the depth 2 numbers means that you can get the "number" you want with a breadth first sum of depth 2 friends of your friends' friends.

(2) Pre-calculation is not that expensive IF updates are relatively rare. Yes, pushing the change through 4 layers is expensive but if you think about it you really only update when a new connection is added (or one is removed) and that means that we are really already 1 ply down in the graph (we know the new friend). If we're not worried about overcounting, we can just move down to all their friends, their friends' friends, and their friends' friends' friends, counting them AND incrementing their counts (we added a new friend to the group). If we're worried about overcounting, then we need some list of all of the members of the group and that would require calculating _or_storing it. That is a full depth 4 traversal OR a fair amount of storage. We can use a depth 2 set of friends  to quickly build the depth 4 set (just union all the depth 2 neighbor's depth 2 sets, counting on union to handle duplicates).

-bcl
0

## Featured Post

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Recently, in one of the tech-blogs I usually read, I saw a post about the best-selling video games through history. The first place in the list is for the classic, extremely addictive Tetris. Well, a long time ago, in a galaxy far far away, I was…
Performance in games development is paramount: every microsecond counts to be able to do everything in less than 33ms (aiming at 16ms). C# foreach statement is one of the worst performance killers, and here I explain why.
Kernel Data Recovery is a renowned Data Recovery solution provider which offers wide range of softwares for both enterprise and home users with its cost-effective solutions. Let's have a quick overview of the journey and data recovery tools range he…
Is your organization moving toward a cloud and mobile-first environment? In this transition, your IT department will encounter many challenges, such as navigating how to: Deploy new applications and services to a growing team Accommodate employee…
###### Suggested Courses
Course of the Month9 days, 11 hours left to enroll