Avatar of catalini
catalini asked on

Breadth First Search reading data from a mysql table

I'm trying to calculate distance on a graph using the Breadth First Search algorithm. I've seen plenty of implementations online, but I need one that would allow me to read data from my mysql database.

I've a table structured like this:

INDIVIDUAL1         INDIVIDUAL2
---------------------------------------
AH2223423           AS38423842
AH2223423           HS2772722
AH2223423           IASD727272
HGS282828          AS38423842

where every row corresponds to an edge (e.g. row1 means that  AH2223423 and AS38423842 are connected) on the graph and every INDIVIDUAL (e.g. AH2223423) is a vertex

If necessary I could export the table as csv and use it directly as a file.
PerlPythonJava

Avatar of undefined
Last Comment
catalini

8/22/2022 - Mon
ASKER
catalini

probably the best code is here

http://code.google.com/p/python-graph/

but I'm not sure how to load the data....
ozo

Is it an directed graph, or should we assume a path from  AS38423842 to AH2223423
even if there is no entry like
INDIVIDUAL1         INDIVIDUAL2
---------------------------------------
AS38423842         AH2223423
Adam314

It looks as if it is an unweighted graph... meaning each edge is the same distance.  Is this correct?
If the database is small enough that the entire graph can fit in memory, I would first create a hash, where the key is the vertex, and the value is a hash ref or array ref of neighbors.  This would then make it very easy to do the search.

What part would you like help with?
All of life is about relationships, and EE has made a viirtual community a real community. It lifts everyone's boat
William Peck
ASKER
catalini

ozo:
it is undirected, if there is a path between

INDIVIDUAL1         INDIVIDUAL2
---------------------------------------
AS38423842         AH2223423

then the reverse is also true (i.e  AH2223423 is connected to AS38423842)


adam314:
yes each edge is the same distance. the table is 240,000 rows (edges)
I've found a nice python script http://code.google.com/p/python-graph/ 
but would like to know how to use it in case with my mysql table...
(if you know a better script let me know!)

thanks!!!



#!/usr/bin/env python
 
# Copyright (c) 2007 Pedro Matiello <pmatiello@gmail.com>
# License: MIT (see COPYING file)
 
import graph
 
print "Example"
# Graph creation
gr = graph.graph()
 
# Add nodes and edges
gr.add_nodes(["Portugal","Spain","France","Germany","Belgium","Netherlands","Italy"])
gr.add_nodes(["England","Ireland","Scotland","Wales"])
gr.add_nodes(["Atlantis","Wonderland","Brazil","Lemuria","Ys"])
gr.add_edge("Portugal", "Spain", wt=1)
gr.add_edge("Spain","France", wt=3)
gr.add_edge("France","Belgium", wt=2)
gr.add_edge("France","Germany", wt=3)
gr.add_edge("France","Italy", wt=3)
gr.add_edge("Belgium","Netherlands", wt=1)
gr.add_edge("Germany","Belgium", wt=2)
gr.add_edge("Germany","Netherlands", wt=2)
gr.add_edge("Germany","Italy", wt=3)
gr.add_edge("England","Wales", wt=1)
gr.add_edge("England","Scotland", wt=2)
gr.add_edge("Scotland","Wales", wt=1)
 
gr.add_edge("Atlantis","Wonderland")
gr.add_arrow("Atlantis","Brazil")
gr.add_edge("Atlantis","Lemuria")
gr.add_arrow("Lemuria","Ys")
gr.add_arrow("Ys","Wonderland")
gr.add_arrow("Ys","Brazil")
 
print "------------------------------------------------------------------------"
print "Breadth First Search"
print gr.breadth_first_search()

Open in new window

ASKER
catalini

one problem with the script above is that it just returns the connected elements, not the distance on the graph...
Adam314

You could use the DBI module to connect to your database
http://search.cpan.org/~timb/DBI-1.604/DBI.pm

And the Graph module to create a graph
http://search.cpan.org/~jhi/Graph-0.84/lib/Graph.pod

And the Graph::Traversal::BFS to do a breadth-first-search
http://search.cpan.org/~jhi/Graph-0.84/lib/Graph/Traversal/BFS.pm

Each of those modules has documentation.  If you would like some sample code using those modules, let me know.
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
Adam314

Here is some sample code...
use DBI;
use Graph;
use Graph::Traversal::BFS;
 
#Connect to database
#you will need to define the $data_source, $username, and $password
my $dbh = DBI->connect($data_source, $username, $password) or die "Could not connect: $DBI::errstr\n";
 
#Created undirected graph
my $graph = Graph->new(undirected=>1);
 
#Get data from database
my $sth = $dbh->prepare("SELECT INDIVIDUAL1, INDIVIDUAL2 FROM table")
  or die "Could not prepare: $DBI::errstr\n";
$sth->execute;
 
#create and edge according to db (a vertex is automatically created if it doesn't already exist)
while(my ($a, $b) = $sth->fetchrow_array) {
	$graph->add_edge($a, $b);
}
 
#Create traversal object (see documentation for details)
my $b = Graph::Traversal::BFS->new($graph);
$b->bfs;

Open in new window

SOLUTION
ozo

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
See how we're fighting big data
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
ASKER
catalini

ozo:

I get a syntax error at distance.pl line 4, near "){"

I changed the connect line to

DBI->connect("dbi:mysql:database=test;host=localhost;user=root;password=########")


ASKER
catalini

adam:

is there a way to save the distances between each vertex pair back into a mysql table?
Experts Exchange is like having an extremely knowledgeable team sitting and waiting for your call. Couldn't do my job half as well as I do without it!
James Murphy
ASKER
catalini

adam:

maybe I could use this?

http://search.cpan.org/~jhi/Graph-0.84/lib/Graph/AdjacencyMatrix.pm

not sure how to add it to your code...

thanks!
Adam314

Saving back to mysql, do you want to create a new table with every possible vertex combination, and the distance to them?
ASKER
catalini

yes that would be great! I already have a table called DISTANCES with structure

INDIVIDUAL1, INDIVIDUAL2, DISTANCE
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
ASKER CERTIFIED SOLUTION
Log in to continue reading
Log In
Sign up - Free for 7 days
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
ASKER
catalini

adam, thanks! the procedure is almost perfect. Do you think there is a way to write only the positive distances? in this way it's writing also all the distances that are infinite (NULL)... that would probably speed things a little bit,

thanks!
ASKER
catalini

furthermore it seems to crash my system, using all available resources (memory and 4GB of swap)...   :-(
ASKER
catalini

thanks!
Experts Exchange has (a) saved my job multiple times, (b) saved me hours, days, and even weeks of work, and often (c) makes me look like a superhero! This place is MAGIC!
Walt Forbes
Adam314

To skip those with NULL distances, you could put this before line 28:
    my $len = $apsp->path_length($a, $b);
    next unless $len > 0;

As for it crashing... computing the shortest distance from each vertex to each is very memory/time consuming.  I'm not sure what to do about that.  One thing that might help is to compute the shortest distance from one vertex to each, and add that to the database, then release it from memory, and go to the next node.  Look in the Single-Source Shortest Path section:
http://search.cpan.org/~jhi/Graph-0.84/lib/Graph.pod#Single-Source_Shortest_Paths_(SSSP)
ASKER
catalini

probably SSSP_Dijkstra  is the way to go...
I've opened a similar question to solve this problem, maybe you can take a look ?
https://www.experts-exchange.com/Programming/Languages/Scripting/Perl/Q_23433195.html

thanks adam