catalini asked on # Breadth First Search reading data from a mysql table

I'm trying to calculate distance on a graph using the Breadth First Search algorithm. I've seen plenty of implementations online, but I need one that would allow me to read data from my mysql database.

I've a table structured like this:

INDIVIDUAL1 INDIVIDUAL2

---------------------------------------

AH2223423 AS38423842

AH2223423 HS2772722

AH2223423 IASD727272

HGS282828 AS38423842

where every row corresponds to an edge (e.g. row1 means that AH2223423 and AS38423842 are connected) on the graph and every INDIVIDUAL (e.g. AH2223423) is a vertex

If necessary I could export the table as csv and use it directly as a file.

I've a table structured like this:

INDIVIDUAL1 INDIVIDUAL2

--------------------------

AH2223423 AS38423842

AH2223423 HS2772722

AH2223423 IASD727272

HGS282828 AS38423842

where every row corresponds to an edge (e.g. row1 means that AH2223423 and AS38423842 are connected) on the graph and every INDIVIDUAL (e.g. AH2223423) is a vertex

If necessary I could export the table as csv and use it directly as a file.

PerlPythonJava

ozo

Is it an directed graph, or should we assume a path from AS38423842 to AH2223423

even if there is no entry like

INDIVIDUAL1 INDIVIDUAL2

---------------------------------------

AS38423842 AH2223423

even if there is no entry like

INDIVIDUAL1 INDIVIDUAL2

--------------------------

AS38423842 AH2223423

Adam314

It looks as if it is an unweighted graph... meaning each edge is the same distance. Is this correct?

If the database is small enough that the entire graph can fit in memory, I would first create a hash, where the key is the vertex, and the value is a hash ref or array ref of neighbors. This would then make it very easy to do the search.

What part would you like help with?

If the database is small enough that the entire graph can fit in memory, I would first create a hash, where the key is the vertex, and the value is a hash ref or array ref of neighbors. This would then make it very easy to do the search.

What part would you like help with?

All of life is about relationships, and EE has made a viirtual community a real community. It lifts everyone's boat

William Peck

catalini

ozo:

it is undirected, if there is a path between

INDIVIDUAL1 INDIVIDUAL2

---------------------------------------

AS38423842 AH2223423

then the reverse is also true (i.e AH2223423 is connected to AS38423842)

adam314:

yes each edge is the same distance. the table is 240,000 rows (edges)

I've found a nice python script http://code.google.com/p/python-graph/

but would like to know how to use it in case with my mysql table...

(if you know a better script let me know!)

thanks!!!

it is undirected, if there is a path between

INDIVIDUAL1 INDIVIDUAL2

--------------------------

AS38423842 AH2223423

then the reverse is also true (i.e AH2223423 is connected to AS38423842)

adam314:

yes each edge is the same distance. the table is 240,000 rows (edges)

I've found a nice python script http://code.google.com/p/python-graph/

but would like to know how to use it in case with my mysql table...

(if you know a better script let me know!)

thanks!!!

```
#!/usr/bin/env python
# Copyright (c) 2007 Pedro Matiello <pmatiello@gmail.com>
# License: MIT (see COPYING file)
import graph
print "Example"
# Graph creation
gr = graph.graph()
# Add nodes and edges
gr.add_nodes(["Portugal","Spain","France","Germany","Belgium","Netherlands","Italy"])
gr.add_nodes(["England","Ireland","Scotland","Wales"])
gr.add_nodes(["Atlantis","Wonderland","Brazil","Lemuria","Ys"])
gr.add_edge("Portugal", "Spain", wt=1)
gr.add_edge("Spain","France", wt=3)
gr.add_edge("France","Belgium", wt=2)
gr.add_edge("France","Germany", wt=3)
gr.add_edge("France","Italy", wt=3)
gr.add_edge("Belgium","Netherlands", wt=1)
gr.add_edge("Germany","Belgium", wt=2)
gr.add_edge("Germany","Netherlands", wt=2)
gr.add_edge("Germany","Italy", wt=3)
gr.add_edge("England","Wales", wt=1)
gr.add_edge("England","Scotland", wt=2)
gr.add_edge("Scotland","Wales", wt=1)
gr.add_edge("Atlantis","Wonderland")
gr.add_arrow("Atlantis","Brazil")
gr.add_edge("Atlantis","Lemuria")
gr.add_arrow("Lemuria","Ys")
gr.add_arrow("Ys","Wonderland")
gr.add_arrow("Ys","Brazil")
print "------------------------------------------------------------------------"
print "Breadth First Search"
print gr.breadth_first_search()
```

catalini

one problem with the script above is that it just returns the connected elements, not the distance on the graph...

Adam314

You could use the DBI module to connect to your database

http://search.cpan.org/~timb/DBI-1.604/DBI.pm

And the Graph module to create a graph

http://search.cpan.org/~jhi/Graph-0.84/lib/Graph.pod

And the Graph::Traversal::BFS to do a breadth-first-search

http://search.cpan.org/~jhi/Graph-0.84/lib/Graph/Traversal/BFS.pm

Each of those modules has documentation. If you would like some sample code using those modules, let me know.

http://search.cpan.org/~timb/DBI-1.604/DBI.pm

And the Graph module to create a graph

http://search.cpan.org/~jhi/Graph-0.84/lib/Graph.pod

And the Graph::Traversal::BFS to do a breadth-first-search

http://search.cpan.org/~jhi/Graph-0.84/lib/Graph/Traversal/BFS.pm

Each of those modules has documentation. If you would like some sample code using those modules, let me know.

Get an unlimited membership to EE for less than $4 a week.

Unlimited question asking, solutions, articles and more.

Adam314

Here is some sample code...

```
use DBI;
use Graph;
use Graph::Traversal::BFS;
#Connect to database
#you will need to define the $data_source, $username, and $password
my $dbh = DBI->connect($data_source, $username, $password) or die "Could not connect: $DBI::errstr\n";
#Created undirected graph
my $graph = Graph->new(undirected=>1);
#Get data from database
my $sth = $dbh->prepare("SELECT INDIVIDUAL1, INDIVIDUAL2 FROM table")
or die "Could not prepare: $DBI::errstr\n";
$sth->execute;
#create and edge according to db (a vertex is automatically created if it doesn't already exist)
while(my ($a, $b) = $sth->fetchrow_array) {
$graph->add_edge($a, $b);
}
#Create traversal object (see documentation for details)
my $b = Graph::Traversal::BFS->new($graph);
$b->bfs;
```

Log in or sign up to see answer

Become an EE member today7-DAY FREE TRIAL

Members can start a 7-Day Free trial then enjoy unlimited access to the platform

or

Learn why we charge membership fees

We get it - no one likes a content blocker. Take one extra minute and find out why we block content.

Not exactly the question you had in mind?

Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.

ask a questioncatalini

ozo:

I get a syntax error at distance.pl line 4, near "){"

I changed the connect line to

DBI->connect("dbi:mysql:database=test;host=localhost;user=root;password=########")

I get a syntax error at distance.pl line 4, near "){"

I changed the connect line to

DBI->connect("dbi:mysql:da

catalini

adam:

is there a way to save the distances between each vertex pair back into a mysql table?

is there a way to save the distances between each vertex pair back into a mysql table?

Experts Exchange is like having an extremely knowledgeable team sitting and waiting for your call. Couldn't do my job half as well as I do without it!

James Murphy

catalini

adam:

maybe I could use this?

http://search.cpan.org/~jhi/Graph-0.84/lib/Graph/AdjacencyMatrix.pm

not sure how to add it to your code...

thanks!

maybe I could use this?

http://search.cpan.org/~jhi/Graph-0.84/lib/Graph/AdjacencyMatrix.pm

not sure how to add it to your code...

thanks!

Adam314

Saving back to mysql, do you want to create a new table with every possible vertex combination, and the distance to them?

catalini

yes that would be great! I already have a table called DISTANCES with structure

INDIVIDUAL1, INDIVIDUAL2, DISTANCE

INDIVIDUAL1, INDIVIDUAL2, DISTANCE

Get an unlimited membership to EE for less than $4 a week.

Unlimited question asking, solutions, articles and more.

Get an unlimited membership to EE for less than $4 a week.

Unlimited question asking, solutions, articles and more.

catalini

adam, thanks! the procedure is almost perfect. Do you think there is a way to write only the positive distances? in this way it's writing also all the distances that are infinite (NULL)... that would probably speed things a little bit,

thanks!

thanks!

catalini

furthermore it seems to crash my system, using all available resources (memory and 4GB of swap)... :-(

catalini

thanks!

Experts Exchange has (a) saved my job multiple times, (b) saved me hours, days, and even weeks of work, and often (c) makes me look like a superhero! This place is MAGIC!

Walt Forbes

Adam314

To skip those with NULL distances, you could put this before line 28:

my $len = $apsp->path_length($a, $b);

next unless $len > 0;

As for it crashing... computing the shortest distance from each vertex to each is very memory/time consuming. I'm not sure what to do about that. One thing that might help is to compute the shortest distance from one vertex to each, and add that to the database, then release it from memory, and go to the next node. Look in the Single-Source Shortest Path section:

http://search.cpan.org/~jhi/Graph-0.84/lib/Graph.pod#Single-Source_Shortest_Paths_(SSSP)

my $len = $apsp->path_length($a, $b);

next unless $len > 0;

As for it crashing... computing the shortest distance from each vertex to each is very memory/time consuming. I'm not sure what to do about that. One thing that might help is to compute the shortest distance from one vertex to each, and add that to the database, then release it from memory, and go to the next node. Look in the Single-Source Shortest Path section:

http://search.cpan.org/~jhi/Graph-0.84/lib/Graph.pod#Single-Source_Shortest_Paths_(SSSP)

catalini

probably SSSP_Dijkstra is the way to go...

I've opened a similar question to solve this problem, maybe you can take a look ?

https://www.experts-exchange.com/Programming/Languages/Scripting/Perl/Q_23433195.html

thanks adam

I've opened a similar question to solve this problem, maybe you can take a look ?

https://www.experts-exchange.com/Programming/Languages/Scripting/Perl/Q_23433195.html

thanks adam

http://code.google.com/p/python-graph/

but I'm not sure how to load the data....