troubleshooting Question

calculating distance on a huge graph

Avatar of catalini
catalini asked on
JavaPerl
71 Comments2 Solutions924 ViewsLast Modified:
I'm following up on my previous question

https://www.experts-exchange.com/Programming/Languages/Scripting/Perl/Q_23428010.html#a21635922

since the solution still has some problems.

I'm trying to calculate distance on a graph. I've seen plenty of implementations online, but I need one that would allow me to read data from my mysql database. Ideally the Dijkstra algorithm would do the job.

I've a table structured like this:

INDIVIDUAL1         INDIVIDUAL2
---------------------------------------
AH2223423           AS38423842
AH2223423           HS2772722
AH2223423           IASD727272
HGS282828          AS38423842

where every row corresponds to an edge (e.g. row1 means that  AH2223423 and AS38423842 are connected) on the graph and every INDIVIDUAL (e.g. AH2223423) is a vertex


The code below by adam314 (see Snippet) calculates the distances on a small graph, but fails with the tables I use (more than 200,000 edges). Is there any other efficient way to calculate the distances between connected individuals on a graph and save them into mysql?



An alternative code suggested by ozo is here below, but it does not seem to work...

#!/bin/perl
use DBD::mysql;
my  $ldbh = DBI->connect("DBI:mysql:my mysql database", username, password)
for(  $ldbh->selectall_arrayre("SELECT INDIVIDUAL1,INDIVIDUAL2 from edges") ){
  $edge{$_->[0]}{$_->[1]}++;
  $edge{$_->[1]}{$_->[0]}++;
}

#does it need to be depth first, or can you use any algorithm that finds the correct distance?
sub distance{
  my($from,$to)=@_;
  return 0 if  $from eq $to;
  my @first=($from);
  my %dist=($from=>0);
  while( my $first = shift @first ){
       $dist = $dist{$first}+1;
       return $dist if $edge{$first}{$to};
       for( grep !exists $dist{$_},keys %{$edge{$first} ){
                   $dist{$_}=$dist;        
                   push @first,$_;
        }
  }
  return undef;
}


use DBI;
use Graph;
 
#Connect to database
my $dbh = DBI->connect($data_source, $username, $password) or die "Could not connect: $DBI::errstr\n";
 
#Created undirected graph
my $graph = Graph->new(undirected=>1);
 
#Get data from database
my $sth = $dbh->prepare("SELECT INDIVIDUAL1, INDIVIDUAL2 FROM table");
$sth->execute;
 
#create and edge according to db (a vertex is automatically created if it doesn't already exist)
while(my ($a, $b) = $sth->fetchrow_array) {
        $graph->add_weighted_edge($a, $b, 1);
}
 
#Create statement to insert data
my $sth_insert = $dbh->prepare("INSERT INTO table_distances (Individual1, Individual2, Distance) VALUES (?, ?, ?);");
 
#Calculate shortest distances using the Floyd Warshall algorithm
my $apsp = $graph->APSP_Floyd_Warshall;
 
#Save data to database
foreach my $a ($graph->vertices) {
        foreach my $b ($graph->vertices) {
                $sth_insert->execute($a, $b, $apsp->path_length($a, $b));
        }
}
Join the community to see this answer!
Join our exclusive community to see this answer & millions of others.
Unlock 2 Answers and 71 Comments.
Join the Community
Learn from the best

Network and collaborate with thousands of CTOs, CISOs, and IT Pros rooting for you and your success.

Andrew Hancock - VMware vExpert
See if this solution works for you by signing up for a 7 day free trial.
Unlock 2 Answers and 71 Comments.
Try for 7 days

”The time we save is the biggest benefit of E-E to our team. What could take multiple guys 2 hours or more each to find is accessed in around 15 minutes on Experts Exchange.

-Mike Kapnisakis, Warner Bros