catalini
asked on
calculating distance on a huge graph
I'm following up on my previous question
https://www.experts-exchange.com/questions/23428010/Breadth-First-Search-reading-data-from-a-mysql-table.html?anchorAnswerId=21635922#a21635922
since the solution still has some problems.
I'm trying to calculate distance on a graph. I've seen plenty of implementations online, but I need one that would allow me to read data from my mysql database. Ideally the Dijkstra algorithm would do the job.
I've a table structured like this:
INDIVIDUAL1 INDIVIDUAL2
-------------------------- ---------- ---
AH2223423 AS38423842
AH2223423 HS2772722
AH2223423 IASD727272
HGS282828 AS38423842
where every row corresponds to an edge (e.g. row1 means that AH2223423 and AS38423842 are connected) on the graph and every INDIVIDUAL (e.g. AH2223423) is a vertex
The code below by adam314 (see Snippet) calculates the distances on a small graph, but fails with the tables I use (more than 200,000 edges). Is there any other efficient way to calculate the distances between connected individuals on a graph and save them into mysql?
An alternative code suggested by ozo is here below, but it does not seem to work...
#!/bin/perl
use DBD::mysql;
my $ldbh = DBI->connect("DBI:mysql:my mysql database", username, password)
for( $ldbh->selectall_arrayre(" SELECT INDIVIDUAL1,INDIVIDUAL2 from edges") ){
$edge{$_->[0]}{$_->[1]}++;
$edge{$_->[1]}{$_->[0]}++;
}
#does it need to be depth first, or can you use any algorithm that finds the correct distance?
sub distance{
my($from,$to)=@_;
return 0 if $from eq $to;
my @first=($from);
my %dist=($from=>0);
while( my $first = shift @first ){
$dist = $dist{$first}+1;
return $dist if $edge{$first}{$to};
for( grep !exists $dist{$_},keys %{$edge{$first} ){
$dist{$_}=$dist;
push @first,$_;
}
}
return undef;
}
https://www.experts-exchange.com/questions/23428010/Breadth-First-Search-reading-data-from-a-mysql-table.html?anchorAnswerId=21635922#a21635922
since the solution still has some problems.
I'm trying to calculate distance on a graph. I've seen plenty of implementations online, but I need one that would allow me to read data from my mysql database. Ideally the Dijkstra algorithm would do the job.
I've a table structured like this:
INDIVIDUAL1 INDIVIDUAL2
--------------------------
AH2223423 AS38423842
AH2223423 HS2772722
AH2223423 IASD727272
HGS282828 AS38423842
where every row corresponds to an edge (e.g. row1 means that AH2223423 and AS38423842 are connected) on the graph and every INDIVIDUAL (e.g. AH2223423) is a vertex
The code below by adam314 (see Snippet) calculates the distances on a small graph, but fails with the tables I use (more than 200,000 edges). Is there any other efficient way to calculate the distances between connected individuals on a graph and save them into mysql?
An alternative code suggested by ozo is here below, but it does not seem to work...
#!/bin/perl
use DBD::mysql;
my $ldbh = DBI->connect("DBI:mysql:my
for( $ldbh->selectall_arrayre("
$edge{$_->[0]}{$_->[1]}++;
$edge{$_->[1]}{$_->[0]}++;
}
#does it need to be depth first, or can you use any algorithm that finds the correct distance?
sub distance{
my($from,$to)=@_;
return 0 if $from eq $to;
my @first=($from);
my %dist=($from=>0);
while( my $first = shift @first ){
$dist = $dist{$first}+1;
return $dist if $edge{$first}{$to};
for( grep !exists $dist{$_},keys %{$edge{$first} ){
$dist{$_}=$dist;
push @first,$_;
}
}
return undef;
}
use DBI;
use Graph;
#Connect to database
my $dbh = DBI->connect($data_source, $username, $password) or die "Could not connect: $DBI::errstr\n";
#Created undirected graph
my $graph = Graph->new(undirected=>1);
#Get data from database
my $sth = $dbh->prepare("SELECT INDIVIDUAL1, INDIVIDUAL2 FROM table");
$sth->execute;
#create and edge according to db (a vertex is automatically created if it doesn't already exist)
while(my ($a, $b) = $sth->fetchrow_array) {
$graph->add_weighted_edge($a, $b, 1);
}
#Create statement to insert data
my $sth_insert = $dbh->prepare("INSERT INTO table_distances (Individual1, Individual2, Distance) VALUES (?, ?, ?);");
#Calculate shortest distances using the Floyd Warshall algorithm
my $apsp = $graph->APSP_Floyd_Warshall;
#Save data to database
foreach my $a ($graph->vertices) {
foreach my $b ($graph->vertices) {
$sth_insert->execute($a, $b, $apsp->path_length($a, $b));
}
}
In what way does it fail, any error messages?
ASKER
I get a syntax error at distance.pl line 4, near "){"
use DBD::mysql;
my $ldbh = DBI->connect("DBI:mysql:my mysql database", username, password); # ; was missing
for( $ldbh->selectall_arrayre(" SELECT INDIVIDUAL1,INDIVIDUAL2 from edges") ){
$edge{$_->[0]}{$_->[1]}++;
$edge{$_->[1]}{$_->[0]}++;
}
#This uses meet in the middle rather than depth first,
sub distance{
my($from,$to)=@_;
return 0 if $from eq $to;
my %df=($from=>1);
my %dr=($to=>1);
my $d=0;
my $k=0;
while( !$k ){
$k=1;
++$d;
my($f,$r);
if( keys %df <= keys %dr ){
($f,$r) = (\%df,\%dr);
}else{
($f,$r) = (\%dr,\%df);
}
for( keys %$f ){
for( keys %{$edge{$_}} ){
$k =0 if !$f->{$_}++;
return $d if $r->{$_};
}
}
}
return undef;
}
my $ldbh = DBI->connect("DBI:mysql:my
for( $ldbh->selectall_arrayre("
$edge{$_->[0]}{$_->[1]}++;
$edge{$_->[1]}{$_->[0]}++;
}
#This uses meet in the middle rather than depth first,
sub distance{
my($from,$to)=@_;
return 0 if $from eq $to;
my %df=($from=>1);
my %dr=($to=>1);
my $d=0;
my $k=0;
while( !$k ){
$k=1;
++$d;
my($f,$r);
if( keys %df <= keys %dr ){
($f,$r) = (\%df,\%dr);
}else{
($f,$r) = (\%dr,\%df);
}
for( keys %$f ){
for( keys %{$edge{$_}} ){
$k =0 if !$f->{$_}++;
return $d if $r->{$_};
}
}
}
return undef;
}
ASKER
ozo, thanks. It seems to work. But how can i write the calculated distances in my mysql table (structured as INDIVIDUAL1, INDIVIDUAL2, DISTANCE)?
thanks
thanks
Do you want to find all distances, rather than just a single distance?
ASKER
yes :-(
#untested
my $count=1;
while( $count ){
last unless my $dbh = $ldbh-prepare("SELECT t1.INDIVIDUAL1, t2.INDIVIDUAL2, t1.DISTANCE+t2.DISTANCE from edges as t1, edges as t2 where t1.INDIVIDUAL2 = t2.INDIVIDUAL1 and not exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t1.INDIVIDUAL2 AND t3.distance <= t1.DISTANCE+t2.DISTANCE)") and $dbh->execute();
$count=0;
while( my $r = $dbh->fetchrow_arrayref() ){
++$count;
$ldbj->do("insert into edges set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[0]', DISTANCE=$r->[2];");
}
}
my $count=1;
while( $count ){
last unless my $dbh = $ldbh-prepare("SELECT t1.INDIVIDUAL1, t2.INDIVIDUAL2, t1.DISTANCE+t2.DISTANCE from edges as t1, edges as t2 where t1.INDIVIDUAL2 = t2.INDIVIDUAL1 and not exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t1.INDIVIDUAL2 AND t3.distance <= t1.DISTANCE+t2.DISTANCE)")
$count=0;
while( my $r = $dbh->fetchrow_arrayref() ){
++$count;
$ldbj->do("insert into edges set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[0]', DISTANCE=$r->[2];");
}
}
ASKER
it returns
Undefined subroutine &main::prepare called at distance.pl line 42.
line 42 is:
last unless my $dbh = $ldbh-prepare("SELECT t1.INDIVIDUAL1, t2.INDIVIDUAL2, t1.DISTANCE+t2.DISTANCE from edges as t1, edges as t2 where t1.INDIVIDUAL2 = t2.INDIVIDUAL1 and not exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t1.INDIVIDUAL2 AND t3.distance <= t1.DISTANCE+t2.DISTANCE)") and $dbh->execute();
Undefined subroutine &main::prepare called at distance.pl line 42.
line 42 is:
last unless my $dbh = $ldbh-prepare("SELECT t1.INDIVIDUAL1, t2.INDIVIDUAL2, t1.DISTANCE+t2.DISTANCE from edges as t1, edges as t2 where t1.INDIVIDUAL2 = t2.INDIVIDUAL1 and not exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t1.INDIVIDUAL2 AND t3.distance <= t1.DISTANCE+t2.DISTANCE)")
sorry, try $ldbh->prepare
ASKER
now i get
Can't call method "execute" on an undefined value at distance.pl line 42.
thanks ozo for helping me out on this :-)
Can't call method "execute" on an undefined value at distance.pl line 42.
thanks ozo for helping me out on this :-)
#sorry, I would have caught those myself if I had DBD::mysql on this machine
#I also noticed a typo of t1 where I meant t2
my $dbh;
last unless $dbh = $ldbh->prepare("SELECT t1.INDIVIDUAL1, t2.INDIVIDUAL2, t1.DISTANCE+t2.DISTANCE from edges as t1, edges as t2 where t1.INDIVIDUAL2 = t2.INDIVIDUAL1 and not exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t2.INDIVIDUAL2 AND t3.distance <= t1.DISTANCE+t2.DISTANCE)") and $dbh->execute();
#I also noticed a typo of t1 where I meant t2
my $dbh;
last unless $dbh = $ldbh->prepare("SELECT t1.INDIVIDUAL1, t2.INDIVIDUAL2, t1.DISTANCE+t2.DISTANCE from edges as t1, edges as t2 where t1.INDIVIDUAL2 = t2.INDIVIDUAL1 and not exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t2.INDIVIDUAL2 AND t3.distance <= t1.DISTANCE+t2.DISTANCE)")
ASKER
DBD::mysql::st execute failed: Unknown column 't1.DISTANCE' in 'field list' at distance.pl line 42.
actually i do not have any field DISTANCE in the originating table (it's just INDIVIDUAL1, INDIVIDUAL2)...
actually i do not have any field DISTANCE in the originating table (it's just INDIVIDUAL1, INDIVIDUAL2)...
#untested
my $count=1;
while( $count ){
my $dbh;
last unless $dbh = $ldbh->prepare("SELECT t1.INDIVIDUAL1, t2.INDIVIDUAL2, t1.DISTANCE+t2.DISTANCE from edges as t1, edges as t2 where t1.INDIVIDUAL2 = t2.INDIVIDUAL1 and not exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t2.INDIVIDUAL2 AND t3.distance <= t1.DISTANCE+t2.DISTANCE)") and $dbh->execute();
$count=0;
while( my $r = $dbh->fetchrow_arrayref() ){
++$count;
$ldbj->do("insert into DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[0]', DISTANCE=$r->[2];");
}
}
then what was the mysql table (structured as INDIVIDUAL1, INDIVIDUAL2, DISTANCE)?
Can you create a new table with DISTANCE=1 when (INDIVIDUAL1, INDIVIDUAL2) exists in the original table?
Can you create a new table with DISTANCE=1 when (INDIVIDUAL1, INDIVIDUAL2) exists in the original table?
ASKER
the INDIVIDUAL1, INDIVIDUAL2, DISTANCE is the final table I would like to save the distances into,
I've created a column called DISTANCE with value = 1 as you suggested, the code seems to be running now. I'll let you know asap if it works! thanks for now!
I've created a column called DISTANCE with value = 1 as you suggested, the code seems to be running now. I'll let you know asap if it works! thanks for now!
ASKER
Can't call method "do" on an undefined value at distance.pl line 47.
line 47 is
$ldbj->do("insert into DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[0]', DISTANCE=$r->[2];");
line 47 is
$ldbj->do("insert into DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[0]', DISTANCE=$r->[2];");
$ldbj should be the $ldbh from $ldbh = DBI->connect
It might be best to debug it on a small graph so you can watch that all its steps make sense before trying it on a huge graph
ASKER
now it's writing on the table DISTANCES but it gives all the individuals distance = 2 ?
maybe because of t1.DISTANCE+t2.DISTANCE?
maybe because of t1.DISTANCE+t2.DISTANCE?
ASKER
I'm trying it on a 6000 nodes table
ASKER
it has already written over 7,000,000 rows?!?!
Are there nodes which should have distance 3 or NULL?
What was the step that set that distance to 2?
A connected graph of 6000 nodes could have 36000000 distances
What was the step that set that distance to 2?
A connected graph of 6000 nodes could have 36000000 distances
ASKER
yes distances should go from NULL to 7, 8 at least...
it's probably because in the table edges all the DISTANCE=1 and when the sql joins them it sums 1+1=2?
just guessing....
it's probably because in the table edges all the DISTANCE=1 and when the sql joins them it sums 1+1=2?
just guessing....
It should only add an edge with t1.INDIVIDUAL1 to t2.INDIVIDUAL2 at distance 2 when there is an edge from t1.INDIVIDUAL1 to t1.INDIVIDUAL2 at distance 1
and an edge from t2.INDIVIDUAL1 to t2.INDIVIDUAL2 at distance 1
with t1.INDIVIDUAL2 = t2.INDIVIDUAL1 which means that there is a path of length two between them is it adding such an edge when there is no such path?
and an edge from t2.INDIVIDUAL1 to t2.INDIVIDUAL2 at distance 1
with t1.INDIVIDUAL2 = t2.INDIVIDUAL1 which means that there is a path of length two between them is it adding such an edge when there is no such path?
I think the where condition should probably have been
not exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t2.INDIVIDUAL2)
or exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t2.INDIVIDUAL2 AND t3.distance > t1.DISTANCE+t2.DISTANCE)
not exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t2.INDIVIDUAL2)
or exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t2.INDIVIDUAL2 AND t3.distance > t1.DISTANCE+t2.DISTANCE)
No, that wouldn't explain the symptoms.
Could you print out the edge that it wrongly added, and what what rows it selected for t1, t2 and t3 that made it think that edge should be added?
Could you print out the edge that it wrongly added, and what what rows it selected for t1, t2 and t3 that made it think that edge should be added?
I noticed another typo
insert into DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[0]', DISTANCE=$r->[2]
should have been
insert into DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[1]', DISTANCE=$r->[2]
but that doesn't explain the symptoms, since that would only add an edge from a node to itself.
insert into DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[0]', DISTANCE=$r->[2]
should have been
insert into DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[1]', DISTANCE=$r->[2]
but that doesn't explain the symptoms, since that would only add an edge from a node to itself.
ASKER
the nodes to itself where already present, it made a huge table (over 350,000,000) with all possible pairs and all the distances = 2
my starting table (edges) is something like this
INDIVIDUAL1, INDIVIDUAL2, DISTANCE
A8282 B2828 1
A8282 A7272 1
A8282 B2222 1
A7272 H1234 1
when there is an edge from A8282 to B2828 then there is also an edge between B2828 and A8282 (i.e. network is undirected) and their distance is 1.
From above it should return that
A8282 is at distance 1 from B2828, A7272 and B2222
A8282 is at distance 2 from H1234
H1234 is at distance 3 from B2828 and so on....
:-)
my starting table (edges) is something like this
INDIVIDUAL1, INDIVIDUAL2, DISTANCE
A8282 B2828 1
A8282 A7272 1
A8282 B2222 1
A7272 H1234 1
when there is an edge from A8282 to B2828 then there is also an edge between B2828 and A8282 (i.e. network is undirected) and their distance is 1.
From above it should return that
A8282 is at distance 1 from B2828, A7272 and B2222
A8282 is at distance 2 from H1234
H1234 is at distance 3 from B2828 and so on....
:-)
ASKER
my code ozo looks like this
#!/bin/perl
use DBD::mysql;
my $ldbh = DBI->connect("DBI:mysql:database=test;user=XXXXX;password=XXXXXXXX") or die "Could not connect: $DBI::errstr\n";
for( $ldbh->selectall_arrayref("SELECT INDIVIDUAL1, INDIVIDUAL2 from edges") ){
$edge{$_->[0]}{$_->[1]}++;
$edge{$_->[1]}{$_->[0]}++;
}
#This uses meet in the middle rather than depth first,
sub distance{
my($from,$to)=@_;
return 0 if $from eq $to;
my %df=($from=>1);
my %dr=($to=>1);
my $d=0;
my $k=0;
while( !$k ){
$k=1;
++$d;
my($f,$r);
if( keys %df <= keys %dr ){
($f,$r) = (\%df,\%dr);
}else{
($f,$r) = (\%dr,\%df);
}
for( keys %$f ){
for( keys %{$edge{$_}} ){
$k =0 if !$f->{$_}++;
return $d if $r->{$_};
}
}
}
return undef;
}
#untested
my $count=1;
while( $count ){
my $dbh;
last unless $dbh = $ldbh->prepare("SELECT t1.INDIVIDUAL1, t2.INDIVIDUAL2, t1.DISTANCE+t2.DISTANCE from edges as t1, edges as t2 where t1.INDIVIDUAL2 = t2.INDIVIDUAL1 or exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t2.INDIVIDUAL2 AND t3.distance > t1.DISTANCE+t2.DISTANCE)") and $dbh->execute();
$count=0;
while( my $r = $dbh->fetchrow_arrayref() ){
++$count;
$ldbh->do("insert into DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[1]', DISTANCE=$r->[2];");
}
}
I think I will try to load DBD::mysql on this machine so I can test it.
in the mean time, if you start with these four rows in your table:
A8282 B2828 1
A8282 A7272 1
A8282 B2222 1
A7272 H1234 1
can you find out exactly what rows were selected when it tries to insert
H1234 B2828 2
in the mean time, if you start with these four rows in your table:
A8282 B2828 1
A8282 A7272 1
A8282 B2222 1
A7272 H1234 1
can you find out exactly what rows were selected when it tries to insert
H1234 B2828 2
ASKER
I run the code exactly with your table (4 rows)
and it started writing without an end this row
A8282 H1234 2
into the table DISTANCES...
it created more than 387,000 records in a few seconds! :-))
and it started writing without an end this row
A8282 H1234 2
into the table DISTANCES...
it created more than 387,000 records in a few seconds! :-))
I'm not sure how this compares time/memory wise to what ozo has, but an other option would be to have the Graph module calculate the distances. Calculating them all at once with the APSP_Floyd_Warshall method is to slow/time consuming on a large graph (based on your other question). But it should be easier to calculate one set of nodes at a time. I'm also not sure how the cache'ing is done. It might help to clear the cache every so often.
Replace lines 22-30 from your original script with this
Replace lines 22-30 from your original script with this
#Calculate shortest distances
foreach my $a ($graph->vertices) {
foreach my $b ($graph->vertices) {
next if $a eq $b; #skip from-to same node
my @nodes = $graph->SP_Dijkstra($a, $b);
next unless @nodes; #skip when $a to $b is not reachable
$sth_insert->execute($a, $b, $#nodes);
}
#clear cache, uncomment to enable clearing
#$graph->SPT_Dijkstra_clear_cache;
}
ASKER
adam:
i get "2Not an ARRAY reference at /usr/share/perl5/Heap/Elem .pm line 31."
and the lines it refers to in the Elem.pm file are below
i get "2Not an ARRAY reference at /usr/share/perl5/Heap/Elem
and the lines it refers to in the Elem.pm file are below
# get or set heap slot
sub heap {
@_ > 1 ? ($_[0][1] = $_[1]) : $_[0][1]; #THIS IS LINE 31
}
A8282 H1234 2
is correct because of
t1=
A8282 A7272 1
t2=
A7272 H1234 1
but the
not exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t2.INDIVIDUAL2 AND t3.distance <= t1.DISTANCE+t2.DISTANCE)
should have prevented it from adding it again because
A8282 H1234 2
would exist as t3
is correct because of
t1=
A8282 A7272 1
t2=
A7272 H1234 1
but the
not exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t2.INDIVIDUAL2 AND t3.distance <= t1.DISTANCE+t2.DISTANCE)
should have prevented it from adding it again because
A8282 H1234 2
would exist as t3
ASKER
ozo:
this is the query I have... and it still generates all the duplicates forever...
"SELECT t1.INDIVIDUAL1, t2.NODE, t1.DISTANCE+t2.DISTANCE from edges as t1, edges as t2 where t1.NODE = t2.INDIVIDUAL1 and not exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.NODE = t2.NODE AND t3.distance > t1.DISTANCE+t2.DISTANCE)"
this is the query I have... and it still generates all the duplicates forever...
"SELECT t1.INDIVIDUAL1, t2.NODE, t1.DISTANCE+t2.DISTANCE from edges as t1, edges as t2 where t1.NODE = t2.INDIVIDUAL1 and not exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.NODE = t2.NODE AND t3.distance > t1.DISTANCE+t2.DISTANCE)"
I think the > should be <=
and is NODE another name for INDIVIDUAL2?
and is NODE another name for INDIVIDUAL2?
ASKER
yes, node is the same as INDIVIDUAL2
I've changed the ">" to "<= "
but i still get a table full of rows like this....
A8282 H1234 2
I've changed the ">" to "<= "
but i still get a table full of rows like this....
A8282 H1234 2
If you don't want to continue trying with the Graph module, let me know... and I'll step back...
Which line from the code in post 21653805 is causing the error?
Which line from the code in post 21653805 is causing the error?
ASKER
adam, the graph module could be just fine, I'm trying this code
use DBI;
use Graph;
#Connect to database
my $dbh = DBI->connect("DBI:mysql:database=test;user=XXXXXX;password=XXXXX") or die "Could not connect: $DBI::errstr\n";
#Created undirected graph
my $graph = Graph->new(undirected=>1);
#Get data from database
my $sth = $dbh->prepare("SELECT INDIVIDUAL1, INDIVIDUAL2 FROM edges");
$sth->execute;
#create and edge according to db (a vertex is automatically created if it doesn't already exist)
while(my ($a, $b) = $sth->fetchrow_array) {
$graph->add_weighted_edge($a, $b, 1);
}
#Create statement to insert data
my $sth_insert = $dbh->prepare("INSERT INTO DISTANCES (INDIVIDUAL1, INDIVIDUAL2, DISTANCE) VALUES (?, ?, ?);");
#Calculate shortest distances
foreach my $a ($graph->vertices) {
foreach my $b ($graph->vertices) {
next if $a eq $b; #skip from-to same node
my @nodes = $graph->SP_Dijkstra($a, $b);
next unless @nodes; #skip when $a to $b is not reachable
$sth_insert->execute($a, $b, $#nodes);
}
#clear cache, uncomment to enable clearing
$graph->SPT_Dijkstra_clear_cache;
}
ASKER
adam:
it does not tell me on which line the error is... it just returns...
Not an ARRAY reference at /usr/share/perl5/Heap/Elem .pm line 31.
it does not tell me on which line the error is... it just returns...
Not an ARRAY reference at /usr/share/perl5/Heap/Elem
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
adam;
exactly the same error as before
Not an ARRAY reference at /usr/share/perl5/Heap/Elem .pm line 31.
exactly the same error as before
Not an ARRAY reference at /usr/share/perl5/Heap/Elem
ASKER
adam:
after a google search it could be a bug with Heap and debian based system... not sure about it...
after a google search it could be a bug with Heap and debian based system... not sure about it...
"SELECT t1.INDIVIDUAL1, t2.NODE, t1.DISTANCE+t2.DISTANCE from edges as t1, edges as t2 where t1.NODE = t2.INDIVIDUAL1 and not exists (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.NODE = t2.NODE AND t3.distance <= t1.DISTANCE+t2.DISTANCE)"
"insert into DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[0]', DISTANCE=$r->[2];"
you are selecting on edges but inserting into DISTANCES
that,s probably why the (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.NODE = t2.NODE AND t3.distance <= t1.DISTANCE+t2.DISTANCE)"
is not seeing the edge that was just inserted
"insert into DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[0]', DISTANCE=$r->[2];"
you are selecting on edges but inserting into DISTANCES
that,s probably why the (select * from edges as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.NODE = t2.NODE AND t3.distance <= t1.DISTANCE+t2.DISTANCE)"
is not seeing the edge that was just inserted
ASKER
adam:
I've installed Heap from cpan directly... I'll try your code right now
I've installed Heap from cpan directly... I'll try your code right now
ASKER
ozo:
so should I write directly into edges? I just didn't want to risk damaging the real data...
so should I write directly into edges? I just didn't want to risk damaging the real data...
Can there be two distances between a pair INDIVIDUALs or NODEs?
if not, you may want to make INDIVIDUAL1, INDIVIDUAL2 or INDIVIDUAL1 NODE unique
use DBD::mysql;
my $ldbh = DBI->connect("DBI:mysql:da tabase=tes t;user=XXX XX;passwor d=XXXXXXXX ") or die "Could not connect: $DBI::errstr\n";
#copy edge data to DISTANCES
for( $ldbh->selectall_arrayref( "SELECT INDIVIDUAL1, INDIVIDUAL2 from edges") ){
$ldbh->do("insert into DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[1]', DISTANCE=1;");
$ldbh->do("insert into DISTANCES set INDIVIDUAL1='$r->[1]', INDIVIDUAL2='$r->[0]', DISTANCE=1;");
}
#form transitive closure
my $count=1;
while( $count ){
my $dbh;
last unless $dbh = $ldbh->prepare("SELECT t1.INDIVIDUAL1, t2.INDIVIDUAL2, t1.DISTANCE+t2.DISTANCE from DISTANCES as t1, DISTANCES as t2 where t1.INDIVIDUAL2 = t2.INDIVIDUAL1 and not exists (select * from DISTANCES as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t2.INDIVIDUAL2 AND t3.distance <= t1.DISTANCE+t2.DISTANCE)") and $dbh->execute();
$count=0;
while( my $r = $dbh->fetchrow_arrayref() ){
$count += $ldbh->do("update DISTANCES set DISTANCE=$r->[2] where INDIVIDUAL1='$r->[0]' and INDIVIDUAL2='$r->[1]")
|| $ldbh->do("insert DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[1]', DISTANCE=$r->[2];");
}
}
if not, you may want to make INDIVIDUAL1, INDIVIDUAL2 or INDIVIDUAL1 NODE unique
use DBD::mysql;
my $ldbh = DBI->connect("DBI:mysql:da
#copy edge data to DISTANCES
for( $ldbh->selectall_arrayref(
$ldbh->do("insert into DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[1]', DISTANCE=1;");
$ldbh->do("insert into DISTANCES set INDIVIDUAL1='$r->[1]', INDIVIDUAL2='$r->[0]', DISTANCE=1;");
}
#form transitive closure
my $count=1;
while( $count ){
my $dbh;
last unless $dbh = $ldbh->prepare("SELECT t1.INDIVIDUAL1, t2.INDIVIDUAL2, t1.DISTANCE+t2.DISTANCE from DISTANCES as t1, DISTANCES as t2 where t1.INDIVIDUAL2 = t2.INDIVIDUAL1 and not exists (select * from DISTANCES as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t2.INDIVIDUAL2 AND t3.distance <= t1.DISTANCE+t2.DISTANCE)")
$count=0;
while( my $r = $dbh->fetchrow_arrayref() ){
$count += $ldbh->do("update DISTANCES set DISTANCE=$r->[2] where INDIVIDUAL1='$r->[0]' and INDIVIDUAL2='$r->[1]")
|| $ldbh->do("insert DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[1]', DISTANCE=$r->[2];");
}
}
ASKER
ozo:
running the above script now returns only two rows (always using the test table you suggested), where INDIVIDUAL1 and INDIVIDUAL2 are blank and DISTANCE is set to 1...
running the above script now returns only two rows (always using the test table you suggested), where INDIVIDUAL1 and INDIVIDUAL2 are blank and DISTANCE is set to 1...
ASKER
ozo:
actually it returns 5 blank rows...
actually it returns 5 blank rows...
Sorry I pasted together pieces from two programs without adjusting the variable names to match
copy edge data to DISTANCES should have been
for $r ( $ldbh->selectall_arrayref( "SELECT INDIVIDUAL1, INDIVIDUAL2 from edges") ){
$ldbh->do("insert into DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[1]', DISTANCE=1;");
$ldbh->do("insert into DISTANCES set INDIVIDUAL1='$r->[1]', INDIVIDUAL2='$r->[0]', DISTANCE=1;");
}
copy edge data to DISTANCES should have been
for $r ( $ldbh->selectall_arrayref(
$ldbh->do("insert into DISTANCES set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[1]', DISTANCE=1;");
$ldbh->do("insert into DISTANCES set INDIVIDUAL1='$r->[1]', INDIVIDUAL2='$r->[0]', DISTANCE=1;");
}
ASKER
ozo:
probably almost there...
it wrote two lines
INDIVIDUAL1 INDIVIDUAL2 DISTANCE
ARRAY(0x715 ARRAY(0x715 1
probably almost there...
it wrote two lines
INDIVIDUAL1 INDIVIDUAL2 DISTANCE
ARRAY(0x715 ARRAY(0x715 1
for $r ( @{$ldbh->selectall_arrayre f("SELECT INDIVIDUAL1, INDIVIDUAL2 from edges")} ){
#this may waste memory if the edge table is very large.
#this may waste memory if the edge table is very large.
ASKER
ozo:
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2828' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2828' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2828' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A8282' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A7272' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A7272' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A7272' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A8282' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A8282' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2222' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2222' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2222' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A8282' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''H1234' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''H1234' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A7272' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2828' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2222' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''H1234' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''H1234' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2828' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2222' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''H1234' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''H1234' at line 1 at bozo.pl line 21.
my line 21 is below
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2828' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2828' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2828' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A8282' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A7272' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A7272' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A7272' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A8282' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A8282' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2222' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2222' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2222' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A8282' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''H1234' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''H1234' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''A7272' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2828' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2222' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''H1234' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''H1234' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2828' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''B2222' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''H1234' at line 1 at bozo.pl line 21.
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''H1234' at line 1 at bozo.pl line 21.
my line 21 is below
$count += $ldbh->do("update DISTANCES2 set DISTANCE=$r->[2] where INDIVIDUAL1='$r->[0]' and INDIVIDUAL2='$r->[1]")
ASKER
ok there was a missing '
but now i get strange distances...
A8282 A8282 2
H1234 A7272 1
A7272 H1234 1
A8282 A8282 2
B2222 A8282 1
A8282 B2222 1
A7272 A7272 2
A7272 A8282 1
A8282 A7272 1
B2828 A8282 1
A8282 B2828 1
H1234 B2828 3
H1234 B2222 3
B2828 H1234 3
B2222 H1234 3
A8282 B2828 1
B2828 A8282 1
A8282 A7272 1
A7272 A8282 1
A8282 B2222 1
B2222 A8282 1
A7272 H1234 1
H1234 A7272 1
original table was
INDIVIDUAL1 INDIVIDUAL2 DISTANCE
A8282 B2828 1
A8282 A7272 1
A8282 B2222 1
A7272 H1234 1
but now i get strange distances...
A8282 A8282 2
H1234 A7272 1
A7272 H1234 1
A8282 A8282 2
B2222 A8282 1
A8282 B2222 1
A7272 A7272 2
A7272 A8282 1
A8282 A7272 1
B2828 A8282 1
A8282 B2828 1
H1234 B2828 3
H1234 B2222 3
B2828 H1234 3
B2222 H1234 3
A8282 B2828 1
B2828 A8282 1
A8282 A7272 1
A7272 A8282 1
A8282 B2222 1
B2222 A8282 1
A7272 H1234 1
H1234 A7272 1
original table was
INDIVIDUAL1 INDIVIDUAL2 DISTANCE
A8282 B2828 1
A8282 A7272 1
A8282 B2222 1
A7272 H1234 1
ASKER
the distances are actually right, just the node with itself is useless and seems to do redundant calculation... (I'll try with a unique index as you suggested)
Is there a missing '
"update DISTANCES set DISTANCE=$r->[2] where INDIVIDUAL1='$r->[0]' and INDIVIDUAL2='$r->[1]'"
it would probably be better to use ? and bind, we can tidy things up once its wotking
"update DISTANCES set DISTANCE=$r->[2] where INDIVIDUAL1='$r->[0]' and INDIVIDUAL2='$r->[1]'"
it would probably be better to use ? and bind, we can tidy things up once its wotking
is
A8282 A8282 2
one of the strange distances
it probably comes from e.g.
A8282 B2222 1
B2222 A8282 1
We could set all distances from a node to itself as 0
Or we could let that be understood and not bother entering it into the table when t1.INDIVIDUAL1 = t2.INDIVIDUAL2
depending on how you want your table to work
A8282 A8282 2
one of the strange distances
it probably comes from e.g.
A8282 B2222 1
B2222 A8282 1
We could set all distances from a node to itself as 0
Or we could let that be understood and not bother entering it into the table when t1.INDIVIDUAL1 = t2.INDIVIDUAL2
depending on how you want your table to work
ASKER
we can just ignore them!
I've added a unique index to avoid duplicates... but now it only records the first set of distances... (i.e. always 1)
ALTER TABLE `DISTANCES` ADD UNIQUE `COMPOSITE` ( `DISTANCE` , `INDIVIDUAL1` , `INDIVIDUAL2` )
A7272 A8282 1
A7272 H1234 1
A8282 A7272 1
A8282 B2222 1
A8282 B2828 1
B2222 A8282 1
B2828 A8282 1
H1234 A7272 1
I've added a unique index to avoid duplicates... but now it only records the first set of distances... (i.e. always 1)
ALTER TABLE `DISTANCES` ADD UNIQUE `COMPOSITE` ( `DISTANCE` , `INDIVIDUAL1` , `INDIVIDUAL2` )
A7272 A8282 1
A7272 H1234 1
A8282 A7272 1
A8282 B2222 1
A8282 B2828 1
B2222 A8282 1
B2828 A8282 1
H1234 A7272 1
I think it should be (`INDIVIDUAL1` , `INDIVIDUAL2` )
ASKER
ozo:
I don't know what's going on but now I only get 7 DISTANCE 1 ROWS
this is the code I'm using...
use DBD::mysql;
my $ldbh = DBI->connect("DBI:mysql:da tabase=tes t;user=XXX X;password =XXXXXXXX" ) or die "Could not connect: $DBI::errstr\n";
#copy edge data to DISTANCES2
for $r ( @{$ldbh->selectall_arrayre f("SELECT INDIVIDUAL1, INDIVIDUAL2 from edges2")} ){
#this may waste memory if the edge table is very large.
$ldbh->do("insert into DISTANCES2 set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[1]', DISTANCE=1;");
$ldbh->do("insert into DISTANCES2 set INDIVIDUAL1='$r->[1]', INDIVIDUAL2='$r->[0]', DISTANCE=1;");
}
#form transitive closure
my $count=1;
while( $count ){
my $dbh;
last unless $dbh = $ldbh->prepare("SELECT t1.INDIVIDUAL1, t2.INDIVIDUAL2, t1.DISTANCE+t2.DISTANCE from DISTANCES2 as t1, DISTANCES2 as t2 where t1.INDIVIDUAL2 = t2.INDIVIDUAL1 and not exists (select * from DISTANCES2 as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t2.INDIVIDUAL2 AND t3.DISTANCE <= t1.DISTANCE+t2.DISTANCE)") and $dbh->execute();
$count=0;
while( my $r = $dbh->fetchrow_arrayref() ){
$count += $ldbh->do("update DISTANCES2 set DISTANCE=$r->[2] where INDIVIDUAL1='$r->[0]' and INDIVIDUAL2='$r->[1]'")
|| $ldbh->do("insert DISTANCES2 set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[1]', DISTANCE=$r->[2];");
}
}
and this is the result
H1234 A7272 1
A7272 H1234 1
B2222 A8282 1
A8282 B2222 1
A7272 A8282 1
A8282 A7272 1
B2828 A8282 1
A8282 B2828 1
this is really tricky! :-)
I don't know what's going on but now I only get 7 DISTANCE 1 ROWS
this is the code I'm using...
use DBD::mysql;
my $ldbh = DBI->connect("DBI:mysql:da
#copy edge data to DISTANCES2
for $r ( @{$ldbh->selectall_arrayre
#this may waste memory if the edge table is very large.
$ldbh->do("insert into DISTANCES2 set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[1]', DISTANCE=1;");
$ldbh->do("insert into DISTANCES2 set INDIVIDUAL1='$r->[1]', INDIVIDUAL2='$r->[0]', DISTANCE=1;");
}
#form transitive closure
my $count=1;
while( $count ){
my $dbh;
last unless $dbh = $ldbh->prepare("SELECT t1.INDIVIDUAL1, t2.INDIVIDUAL2, t1.DISTANCE+t2.DISTANCE from DISTANCES2 as t1, DISTANCES2 as t2 where t1.INDIVIDUAL2 = t2.INDIVIDUAL1 and not exists (select * from DISTANCES2 as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t2.INDIVIDUAL2 AND t3.DISTANCE <= t1.DISTANCE+t2.DISTANCE)")
$count=0;
while( my $r = $dbh->fetchrow_arrayref() ){
$count += $ldbh->do("update DISTANCES2 set DISTANCE=$r->[2] where INDIVIDUAL1='$r->[0]' and INDIVIDUAL2='$r->[1]'")
|| $ldbh->do("insert DISTANCES2 set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[1]', DISTANCE=$r->[2];");
}
}
and this is the result
H1234 A7272 1
A7272 H1234 1
B2222 A8282 1
A8282 B2222 1
A7272 A8282 1
A8282 A7272 1
B2828 A8282 1
A8282 B2828 1
this is really tricky! :-)
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
To avoid distances from a node to itself, you can add
AND t1.INDIVIDUAL1 != t2.INDIVIDUAL2
to the where clause
AND t1.INDIVIDUAL1 != t2.INDIVIDUAL2
to the where clause
ASKER
using your debug code it worked :-)
A8282 B2828 1
B2828 A8282 1
A8282 A7272 1
A7272 A8282 1
A8282 B2222 1
B2222 A8282 1
A7272 H1234 1
H1234 A7272 1
B2828 B2828 2
A7272 B2828 2
B2222 B2828 2
A8282 A8282 2
B2828 A7272 2
A7272 A7272 2
B2222 A7272 2
H1234 A8282 2
B2828 B2222 2
A7272 B2222 2
B2222 B2222 2
A8282 H1234 2
H1234 H1234 2
H1234 B2828 3
H1234 B2222 3
B2828 H1234 3
B2222 H1234 3
I've tryed adding the code AND t1.INDIVIDUAL1 !=... but it started and infinite loop...
A8282 B2828 1
B2828 A8282 1
A8282 A7272 1
A7272 A8282 1
A8282 B2222 1
B2222 A8282 1
A7272 H1234 1
H1234 A7272 1
B2828 B2828 2
A7272 B2828 2
B2222 B2828 2
A8282 A8282 2
B2828 A7272 2
A7272 A7272 2
B2222 A7272 2
H1234 A8282 2
B2828 B2222 2
A7272 B2222 2
B2222 B2222 2
A8282 H1234 2
H1234 H1234 2
H1234 B2828 3
H1234 B2222 3
B2828 H1234 3
B2222 H1234 3
I've tryed adding the code AND t1.INDIVIDUAL1 !=... but it started and infinite loop...
use DBD::mysql;
my $ldbh = DBI->connect("DBI:mysql:database=test;user=XXXX;password=XXXXXX") or die "Could not connect: $DBI::errstr\n";
#copy edge data to DISTANCES2
for $r ( @{$ldbh->selectall_arrayref("SELECT INDIVIDUAL1, INDIVIDUAL2 from edges2")} ){
#this may waste memory if the edge table is very large.
$ldbh->do("insert into DISTANCES2 set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[1]', DISTANCE=1;");
$ldbh->do("insert into DISTANCES2 set INDIVIDUAL1='$r->[1]', INDIVIDUAL2='$r->[0]', DISTANCE=1;");
}
#form transitive closure
my $count=1;
while( $count ){
my $dbh;
last unless $dbh = $ldbh->prepare("SELECT t1.INDIVIDUAL1, t2.INDIVIDUAL2, t1.DISTANCE+t2.DISTANCE from DISTANCES2 as t1, DISTANCES2 as t2 where t1.INDIVIDUAL2 = t2.INDIVIDUAL1 and not exists (select * from DISTANCES2 as t3 where t3.INDIVIDUAL1 = t1.INDIVIDUAL1 and t3.INDIVIDUAL2 = t2.INDIVIDUAL2 AND t3.DISTANCE <= t1.DISTANCE+t2.DISTANCE)") and $dbh->execute();
$count=0;
while( my $r = $dbh->fetchrow_arrayref() ){
print "fetchrow @$r\n";
print "update: ",($d=$ldbh->do("update DISTANCES2 set DISTANCE=$r->[2] where INDIVIDUAL1='$r->[0]' and INDIVIDUAL2='$r->[1]'")),"\n";
if( $d <= 0 ){
print "insert",($d=$ldbh->do("insert into DISTANCES2 set INDIVIDUAL1='$r->[0]', INDIVIDUAL2='$r->[1]', DISTANCE=$r->[2];")),"\n";
}
$count += $d;
}
}
ASKER
I've tried on a bigger table, it only returns some of the distances and all between 1 and 2...
:-(
:-(
How did you install the Graph module? I just tried it on a ubuntu system, which is debian based, and it worked fine.
I installed the Graph module:
as root: perl -MCPAN -e 'install Graph'
I installed the Graph module:
as root: perl -MCPAN -e 'install Graph'
ASKER
adam:
it works now, the procedure is running...
do you think it should be faster with this enabled or not?
it works now, the procedure is running...
do you think it should be faster with this enabled or not?
$graph->SPT_Dijkstra_clear_cache;
I'm not sure. I don't know how the Graph module works internally, or what is cached. I could test it on my computer, but I think it'll depend more on your computer, how much memory, and the size of the graph.
If you wanted, you could have it print out how long it took after every node. If you notice it getting slower as it goes on, then I think that clearing the cache will help. If it stays about the same speed, then I'm guessing that clearing the cache would not help.
I also don't know how it will perform relative to the solution ozo was working on.
If you wanted, you could have it print out how long it took after every node. If you notice it getting slower as it goes on, then I think that clearing the cache will help. If it stays about the same speed, then I'm guessing that clearing the cache would not help.
I also don't know how it will perform relative to the solution ozo was working on.
ASKER
I'm running it right now... how could I add the debug time information? thanks!
Ozo solution could be fast, but does not work at the moment.
Thanks to both of you anyways!
Ozo solution could be fast, but does not work at the moment.
Thanks to both of you anyways!
Change the 100 in
if($count==100)
to the number of nodes you want to process before checking the time
if($count==100)
to the number of nodes you want to process before checking the time
use Time::HiRes 'time';
#Calculate shortest distances
my $count=0;
my $time_a=time;
foreach my $a ($graph->vertices) {
foreach my $b ($graph->vertices) {
next if $a eq $b; #skip from-to same node
my @nodes = $graph->SP_Dijkstra($a, $b);
$count++;
if($count==100){
$count=0;
my $time_b=time;
printf "Elapsed: %f\n", $time_b-$time_a;
$time_a = $time_b;
}
next unless @nodes; #skip when $a to $b is not reachable
printf "Path %2s - %2s: %d\n", $a, $b, $#nodes;
}
#clear cache, uncomment to enable clearing
$graph->SPT_Dijkstra_clear_cache;
}
were the distances and all between 1 and 2 correct?
If there was a distance 3 missing, what was the path that should have produced a distance of 3?
If there was a distance 3 missing, what was the path that should have produced a distance of 3?
ASKER
I couldn't check it directly, since I've started with a 6000 rows table. But distances should be between 1 and 9 at least for sure :-(
ASKER
thanks for all your time on this question!