Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
?
Solved

Synchronize duplicate fields in two different MYSQL databases using a PHP script

Posted on 2008-11-11
16
Medium Priority
?
1,018 Views
Last Modified: 2013-12-13
I need to synchronize duplicate fields in two different MYSQL databases using a PHP script

The script needs to login to both databases using separate credentials, and then the query needs to compare the fields in DB1 to those in DB2.
Two options need exist, with separate submit buttons :
1) update DB2 to match the associated fields in DB1
2) ADD any new records to DB2, that exist in DB1

DB1

TABLE_d1_a
Field_d1_a1 ID PK (auto increment)
Field_d1_a2
Field_d1_a3
Field_d1_a4

TABLE_d1_b
Field_d1_b1
Field_d1_b2 ID FK
Field_d1_b3
Field_d1_b4

TABLE_d1_c
Field_d1_c1
Field_d1_c2 ID FK
Field_d1_c3
Field_d1_c4

TABLE_d1_d
Field_d1_d1
Field_d1_d2 ID FK
Field_d1_d3
Field_d1_d4

DB2

TABLE_d2_a
Field_d2_a1
Field_d2_a2 ID FK
Field_d2_a3

TABLE_d2_b
Field_d2_b1
Field_d2_b2 ID FK
Field_d2_b3
Field_d2_b4

TABLE_d2_d
Field_d2_d1 ID FK
Field_d2_d3
Field_d2_d4
0
Comment
Question by:ggjones
  • 8
  • 7
16 Comments
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 22936179
I see there is a primary key (PK) on table TABLE_d1_a. Are there no primary keys or unique indexes on any of the other tables? If not, how to differentiate between an update and a new record?

Which database is local, and which is remote? Or should this run from a third server, so that both databases are remote? Or is it two local databases?

How many rows (approximately) are there in each table on the remote server(s)?

What are the datatypes for each of the fields? A real database schema would be usefull...

I assume Field_d1_a1 is mapped to Field_d2_a1, Field_d1_a2 is mapped to Field_d2_a2 and so on, is this correct?

How is the table TABLE_d1_c involved?

How are the fields Field_d1_a4 and Field_d1_d2 involved?
0
 

Author Comment

by:ggjones
ID: 22936280
... thanks for responding. To address your queries;

I see there is a primary key (PK) on table TABLE_d1_a. Are there no primary keys or unique indexes on any of the other tables? If not, how to differentiate between an update and a new record?

>> my apologies. Assume the first field of each table is a PK (auto increment)

Which database is local, and which is remote? Or should this run from a third server, so that both databases are remote? Or is it two local databases?

>> both DBs are remote

How many rows (approximately) are there in each table on the remote server(s)?

>> assume less than 10k

What are the datatypes for each of the fields? A real database schema would be usefull...

>> assume varchar

I assume Field_d1_a1 is mapped to Field_d2_a1, Field_d1_a2 is mapped to Field_d2_a2 and so on, is this correct?

>> yes, correct

How is the table TABLE_d1_c involved?

>> for illustrative purposes. i.e. some tables in DB1 do NOT exist in DB2

How are the fields Field_d1_a4 and Field_d1_d2 involved?

>> for illustrative purposes. i.e. some fields in DB1 tables do NOT exist in DB2 tables

regards,

G.J.
0
 
LVL 39

Accepted Solution

by:
Roger Baklund earned 1600 total points
ID: 22936849
Ok, try the code below. Backup your tables first, this code is poorly tested!

Note that all rows from each table is loaded at once, one table at the time. This requires a lot of RAM, but from the numbers you have provided, it should not be a problem (3*255*10K < 8M).

It will take a while to run it, depending on your bandwidth and load on all three servers involved.
<?php
// enable PHP error reporting
ini_set('display_errors', 1);
error_reporting(E_ALL);
 
if(isset($_POST['sync'])) {
  set_time_limit(0); # no time limit
  echo '<html><head><title>db sync</title></head><body>
<h1>db sync</h1>';
  // connect
  $db1 = mysql_connect('host1','user1','password1') 
    or die(mysql_error());
  if(!mysql_select_db('dbname1',$db1)) 
    die('Could not select database 1');
  $db2 = mysql_connect('host2','user2','password2')
    or die(mysql_error());
  if(!mysql_select_db('dbname2',$db2))
    die('Could not select database 2');
 
  // db schema
  $schema1 = array(
    array('table'=>'TABLE_d1_a','key'=>'Field_d1_a1',
          'fields'=>array('Field_d1_a2','Field_d1_a3')),
    array('table'=>'TABLE_d1_b','key'=>'Field_d1_b1',
          'fields'=>array('Field_d1_b2','Field_d1_b3','Field_d1_b4')),
    array('table'=>'TABLE_d1_d','key'=>'Field_d1_d1',
          'fields'=>array('Field_d1_d3','Field_d1_d4')),
  );
  $schema2 = array(
     array('table'=>'TABLE_d2_a','key'=>'Field_d2_a1',
           'fields'=>array('Field_d2_a2','Field_d2_a3')),
     array('table'=>'TABLE_d2_b','key'=>'Field_d2_b1',
           'fields'=>array('Field_d2_b2','Field_d2_b3','Field_d2_b4')),
     array('table'=>'TABLE_d2_d','key'=>'Field_d2_d1',
           'fields'=>array('Field_d2_d3','Field_d2_d4')),
  );
 
  // sync
  if(isset($_POST['add'])) {
    for($i=0;$i<count($schema1);$i++) {      
      echo 'Fetching rows from '.$schema1[$i]['table'].'<br />';
      $res = mysql_query('select * from '.$schema1[$i]['table'].' order by '.$schema1[$i]['key'],$db1);
      echo mysql_num_rows($res).' rows fetched<br />';
      $affected = 0;
      while($row = mysql_fetch_assoc($res)) {
        $data = array();
        foreach($schema1[$i]['fields'] as $field) 
          $data[] = mysql_real_escape_string($row[$field]);
        $query = 'insert ignore into '.$schema2[$i]['table'].' ('.$schema2[$i]['key'].','.
          implode(',',$schema2[$i]['fields']).') values ('.$row[$schema1[$i]['key']].',"'.
          implode('","',$data).'")';
        $res2 = mysql_query($query,$db2);
        if(!$res2) die('<br />Insert query failed!<br />Query:'.htmlentities($query).'<br />Error:'.mysql_error());
        $affected+=mysql_affected_rows($db2);
      }
      mysql_free_result($res);
      echo 'Inserted rows: '.$affected.'<br />';
    }
  } elseif(isset($_POST['update'])) {
    for($i=0;$i<count($schema1);$i++) {      
      echo 'Fetching rows from '.$schema1[$i]['table'].'<br />';
      $res = mysql_query('select * from '.$schema1[$i]['table'].' order by '.$schema1[$i]['key'],$db1);
      echo mysql_num_rows($res).' rows fetched<br />';
      $affected = 0;
      while($row = mysql_fetch_assoc($res)) {
        $data = array();
        for($j=0;$j<count($schema2[$i]['fields']);$j++) 
          $data[] = $schema2[$i]['fields'][$j].'="'.mysql_real_escape_string($row[$schema1[$i]['fields'][$j]]).'"';
        $query = 'update '.$schema2[$i]['table'].' set '.
          implode(',',$data).' where '.$schema2[$i]['key'].'='.$row[$schema1[$i]['key']];
        $res2 = mysql_query($query,$db2);
        if(!$res2) die('Update query failed!<br />Query:'.htmlentities($query).'<br />Error:'.mysql_error());
        $affected+=mysql_affected_rows($db2);
      }
      mysql_free_result($res);
      echo 'Updated rows: '.$affected.'<br />';
    }    
  }
  echo '<a href="'.$_SERVER['SCRIPT_NAME'].'">back</a>';
  echo '</body></html>';
} else { 
?><html><head><title>db sync</title></head><body>
<h1>db sync</h1>
<form method="post" action="<?php echo $_SERVER['SCRIPT_NAME']?>">
<input type="hidden" name="sync" value="1" />
<input type="submit" name="update" value="update DB2 to match the associated fields in DB1"/>
<input type="submit" name="add" value="ADD any new records to DB2, that exist in DB1"/>
</form></body></html>
<?php } ?>

Open in new window

0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 

Author Comment

by:ggjones
ID: 22938894
.. this truly is a beautiful thing.... thanks for putting your head around it.

I see what you mean about resource use. But if one is to compare all fields in all Tables, are there any alternative methods?

While the high-level logic is clear to me, In terms of testing/de-bugging, the devil will certainlly be in the details. I would expect to have a system built sufficiant to test this soution over the next ten days or so, whereupon, it will either work seamlessly... or it won't, and I'll need to report back error messsages.

regards,

G.J.
0
 

Author Comment

by:ggjones
ID: 23070375
This works brilliantly ( albeit on a small 300k test DB). - thank you very much cxr

I'm reminded however, that I need a third mode of update - the ability to sync records by specific Business.

For example, the attached arrays describe the specific fields in question.
The BUSINESS table PK is 'business_id'. Most other tables include 'business_id_fk'.

If I select a specific Business from a drop-down menu, is there a way to then sync the 2 DBs only for records with that ID, across all tables?

regards, GJ
  // db schema
  $schema1 = array(
    array('table'=>'business','key'=>'business_id',
          'fields'=>array('business_date_modified','business_name','business_nickname','business_city','business_category','business_keyphrase_1','business_keyphrase_2','business_url','business_show')),
    array('table'=>'branding','key'=>'branding_id',
          'fields'=>array('branding_date_modified','branding_title_suffix','branding_description_prefix','branding_description_suffix')),
    array('table'=>'content','key'=>'content_id',
          'fields'=>array('content_date_modified','business_id_fk','content_title','content_description','content_keywords','content_script')),
    array('table'=>'keyphrases','key'=>'keyphrase_id',
          'fields'=>array('business_id_fk','keyphrase_words')),
    array('table'=>'search','key'=>'search_id',
          'fields'=>array('search_date_modified','business_id_fk','search_title','search_show','search_entry')),
    array('table'=>'location','key'=>'location_id',
          'fields'=>array('location_date_modified','business_id_fk','location_name','location_unit_show','location_unit','location_street_address','location_po_box_show','location_po_box','location_city','location_prov','location_postal_code','location_phone_1_show','location_phone_1','location_website_link','location_website_url','location_map_show','location_map_latitude','location_map_longitude')),
  );

Open in new window

0
 

Author Closing Comment

by:ggjones
ID: 31515716
Thanks cxr ... this works well. I have a follow-up question posted about a possible refinement - would you mind reviewing it?  regards, GJ
0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 23087160
Yes, that is possible. Add a dropdown to select the business, and append a WHERE clause to the SELECT statements if a business was selected and the table specific fields array contains the foreign key (only the branding table does not).
0
 

Author Comment

by:ggjones
ID: 23087573
thanks for responding cxr...

so if I understand you then, this is the type of thing:

$res = mysql_query('select * from '.$schema1[$i]['table'].' order by '.$schema1[$i]['key'] WHERE (business_id_fk = '. $myVar .' OR business_id = '. $myVar) ,$db1);

But.. will this create an error for tables that have only one or none of business_id_fk OR business_id ?
0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 23087728
Yes, you must test if there is a fk column in that table, and the WHERE clause must be before the ORDER BY clause:  SELECT ... FROM ... WHERE ... ORDER BY ...

If the posted business_id is stored as $business_id, something like this (not tested):
$res = mysql_query('select * from '.$schema1[$i]['table'].
  (($business_id and in_array('business_id_fk',$schema1[$i]['fields']))?
  ' where business_id_fk='.$business_id:'').
  ' order by '.$schema1[$i]['key'],$db1);

Open in new window

0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 23087813
hm... that only works for the tables with business_id_fk, not for the main business table. Maybe it is better to split this statement into parts (still not tested):
$sql = 'select * from '.$schema1[$i]['table'];
if($business_id > 0) {
  if(in_array('business_id_fk',$schema1[$i]['fields'])) 
    $sql.=' where business_id_fk='.$business_id;
  else if(in_array('business_id',$schema1[$i]['fields'])) 
    $sql.=' where business_id='.$business_id;
}
$res = mysql_query($sql.' order by '.$schema1[$i]['key'],$db1);  

Open in new window

0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 23093740
A small modification, this should work better:
$sql = 'select * from '.$schema1[$i]['table'];
if($business_id > 0) {
  if(in_array('business_id_fk',$schema1[$i]['fields'])) 
    $sql.=' where business_id_fk='.$business_id;
  else if($schema1[$i]['key']=='business_id')) 
    $sql.=' where business_id='.$business_id;
}
$res = mysql_query($sql.' order by '.$schema1[$i]['key'],$db1);  

Open in new window

0
 

Author Comment

by:ggjones
ID: 23094937
Thank you cxr, this seems to work fine, after brief testing (with a small DB).

I do have one question though - notice that with both UPDATE and ADD, the "branding" table is still being "fetched from", despite there being no business_id or business_id_fk field in that table.

Is there an efficiency to be be made , to exclude the "branding" table when querying is restricted to business_id?
UPDATE by business_ID :
 
Fetching records from branding
8 records fetched
Existing Records Updated: 0
Fetching records from business
1 records fetched
Existing Records Updated: 1
Fetching records from content
1 records fetched
Existing Records Updated: 0
Fetching records from keyphrases
1 records fetched
Existing Records Updated: 0
Fetching records from search
1 records fetched
Existing Records Updated: 0
Fetching records from location
1 records fetched
Existing Records Updated: 0
 
 
ADD by business_ID :
 
Fetching records from branding
8 records fetched
New Records Added: 0
Fetching records from business
1 records fetched
New Records Added: 0
Fetching records from content
1 records fetched
New Records Added: 0
Fetching records from keyphrases
2 records fetched
New Records Added: 1
Fetching records from search
1 records fetched
New Records Added: 0
Fetching records from location
1 records fetched
New Records Added: 0

Open in new window

0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 23095218
The easiest solution is to use an "untrue" expression, so that no records are fetched:
$sql = 'select * from '.$schema1[$i]['table'];
if($business_id > 0) {
  if(in_array('business_id_fk',$schema1[$i]['fields'])) 
    $sql.=' where business_id_fk='.$business_id;
  else if($schema1[$i]['key']=='business_id')) 
    $sql.=' where business_id='.$business_id;
  else 
    $sql.=' where 1=2';  # allways false
}
$res = mysql_query($sql.' order by '.$schema1[$i]['key'],$db1);  

Open in new window

0
 

Author Comment

by:ggjones
ID: 23095786

yes, of course!
The thing works fabulously now... again, thank you very much cxr.

regards,

GJ.
0
 
LVL 36

Expert Comment

by:gr8gonzo
ID: 26023426
Just a quick note - if you're looking to synchronize an entire database or selection of databases, then using dual-master replication is a far more efficient and easy way to go, and it's built into MySQL. Just google around for "dual-master replication mysql" and there are several guides to setting it up.
0
 

Author Comment

by:ggjones
ID: 26023470
thanks for the reply gr8gonzo..

Its not quite the entire DB, rather a subset of fields from a subset of tables (in order to reduce overhead)

regards,

GJ
0

Featured Post

What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

3 proven steps to speed up Magento powered sites. The article focus is on optimizing time to first byte (TTFB), full page caching and configuring server for optimal performance.
Sometimes MS breaks things just for fun... In Access 2003, only the maximum allowable SQL string length could cause problems as you built a recordset. Now, when using string data in a WHERE clause, the 'identifier' maximum is 128 characters. So, …
The viewer will learn how to dynamically set the form action using jQuery.
With just a little bit of  SQL and VBA, many doors open to cool things like synchronize a list box to display data relevant to other information on a form.  If you have never written code or looked at an SQL statement before, no problem! ...  give i…
Suggested Courses

581 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question