[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

Database replication questions

Posted on 2009-12-19
6
Medium Priority
?
432 Views
Last Modified: 2013-12-09
Hi guys, right now I am reading the documentation of some FB replication engines. The type of the replications is two way with logs. There are some things I don't fully understand.
1. It is sertain, that the PKs created by each slave must be unique. For example the first will create PKs starting from "1", the second starts with "1 000 000 000" etc. Wht I don't understand is that some replication eingins like FiBRE, say you need to have unique PK for all the tables. For exmple on table "customers" you create a record with pk = 1 and then on the table "clients" the first row must be pk = 2. Is this really necessary? I think its not. Tell me pls.

2. The second thing that can't get out of my head is the so called "Update Conflict". For examle the slave 1 creates a record with values 1 2 3, and replicates them to the server. Slave 2 synchronises its DB and changes the field 2 to 5 and replicates. then the first slave changes the value and replicates. And BOOM, here comes the problem, cos if slave 1 has query of "update client set name = 'Peter' where name = 'Not Peter'" but the second slave has changed the value of "Not Peter" to "John". When the first one replicates the DB this update statement will change nothing. Pls give me a solution to this problem.
0
Comment
Question by:IncognitoMan
  • 3
  • 3
6 Comments
 
LVL 43

Expert Comment

by:pcelba
ID: 26087212
1. This is really not necessary but I know one application which requires it... It uses PK from all tables as PK in common DeleteLog. Easy to fix but it still works this way. On the other hand sometimes the PKs are generated by one global function which does not recognize table names. And they are unique across the whole app.

2. Update conflict is obvious in systems which are synchronizing data in both directions. Simple rule says: The last is the winner.  The replication does not look at old data obviously and it does not send SQL commands. It sends and updates the whole row based on PK and timestamp. Of course, if you use replication system based on SQL commands replication then it will work as you described. Such system is dangerous because it can easily cause data integrity problems.

0
 

Author Comment

by:IncognitoMan
ID: 26092797
Thank you pcelba,
so you tell me "the last is the winner". This sounds like a good solution.
Let's see if I get it right.
First you create a table SYSLOG{pk, PCnumber, time, query, pk_of_query, data of BLOB type}, and another table SYSTRANS in which we will have the records that have been repicated.
After that you create trigers for all the tables that will insert data in SYSLOG table.
Then you create a program, that will take from the server all the records, that have been inserted deleted or inserted by a different SYSLOG.PCnumber than you and doesnt exist in your SYSTRANS table and send them to the server, then put their PKs into SYSTRANS table.
On the server there will be also have those two tables, but the SYSTRANS table will have the ID of the computer and the pk of the record that have been replicated, so that the server can rememmber which record to which computer was sent. Then insert a record into the server's SYSLOG and SYSTRANS and exequte the query.

But here comes the funny part.
If the operation is Update when the client 1 takes the records from the server and sees they've been updated by another client 2, but the record has been created by client 1. If it has, it will not take the record because in the server there will be the record of client 1 but, client 1 will have the record of client 2 :) which is stupid. The same thing need to be done with the delete statement, but maybe not cos you will just try to delete an unexisting record.
0
 
LVL 43

Accepted Solution

by:
pcelba earned 2000 total points
ID: 26092897
Your scenario is possible and it will always cause similar questions...

You may solve it e.g. by timestamps. If you refuse updates having older timestamps than existing records then all records will synchronize. Time must be synchronized on all computers which is not problem today.

Another solution is based on timing of your update operations. If you decide "Process incomming data first then create outgoing batch" all data will be synchronized but not all users which are simultaneously updating same records on different client sites will see their changes later because they will be updated (synchronized) by contents incomming from the server.

The problem can be slightly reduced if you don't synchronize whole records but single columns only. It, of course, means additional coding.

Deletion is not a problem on the first view only. You should reduce records deleting and replace it by records expiration (means new column containing ExpirationDate). In such case you are sure you don't delete records containing foreign keys used "somewhere else".

Each replication is somehow specific and you'll see many additional problems with the first data.
0
NEW Veeam Backup for Microsoft Office 365 1.5

With Office 365, it’s your data and your responsibility to protect it. NEW Veeam Backup for Microsoft Office 365 eliminates the risk of losing access to your Office 365 data.

 

Author Comment

by:IncognitoMan
ID: 26100536
pcelba,
Pls explain me this "Process incomming data first then create outgoing batch".
Well seems this timestamp method is the best. Cos in my solution the "winner" will be not the last one updated the record,, but the last one tha made a replication.
If you have more solutions pls share them with me. :)
If you know more problems I will face durring DB replications tell me about them.
0
 

Author Closing Comment

by:IncognitoMan
ID: 31668070
Interesting ideas for people that understand things and don't like the existing solutions. Thanks pcelba.
0
 
LVL 43

Expert Comment

by:pcelba
ID: 26101033
"Process incomming data first then create outgoing batch" - this is all about batch synchronization. If you decide to process sync data each evening on the Server then create outgoing data for clients and they will process them morning before the work starts everything should by OK. This loop can be done with shorter period, of course. Depends on the network topology and possibilities. Sometimes there are more client levels or clients are off-line during the day etc.

We have implemented another layer to data synchronization - data filtering. Each client has rights to see just a subset of server data based on various conditions and data properties. E.g. Customers are split among different territories and each client can see just a few of them. This will reduce the risk of data loss (or fraud) and it also reduces the amount of transferred data. The code behind depends on implementation.

Additional option should be "Sync ALL data". It is usefull when new client is installed or connected to the network.
0

Featured Post

Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this article, we’ll look at how to deploy ProxySQL.
One of the most important things in an application is the query performance. This article intends to give you good tips to improve the performance of your queries.
In this video, Percona Solution Engineer Rick Golba discuss how (and why) you implement high availability in a database environment. To discuss how Percona Consulting can help with your design and architecture needs for your database and infrastr…
This lesson discusses how to use a Mainform + Subforms in Microsoft Access to find and enter data for payments on orders. The sample data comes from a custom shop that builds and sells movable storage structures that are delivered to your property. …
Suggested Courses

873 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question