Distributed Database problem

Arnabtech asked
Last Modified: 2012-05-11
I have 10 machines where a DBMS (PostgreSQL) is installed. Agent programs populate those databases. I want to build some reports and display them on the web.

Agents insert/update 1000 records/sec in each database. How can I use a centralized database to which all the agent databases send their data? Is there an automatic mechanism to do that? I need to synchronize them every minute.

Does any other database support this?

CERTIFIED EXPERT
Distinguished Expert 2019

Commented:
MySQL is one. But monitoring must be set up to detect a conflict as early as possible.



Node A is a master to node B as well as being a slave to node D.
Similarly, node C is a master to node D while being a slave to node B.
http://dev.mysql.com/tech-resources/articles/advanced-mysql-replication.html
http://blog.andrewrose.co.uk/2008/01/mysql-recovering-failed-circular.html


The MySQL example deals with a two-node master/master setup. This can be extended to 10 nodes. Note the offset and increment, which make sure that an ID assigned to a new record on one node can never overlap with an ID assigned to a record added on another node. One node will have
1, 11, 21, etc.
while the others will have
2, 12, 22, etc.
...
9, 19, 29, etc.
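
To make the offset/increment idea concrete, here is a minimal Python sketch that only simulates the numbering; it does not talk to any database. It assumes an increment equal to the node count (10) and one offset per node, which is what the sequences above imply. In MySQL the corresponding settings are the `auto_increment_increment` and `auto_increment_offset` system variables; the names `node_ids`, `NODES`, and `first_ids` below are made up for illustration.

```python
from itertools import islice

def node_ids(offset, increment):
    """Yield the IDs one node would assign: offset, offset + increment, ..."""
    next_id = offset
    while True:
        yield next_id
        next_id += increment

NODES = 10  # assumption: increment equals the number of nodes

# First three IDs each node would hand out, keyed by the node's offset.
first_ids = {offset: list(islice(node_ids(offset, NODES), 3))
             for offset in range(1, NODES + 1)}

for offset, ids in first_ids.items():
    print(f"node {offset}: {ids}")

# Sanity check: no ID shows up on more than one node.
all_ids = [i for ids in first_ids.values() for i in ids]
assert len(all_ids) == len(set(all_ids))
```

Because every node starts at a different offset and steps by the same increment, the sequences never meet, which is why rows from different nodes can share one table without key conflicts.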

See whether the same kind of circular replication setup is also an option for PostgreSQL:
http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling
http://www.postgresql.org/docs/9.0/static/high-availability.html

Author

Commented:
Are you talking about load balancing?

In my case that's not the problem. I have 10 database servers. I want to search for a record across all the database servers and also prepare some reports based on the data from all the servers.

Now, I can write a program that collects data from the servers and populates a central DB server, so that I can use the central DB server for my reporting. I want this job to be automatic so that I don't have to worry about synchronization or polling the databases.
CERTIFIED EXPERT
Distinguished Expert 2019

Commented:
The setup will provide load balancing while replicating data.

Do all the servers have the same database/structure/schema?

A Perl/C/C++ program can be coded to run queries against each database server and either build the results in yet another database server or ....
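
A rough sketch of such a collector is shown below, in Python rather than Perl/C/C++ and with in-memory SQLite databases standing in for the PostgreSQL servers so that it runs on its own. Everything in it is an assumption for illustration: the `readings` table, its columns, the `watermark` table, and the function names. The idea is simply that the collector remembers the last `id` it copied from each agent and pulls only newer rows on each run; a real job would be scheduled once a minute and would connect to the actual servers.

```python
import sqlite3

def make_agent(readings):
    """Create an in-memory stand-in for one agent database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, sensor TEXT, value REAL)")
    db.executemany("INSERT INTO readings (sensor, value) VALUES (?, ?)", readings)
    db.commit()
    return db

def make_central():
    """Central store: rows are keyed by (source, id); watermark tracks progress."""
    db = sqlite3.connect(":memory:")
    db.execute("""CREATE TABLE readings (
                      source TEXT, id INTEGER, sensor TEXT, value REAL,
                      PRIMARY KEY (source, id))""")
    db.execute("CREATE TABLE watermark (source TEXT PRIMARY KEY, last_id INTEGER)")
    return db

def collect(central, agents):
    """Copy rows newer than each agent's watermark; return the number copied."""
    copied = 0
    for source, agent in agents.items():
        row = central.execute(
            "SELECT last_id FROM watermark WHERE source = ?", (source,)).fetchone()
        last_id = row[0] if row else 0
        rows = agent.execute(
            "SELECT id, sensor, value FROM readings WHERE id > ? ORDER BY id",
            (last_id,)).fetchall()
        if not rows:
            continue
        central.executemany(
            "INSERT INTO readings (source, id, sensor, value) VALUES (?, ?, ?, ?)",
            [(source, *r) for r in rows])
        # Remember the highest id copied from this agent.
        central.execute(
            "INSERT OR REPLACE INTO watermark (source, last_id) VALUES (?, ?)",
            (source, rows[-1][0]))
        copied += len(rows)
    central.commit()
    return copied

agents = {
    "agent1": make_agent([("s1", 20.5), ("s2", 21.0)]),
    "agent2": make_agent([("s3", 19.8)]),
}
central = make_central()
first_run = collect(central, agents)

# A new reading arrives on agent1; the next run copies only that row.
agents["agent1"].execute("INSERT INTO readings (sensor, value) VALUES ('s1', 20.7)")
agents["agent1"].commit()
second_run = collect(central, agents)

total = central.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(first_run, second_run, total)
```

Note that this sketch only picks up inserts. Rows that are updated in place on an agent would need something extra, such as a last-modified column, which is not covered here.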

What is the relationship among the database servers/databases?

Author

Commented:
All the agent database servers have the same schema.

Querying each database server and then populating the central server is the last thing I want to do. That's why I am looking for a better solution.

All the agent database servers get sensor data, each from different sensors. Normally each database server gets data from 1000 sensors. As a result, a minimum of 1000 records per second get inserted in each database, 10 * 1000 in total.

I need to get all of them into a central repository so that I can keep the agent databases free for more data. I can also then query a single server.
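
For a sense of scale, the figures from the question (10 servers, 1000 records/sec each, a sync every minute) can be multiplied out as follows. This is only back-of-the-envelope arithmetic; the constant names are made up for illustration and the rates are assumed to be steady.

```python
AGENTS = 10                       # database servers
RECORDS_PER_SEC_PER_AGENT = 1000  # insert/update rate stated in the question
SYNC_INTERVAL_SEC = 60            # "synchronize them every minute"

total_per_sec = AGENTS * RECORDS_PER_SEC_PER_AGENT
per_agent_per_sync = RECORDS_PER_SEC_PER_AGENT * SYNC_INTERVAL_SEC
total_per_sync = total_per_sec * SYNC_INTERVAL_SEC
total_per_day = total_per_sec * 86_400  # seconds in a day

print(total_per_sec, per_agent_per_sync, total_per_sync, total_per_day)
```

Under those assumptions the central server has to absorb 600,000 records per one-minute sync, or 864 million a day.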
CERTIFIED EXPERT
Distinguished Expert 2019

Commented:
I mean circular replication, as referenced in the links above, with MySQL as the database.
Data gets added on one server and is then retrieved by the subordinate servers, so that all the database servers end up with databases holding the same information.
There is no need for an additional server that "aggregates/collects" from each server individually.

I do not know whether you can configure one server (the collector) to be a subordinate/slave of all the others.
The problem you face is that they all have the same database name, and precautions have to be taken to make sure that the increment and offset are set so that there is no possibility, not even a remote one, of an overlap in the ID column for any record in any table.

An old article (2005) mentioned that multi-master might be available in 5.1. The discussion/exchange on this issue is at http://lists.mysql.com/replication/706
If the setup is the default, that means you have entries in each database/table with overlapping primary keys.
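
The overlap problem is easy to reproduce. The sketch below uses in-memory SQLite databases as stand-ins for two agents with default ID numbering and shows that merging them into a table keyed only by `id` fails. It then shows one possible workaround that is different from the offset/increment approach: keying the central table by the pair (source, id). Table names, column names, and the source labels 'a' and 'b' are assumptions for illustration.

```python
import sqlite3

def agent_rows(values):
    """Return (id, value) rows as an agent with default auto-numbering would store them."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
    db.executemany("INSERT INTO readings (value) VALUES (?)", [(v,) for v in values])
    return db.execute("SELECT id, value FROM readings ORDER BY id").fetchall()

rows_a = agent_rows([1.0, 2.0])  # ids 1 and 2
rows_b = agent_rows([3.0, 4.0])  # also ids 1 and 2

central = sqlite3.connect(":memory:")

# Naive merge: the agents' own id is the only key, so the second agent collides.
central.execute("CREATE TABLE naive (id INTEGER PRIMARY KEY, value REAL)")
central.executemany("INSERT INTO naive VALUES (?, ?)", rows_a)
try:
    central.executemany("INSERT INTO naive VALUES (?, ?)", rows_b)
    collided = False
except sqlite3.IntegrityError:
    collided = True

# Workaround (not the offset/increment method): key each row by (source, id).
central.execute(
    "CREATE TABLE merged (source TEXT, id INTEGER, value REAL, PRIMARY KEY (source, id))")
central.executemany("INSERT INTO merged VALUES ('a', ?, ?)", rows_a)
central.executemany("INSERT INTO merged VALUES ('b', ?, ?)", rows_b)
merged_count = central.execute("SELECT COUNT(*) FROM merged").fetchone()[0]

print(collided, merged_count)
```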

Author

Commented:
I need to study the Data Partitioning area.