

Distributed Database problem

Posted on 2011-04-29
Medium Priority
Last Modified: 2012-05-11
I have 10 machines with a DBMS (PostgreSQL) installed. Agent programs populate those databases. I want to build some reports and display them on the web.

The agents insert/update 1000 records/sec in each database. How can I set up a centralized database to which all the agent databases send their data? Is there an automatic mechanism for this? I need to synchronize them every minute.

Does any other database support this?
Question by:Arnabtech
LVL 81

Expert Comment

ID: 35495769
MySQL is one. But monitoring must be set up to detect a conflict as early as possible.

Node A is a master to node B as well as being a slave to node D.
Similarly, node C is a master to node D while being a slave to node B.

The example for MySQL deals with a two-node master/master setup. This can be extended up to 10 nodes. Note the offset and increment settings, which make sure that an ID generated for a new record on one node can never overlap with an ID generated on any other node. One node will generate
1, 11, 21, etc.
while the next will have
2, 12, 22, etc.
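
The offset/increment scheme described above can be illustrated with a small sketch. In MySQL the relevant settings are `auto_increment_increment` and `auto_increment_offset`; the Python below only simulates the resulting ID sequences, and the helper name `node_ids` and the 10-node count are assumptions for illustration.

```python
from itertools import count, islice

NODES = 10  # assumed number of servers; plays the role of the increment


def node_ids(offset, increment):
    """Yield the ID sequence for one node: offset, offset + increment, ..."""
    return count(start=offset, step=increment)


# First three IDs each node would hand out, with node N using offset N.
first_ids = {
    node: list(islice(node_ids(node, NODES), 3))
    for node in range(1, NODES + 1)
}

print(first_ids[1])  # [1, 11, 21]
print(first_ids[2])  # [2, 12, 22]

# No ID appears on more than one node.
all_ids = [i for ids in first_ids.values() for i in ids]
assert len(all_ids) == len(set(all_ids))
```

As long as every node uses the same increment and a distinct offset no larger than that increment, the sequences never collide.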

See whether the same option that exists for setting up circular replication in MySQL is also available for PostgreSQL.

Author Comment

ID: 35607924
Are you talking about load balancing?

In my case that's not the problem. I have 10 database servers. I want to search for a record across all the database servers and also prepare reports based on the data from all of them.

Now, I can write a program that collects data from the servers and populates a central DB server, so that I can use the central DB server for my reporting. I want this job to be automatic so that I do not have to worry about synchronization or polling the databases.
LVL 81

Expert Comment

ID: 35610403
The setup will provide load balancing while replicating data.

Do all the servers have the same database/structure/schema?

A Perl/C/C++ program can be coded to run queries against each database server and either build the results in yet another database server or ....
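
A minimal sketch of such a collector program is shown here in Python rather than Perl/C/C++. To keep it self-contained, in-memory `sqlite3` databases stand in for the PostgreSQL servers; the table name `readings`, its columns, the agent names, and the `last_seen` bookkeeping are all assumptions made for illustration. It copies only newly inserted rows (by highest ID seen) and tags each row with its source server so that identical IDs from different agents do not collide in the central table.

```python
import sqlite3


def make_agent(rows):
    """Create an in-memory stand-in for one agent database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (id INTEGER PRIMARY KEY, sensor TEXT, value REAL)"
    )
    conn.executemany(
        "INSERT INTO readings (id, sensor, value) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


def make_central():
    """Central table: same columns plus the source server in the key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings ("
        " source TEXT, id INTEGER, sensor TEXT, value REAL,"
        " PRIMARY KEY (source, id))"
    )
    return conn


def collect(central, agents, last_seen):
    """Copy rows with an id above each agent's last seen id; return the count."""
    copied = 0
    for name, agent in agents.items():
        rows = agent.execute(
            "SELECT id, sensor, value FROM readings WHERE id > ? ORDER BY id",
            (last_seen.get(name, 0),),
        ).fetchall()
        if rows:
            central.executemany(
                "INSERT INTO readings (source, id, sensor, value)"
                " VALUES (?, ?, ?, ?)",
                [(name, *row) for row in rows],
            )
            last_seen[name] = rows[-1][0]  # remember the highest id copied
            copied += len(rows)
    central.commit()
    return copied


agents = {
    "agent1": make_agent([(1, "s1", 20.5), (2, "s2", 21.0)]),
    "agent2": make_agent([(1, "s3", 19.0)]),
}
central = make_central()
last_seen = {}

first_pass = collect(central, agents, last_seen)
agents["agent1"].execute(
    "INSERT INTO readings (id, sensor, value) VALUES (3, 's1', 20.7)"
)
second_pass = collect(central, agents, last_seen)
total = central.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
```

Against real servers, each connection would come from a PostgreSQL driver and `collect` would run on a schedule, for example once a minute. This sketch only picks up inserts; rows updated in place on an agent would need a modification timestamp or similar marker to be detected.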

What is the relationship among the database servers/databases?


Author Comment

ID: 35688125
All the agent database servers have the same schema.

Querying each database server and then populating the central server is the last thing I want to do. That's why I am looking for a better solution.

All the agent database servers receive sensor data, each from different sensors. Normally each database server gets data from 1000 sensors. As a result, a minimum of 1000 records get inserted in each database, 10 * 1000 in total.

I need to get all of them into a central repository so that I can keep the agent databases free for more data. I can also then query a single server.
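
For a rough sense of scale, the figures from this thread (10 servers, 1000 records/sec each, a one-minute synchronization interval) can be multiplied out. This is only back-of-the-envelope arithmetic and assumes the insert rate is steady.

```python
SERVERS = 10
RECORDS_PER_SECOND_PER_SERVER = 1000  # from the question; assumed steady
SYNC_INTERVAL_SECONDS = 60            # "synchronize them every minute"

total_per_second = SERVERS * RECORDS_PER_SECOND_PER_SERVER
rows_per_sync_per_server = RECORDS_PER_SECOND_PER_SERVER * SYNC_INTERVAL_SECONDS
rows_per_sync_total = total_per_second * SYNC_INTERVAL_SECONDS
rows_per_day_total = total_per_second * 86_400  # seconds in a day

print(total_per_second)          # 10000
print(rows_per_sync_per_server)  # 60000
print(rows_per_sync_total)       # 600000
print(rows_per_day_total)        # 864000000
```

So each one-minute batch would carry about 600,000 rows into the central server, which is worth keeping in mind when choosing between batch copying and continuous replication.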
LVL 81

Expert Comment

ID: 35689675
Circular replication, as referenced, with MySQL as the database.
Data gets added and then retrieved by the subordinate servers, so that all the database servers hold databases with the same information.
There is no need for an additional server that will be "aggregating/collecting" from each one individually.

I do not know whether you can configure one server (collector) to be a subordinate/slave of several others.
The problem you face is that they all have the same database name, so precautions have to be taken to make sure the increment and offset are set such that there is no possibility, not even a remote one, of an overlap in the ID column for any record in any table.

An old discussion (2005) at http://lists.mysql.com/replication/706 mentioned that multi-master might be available in 5.1.
If the setup is the default, that means you have entries in each database/table with overlapping primary keys.
LVL 22

Accepted Solution

earth man2 earned 750 total points
ID: 35726487
You tagged your question with postgresql, so I assume that's what you're interested in.
The latest version has streaming replication built in, based on log shipping.


Solutions like pgpool and Slony are outlined here.


Author Closing Comment

ID: 35758732
I need to study the Data Partitioning area.

Question has a verified solution.
