Distributed Database problem

I have 10 machines, each with a DBMS (PostgreSQL) installed. Agent programs populate those databases. I want to build some reports and display them on the web.

The agents insert/update 1000 records/sec in each database. How can I set up a centralized database to which all the agent databases send their data? Is there any automatic mechanism to do that? I need to synchronize them every minute.

Do any other databases support this?
Arnabtech Asked:
 
earth man2 Commented:
You tagged your question with postgresql, so I assume that's what you're interested in.
The latest version has streaming replication built in, based on log shipping.

http://www.postgresql.org/docs/9.1/static/warm-standby.html#STREAMING-REPLICATION

Solutions like pgpool and Slony are outlined here.

http://www.postgresql.org/docs/9.1/static/different-replication-solutions.html
 
arnold Commented:
MySQL is one. But monitoring must be set up to detect a conflict as early as possible.



Node A is a master to node B as well as being a slave to node D.
Similarly, node C is a master to node D while being a slave to node B.
http://dev.mysql.com/tech-resources/articles/advanced-mysql-replication.html
http://blog.andrewrose.co.uk/2008/01/mysql-recovering-failed-circular.html


The MySQL example deals with a two-node master/master setup. This can be extended to up to 10 nodes. Note the offset and increment settings, which make sure that an ID generated when a new record is added on one node can never overlap with an ID generated when a record is added on the other nodes. One node will generate
1, 11, 21, etc.
while the others will generate
2, 12, 22, etc.
...
9, 19, 29, etc.
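The numbering scheme above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not MySQL code; I believe the settings the article refers to are `auto_increment_offset` and `auto_increment_increment`, and the function name `node_ids` and the node count of 10 are assumptions made for the example.

```python
def node_ids(offset, increment, count):
    """Return the first `count` IDs generated by a node with this offset."""
    return [offset + i * increment for i in range(count)]


NODES = 10  # assumption: increment equals the number of nodes

# Each node gets its own offset (1..NODES) and the shared increment.
ids_per_node = {
    offset: node_ids(offset, NODES, 3) for offset in range(1, NODES + 1)
}

for offset, ids in ids_per_node.items():
    print(offset, ids)

# No ID is ever produced by two different nodes.
all_ids = [i for ids in ids_per_node.values() for i in ids]
assert len(all_ids) == len(set(all_ids))
```

As long as every node uses the same increment and a distinct offset no larger than that increment, the ID ranges stay disjoint.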

See whether the option that exists for setting up circular replication is also available for PostgreSQL:
http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling
http://www.postgresql.org/docs/9.0/static/high-availability.html
 
Arnabtech (Author) Commented:
Are you talking about load balancing?

In my case that's not the problem. I have 10 database servers. I want to search for a record across all the database servers and also prepare some reports based on the data from all the servers.

Now, I could write a program that collects data from the servers and populates a central DB server, so that I can use the central DB server for my reporting. I want this job to be automatic so that I do not have to worry about synchronization or polling the databases.

 
arnold Commented:
The setup will provide load balancing while replicating data.

Do all the servers have the same database/structure/schema?

A Perl/C/C++ program can be coded to run queries against each database server and either build the results in yet another database server or ....

What is the relationship among the database servers/databases?
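To make the "query each server and build the results in another database" idea concrete, here is a minimal sketch in Python. It uses in-memory SQLite databases as stand-ins for the PostgreSQL servers so that it runs on its own; the table name `readings`, its columns, and the helper names are all hypothetical. It copies only newly inserted rows (by highest ID seen so far), so updates to existing rows would need a different approach.

```python
import sqlite3


def make_agent(rows):
    """Create an in-memory stand-in for one agent database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE readings (id INTEGER PRIMARY KEY, sensor TEXT, value REAL)"
    )
    db.executemany("INSERT INTO readings (sensor, value) VALUES (?, ?)", rows)
    db.commit()
    return db


def make_central():
    """Create the central database; rows are keyed by (source, id)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE readings ("
        " source TEXT, id INTEGER, sensor TEXT, value REAL,"
        " PRIMARY KEY (source, id))"
    )
    return db


def pull(central, source, agent):
    """Copy rows the central database has not yet seen from one agent."""
    last = central.execute(
        "SELECT COALESCE(MAX(id), 0) FROM readings WHERE source = ?", (source,)
    ).fetchone()[0]
    new_rows = agent.execute(
        "SELECT id, sensor, value FROM readings WHERE id > ? ORDER BY id", (last,)
    ).fetchall()
    central.executemany(
        "INSERT INTO readings (source, id, sensor, value) VALUES (?, ?, ?, ?)",
        [(source, *row) for row in new_rows],
    )
    central.commit()
    return len(new_rows)


agents = {
    "agent1": make_agent([("s1", 1.0), ("s2", 2.0)]),
    "agent2": make_agent([("s3", 3.0)]),
}
central = make_central()
copied = sum(pull(central, name, db) for name, db in agents.items())
print(copied)  # 3
```

Running `pull` for every agent once a minute (for example from a scheduler) would give the periodic synchronization asked about, at the cost of polling each server.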
 
Arnabtech (Author) Commented:
All the agent database servers have the same schema.

Querying each database server and then populating the central server is the last thing I want to do. That's why I am looking for a better solution.

All agent database servers receive sensor data, each from different sensors. Normally each database server receives data from 1000 sensors. As a result, at least 1000 records get inserted into each database, 10 * 1000 in total.

I need to get all of them into a central repository so that I can keep the agent databases free for more data. I could then also query a single server.
 
arnold Commented:
I mean circular replication, as described in the links above, with MySQL as the database.
Data gets added and is then retrieved by the subordinate servers, so that all the database servers end up with databases holding the same information.
There is no need for an additional server that does the "aggregating/collecting" from each server individually.

I do not know whether you can configure one server (the collector) to be a subordinate/slave of all the others.
The problem you face is that they all have the same database name, so precautions have to be taken: the increment and offset must be set so that there is no possibility, not even a remote one, of an overlap in the ID column for any record in any table.

An old discussion (2005) suggested that multi-master replication might be available in 5.1. The exchange on this issue is at http://lists.mysql.com/replication/706
If the setup is the default, that means you have entries in each database/table with overlapping primary keys.
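A small Python sketch of the overlap problem, under the assumption that each server hands out IDs 1, 2, 3, ... by default. The function name `overlapping_ids` and the sample data are made up for illustration; the check simply reports every ID that appears on more than one node.

```python
from collections import defaultdict


def overlapping_ids(ids_by_node):
    """Return {id: [nodes]} for every ID that exists on more than one node."""
    nodes_by_id = defaultdict(list)
    for node, ids in ids_by_node.items():
        for record_id in ids:
            nodes_by_id[record_id].append(node)
    return {rid: nodes for rid, nodes in nodes_by_id.items() if len(nodes) > 1}


# Default setup: every server starts counting at 1, so keys collide.
default_setup = {"A": [1, 2, 3], "B": [1, 2, 3]}
# With an offset and increment per server, the keys stay disjoint.
offset_setup = {"A": [1, 11, 21], "B": [2, 12, 22]}

print(overlapping_ids(default_setup))  # {1: ['A', 'B'], 2: ['A', 'B'], 3: ['A', 'B']}
print(overlapping_ids(offset_setup))   # {}
```

A check like this could also serve as a simple conflict monitor before merging data from several servers into one table.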
 
Arnabtech (Author) Commented:
I need to study the data partitioning area.
