Distributed Database problem

Posted on 2011-04-29
Last Modified: 2012-05-11
I have 10 machines where a DBMS (PostgreSQL) is installed. There are agent programs which populate those databases. I want to build some reports and display them on the web.

The agents insert/update 1000 records/sec in each database. How can I use a centralized database to which all the agent databases send their data? Is there any automatic mechanism to do that? I need to synchronize them every minute.

Does any other database support this?
Question by:Arnabtech

    Expert Comment

    MySQL is one. But monitoring must be set up to detect a conflict as early as possible.

    Node A is a master to node B as well as being a slave to node D.
    Similarly, node C is a master to node D while being a slave to node B.

    The example for MySQL deals with a two-node master/master setup. This can be extended up to 10 nodes. Note the offset and increment settings, which ensure that an ID generated for a new record on one node can never overlap with an ID generated on another node. With an increment of 10, the first node will have
    1, 11, 21, etc.
    while the next node will have
    2, 12, 22, etc., and so on for the others.
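To make the offset/increment idea concrete, here is a minimal Python sketch that only simulates the ID sequences; it does not talk to any database. In MySQL the corresponding settings are the `auto_increment_increment` and `auto_increment_offset` server variables. The function names `id_sequence` and `ids_overlap` are made up for this illustration.

```python
def id_sequence(offset, increment, count):
    """Return the first `count` IDs a node with this offset would generate."""
    return [offset + i * increment for i in range(count)]


def ids_overlap(offsets, increment, count):
    """Return True if any two nodes would generate the same ID."""
    seen = set()
    for offset in offsets:
        for new_id in id_sequence(offset, increment, count):
            if new_id in seen:
                return True
            seen.add(new_id)
    return False


# Ten nodes, offsets 1..10, shared increment of 10: no collisions.
print(id_sequence(1, 10, 3))
print(id_sequence(2, 10, 3))
print(ids_overlap(range(1, 11), 10, 100))

# Two nodes left at the defaults (offset 1, increment 1) collide at once.
print(ids_overlap([1, 1], 1, 5))
```

The scheme stays collision-free as long as every node uses the same increment, the increment is at least the number of nodes, and each node has a distinct offset between 1 and the increment.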

    See whether the same option that exists to set up circular replication is also an option for PostgreSQL (...,_Clustering,_and_Connection_Pooling).

    Author Comment

    Are you talking about load balancing?

    In my case that's not the problem. I have 10 database servers. I want to search for a record across all database servers and also prepare some reports based on the data from all the servers.

    Now, I can write a program which collects data from the servers and populates a central DB server, so that I can use the central DB server for my reporting. I want this job to be automatic so that I do not have to worry about synchronization or polling the databases.

    Expert Comment

    The setup will provide load balancing while replicating data.

    Do all the servers have the same database/structure/schema?

    A Perl/C/C++ program can be coded to run queries against each database server and either build the results in yet another database server or ....
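As a rough illustration of that kind of collector, here is a Python sketch. To keep it self-contained it uses the standard-library `sqlite3` module with in-memory databases as a stand-in for the PostgreSQL servers. The `readings` table, its columns, and the `make_agent_db` and `collect` helpers are all assumptions made for the example, not anything taken from the actual setup.

```python
import sqlite3


def make_agent_db(rows):
    """Create an in-memory stand-in for one agent database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor_id INTEGER, ts INTEGER, value REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def make_central_db():
    """Create the central store; `source` records which agent a row came from."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (source TEXT, sensor_id INTEGER, ts INTEGER, value REAL)"
    )
    return conn


def collect(agents, central, since_ts):
    """Copy rows newer than `since_ts` from every agent into `central`.

    Returns the number of rows copied.
    """
    copied = 0
    for name, conn in agents.items():
        rows = conn.execute(
            "SELECT sensor_id, ts, value FROM readings WHERE ts > ?", (since_ts,)
        ).fetchall()
        central.executemany(
            "INSERT INTO readings (source, sensor_id, ts, value) VALUES (?, ?, ?, ?)",
            [(name, *row) for row in rows],
        )
        copied += len(rows)
    central.commit()
    return copied
```

A scheduler would call `collect` once per sync interval, passing the timestamp of the previous run as `since_ts`. Tagging each row with its source also sidesteps ID clashes between agents, at the cost of polling every server.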

    What is the relationship among the database servers/databases?

    Author Comment

    All the agent database servers have the same schema.

    Querying each database server and then populating the central server is the last thing I want to do. That's why I am looking for a better solution.

    All agent database servers receive sensor data, each from a different set of sensors. Normally each database server gets data from 1000 sensors. As a result, at least 1000 records are inserted into each database per second: 10 * 1000 = 10,000 in total.
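For a sense of scale, a back-of-the-envelope Python calculation using only the figures from this thread (10 servers, 1000 records/sec each, a one-minute sync interval). The helper name `rows_per_sync` is made up for the illustration.

```python
def rows_per_sync(servers, rows_per_sec_per_server, interval_sec):
    """Rows that accumulate across all servers during one sync interval."""
    return servers * rows_per_sec_per_server * interval_sec


# One agent database over one minute.
print(rows_per_sync(1, 1000, 60))

# All ten agent databases over one minute: what the central server must absorb.
print(rows_per_sync(10, 1000, 60))
```

So whatever mechanism is chosen, the central server has to take in roughly 600,000 rows per one-minute sync, assuming the stated rates hold steadily.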

    I need to get all of them into a central repository so that I can keep the agent databases free for more data. That also lets me run queries against a single server.

    Expert Comment

    Circular replication, as referenced above, with MySQL as the database.
    Data gets added and then retrieved by the subordinate servers, such that all the database servers have databases with the same information.
    There is no need for an additional server that does the "aggregating/collecting" from each one individually.

    I do not know whether you can configure one server (the collector) to be a subordinate/slave of several others.
    The problem you face is that they all have the same database name, and precautions have to be taken to make sure that the increment and offset are set such that there is not even a remote possibility of an overlap in the ID column for any record in any table. An old article (2005) suggested that multi-master might be available in 5.1. The discussion/exchange on this issue is at
    If the setup is the default, that means you have entries in each database/table with overlapping primary keys.

    Accepted Solution

    You tagged your question with postgresql so I assume that's what you're interested in.
    The latest version has streaming replication built in, based on log shipping.

    Solutions like pgpool and Slony are outlined here.

    Author Closing Comment

    I need to study the Data Partitioning area.
