Solved

The multi-master by mulit vendor database, how to do it ?

Posted on 2013-10-23
27
488 Views
Last Modified: 2014-02-10
Dear all,

right now we have an issue that we would like to write in multi master topology that, in each office, we have a master DB and each office may write data to the same table in their office and finally data integrity must be keep.

this is really mulit master but in differnet locatoin and not in a cluster environmet, how can we do it?

hardware and software, please suggest.

the same table in each database on each location must have data integrity, no data can be destory.

by:

1) Change table design by adding one more time stamp column so no matter what database vendor it is, we use the timestamp to make decision on which to keep ?

2) hardware applicatant ?

3) BIG data?
0
Comment
Question by:marrowyung
  • 8
  • 4
  • 3
  • +4
27 Comments
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 39596976
You asked this in 3 different database zones.  

How to do this will depend specifically on the database product you are using.

Which database is this for?
0
 
LVL 1

Author Comment

by:marrowyung
ID: 39599821
"You asked this in 3 different database zones.  "

Yes.

DB2, MySQL, Oracle and MSSQL.
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 39600172
In Oracle it is either Multi-Master Replication or Streams Replication.

The online docs talk about this:
http://docs.oracle.com/cd/E11882_01/server.112/e10707/raroverview.htm#i12897

http://docs.oracle.com/cd/E11882_01/server.112/e10705/prep_rep.htm#i1007573
0
 
LVL 1

Author Comment

by:marrowyung
ID: 39602302
one thing, we knew the replication can help, but the point is, what is one master is Oracle and one master is MS SQL and one master is MySQL, etc. ?
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 39605455
I'm not sure what products are out there to handle replication across heterogeneous systems.

I will send out a call for help to see if we can get some additional Experts involved.
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39605490
Why would you want to try replication across dissimilar systems?  What is the business reason for this?  I think it will introduce a bewildering amount of technical complexity and will significantly raise the cost of implementation, so I would want to have a very strong business case before I proposed doing something like this.

Is this a government project, by any chance?
0
 
LVL 34

Assisted Solution

by:gr8gonzo
gr8gonzo earned 166 total points
ID: 39605631
There really is no way to do multi-master with different database types:

1. They do not all have the same data types. For example, MS-SQL has datetime2, which you are not going to find in MySQL. And even when there are compatible data types, they often have different names or formats for inserting into the database.

2. They all have slightly different ways of querying, and different replication indexes / logs. They simply speak different languages.

It's like trying to write an English sentence in Russian and Japanese, but you want to keep the letters the same and still allow Russians and Japanese to read the sentence in their native language. It just doesn't work like that.

What's worse, multi-master means that you want the other masters to have the same capabilities of spreading their info to the other systems.

The only way to really accomplish that goal is to write a front-end program that gives each office the same form, but sends the data to all of the different systems. So there would be no true replication at the database level - only simulated replication via code.

All that said, this seems like a terrible idea. No offense - I'm sure it was not your initial setup or plan, but if I were in your shoes, I would start thinking about a plan to get everyone onto the same database. It might be more painful and complex at the beginning, but it will simplify things later on!

Also, in case each office has different applications that require different databases, there is no reason you cannot run two different types of databases simultaneously at one office. I've run Oracle and MySQL on the same server before, and I've also run MSSQL and MySQL on the same server before, so I know there are no technical conflicts there. Perhaps that would be an option for the sake of getting specific information into the same database type everywhere?
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39605661
@gr8gonzo: This might be the right time to test PDO's claims about the data-access abstraction layer!

:-)

~Ray
0
 
LVL 34

Expert Comment

by:gr8gonzo
ID: 39605691
Yeah, I was thinking about PDO when I wrote that. I just didn't want to encourage it, since having different database types at different locations seems like it could cause a lot of future trouble for the OP.
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39605810
Agreed.  I would rather have someone else test it, not me!
0
 
LVL 34

Assisted Solution

by:Gary Patterson
Gary Patterson earned 166 total points
ID: 39606928
Agree this is ugly, but in the real world, sometimes you just have to deal with legacy systems, and you end up with requirements like this.  I work with a number of large enterprise clients - many have grown through acquisition, and they face these issues pretty frequently.  

I've seen three basic approaches:

1) Consolidate to a single database platform, and utilize the native replication tools in the selected platform.

2) Write custom applications to synchronize data between the various databases.  I don't recommend this except for the simplest of replication chores.

3) Acquire a third-party data replication tool that supports all of your target platforms.  Lots of these out there.  I'd need to know a lot more about your requirements in order to make a specific recommendation.
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 39607604
Big Job...

One question for you. Is the central database just a repository of all other data, or, being a "master" is the origin of some data elements (such as masterfiles).

Reason being is this a one way replication, or multi-path replication and if any one of the satellite DB's needs to "know" what is happening in other satellite DB's.

Might need to adopt a classic ETL approach.

Will have a lot more to say once you have clarified the flow...

Cheers,
Mark
0
Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

 
LVL 1

Author Comment

by:marrowyung
ID: 39607728
slightwv,

Thanks, a big picture for everyone !!

Ray_Paseur,

"Why would you want to try replication across dissimilar systems?  What is the business reason for this? "

Yes, people all around the world want to write data to write to their OWN DB server even it is different technology.

Business user request that! to make them do not have data lag behind and editable in any time without any conflict and ensure the data integrity  !

Ray_Paseur and gr8gonzo,

you know each other ? sit close to each other everyday ?

gr8gonzo,

"What's worse, multi-master means that you want the other masters to have the same capabilities of spreading their info to the other systems."

Yes, this is what they want ! but by some sense, this is meaningful !

"The only way to really accomplish that goal is to write a front-end program that gives each office the same form, but sends the data to all of the different systems. So there would be no true replication at the database level - only simulated replication via code."

yeah, this is one of them, let them change code and handle in application level, some kind of .net that write down all DB server IP address ?

"All that said, this seems like a terrible idea. No offense - I'm sure it was not your initial setup or plan, but if I were in your shoes, I would start thinking about a plan to get everyone onto the same database. It might be more painful and complex at the beginning, but it will simplify things later on!
"

Yeah, all of the same DB, but they already using their own DB, they don't prefer change. or a lot of $$ is need /ony use MYSQL as it is cheap/free.

it was not my initial setup or plan, it just came out a week ago.


"Also, in case each office has different applications that require different databases, there is no reason you cannot run two different types of databases simultaneously at one office. I've run Oracle and MySQL on the same server before, and I've also run MSSQL and MySQL on the same server before, so I know there are no technical conflicts there. Perhaps that would be an option for the sake of getting specific information into the same database type everywhere? "

you mean the information still replicate to the same DB but tell all office to run all of them and probably use one application to read all of them in single console ?

what is the PDO and OP ?

Gary_The_IT_Pro,

"work with a number of large enterprise clients - many have grown through acquisition, and they face these issues pretty frequently.  
"

exactly what we are doing now !!

"1) Consolidate to a single database platform, and utilize the native replication tools in the selected platform."

yes! as long as they get $$

"2) Write custom applications to synchronize data between the various databases.  I don't recommend this except for the simplest of replication chores."

Why ? some one propose the MySQL cluster and let all MYSLQ stay inside, so data repliate to the rest of DB, I don't know how !

"3) Acquire a third-party data replication tool that supports all of your target platforms.  Lots of these out there.  I'd need to know a lot more about your requirements in order to make a specific recommendation. "

you are very good on this I think, redgate toolbelts? that one seems only for MS SQL.

Basically the requirement is as what I post, same data on the same table replicate in near real time to all different database technology.

they just want this !

Please show me those approach !

mark_wills,

long time no see and I miss you, whenever I see you I learn a lot, where have you been? hope everything is going well with you.

"Is the central database just a repository of all other data, or, being a "master" is the origin of some data elements (such as masterfiles)."

I not quite sure what is that mean, this will be a multi-path replication, to other other DB technology.

That's why I am talking about the same table name with once extra timestamp column, then some method to replication based on this timestamp column.

"if any one of the satellite DB's needs to "know" what is happening in other satellite DB's.
"

I think this is what the mulit-master can do about, if that Oracle table's data has change, then write to the rest of the same table in the rest of the DB, some kind of on the fire trigger !?? then we case it down to AFTER INSERT/UPDATE and BEFORE DELETE ?  we just need to update the other same table in other DB.

BTW, mark_wills, when I talk about trigger, I have opened other ticket that talking about how to do data level auditing on MySQL and Oracle, please help on contributing.

This time I need to build a SP to AUTOMATE the whole manual AFTER INSERT/UPDATE and BEFORE trigger in MySQL, manuall script is there already.
0
 
LVL 51

Accepted Solution

by:
Mark Wills earned 168 total points
ID: 39608567
@marrowyung, been good and a little bit preoccupied, thanks for asking :)

Well, there is NO tool I know of that is capable of full two-way automatic replication across multiple (ie DB2, MSSQL, ORACLE and MySQL) vendor databases. They all tend to use their own strategies to maximise performance and data integrity.

But there are 3rd party ETL type tools that can read / write and even compare data to any of those databases and some can orchestrate replication. But typically isn't instantaneous because of the need to match and cleanse data before synch. And that assumes that networking would support nearly "real time" otherwise you must go to an asynchronous model due to latency and holding all replication paths to ransom based on the slowest connection.

So, real-time synch with 3rd party tools is often an ambitious gambit unless absolutely everything is absolutely perfect and databases conform to common datatypes (or be smart enough to equate to lowest common denominators and precision levels).

The true extent of difficulty will depend on architecture and design and policy. Then the real difficulty comes into play with conflict resolution.

Some of the third party tools do offer real time synch (by way of example only : http://www.informatica.com/AU/solutions/enterprise-data-integration-and-management/data-synchronization/ and note that the "real time" is not the standard version or Quest's Toad : http://www.quest.com/toad-data-point/ but they offer the functionality of data synch and not necessarily real time)

The cost of those third party tools might be worth investigating additional license costs for one of your current DB suppliers.

Let's consider a few things before we get too far ahead of ourselves.

1) Network.
Is there sufficient speed and bandwidth between offices, or, assuming internet speeds are easier to increase, should a cloud based solution be considered.
Is part of the need for replication tied into HA or DR plans?
Is there any plan for failover support should one office go down ?
Can every other office be adequately "polled" from central office ?

2) Is the application suite in each office the same
In which case, is thin clients / central database a possibility.
If they are the same, is it a commercial pkg or developed in-house or a mixture
If a commercial pkg, are there any restrictions or support for changing structures or "back door" maintenance (ie replication)

3) Even though different manufacturers, do the databases share the same structures
Precision in columns such as numeric or datetime type structures or date and time as separate columns, or varchar(max) etc...
Use of identity as surrogate keys versus natural keys.
Referential integrity and check constraints.
Code rules (e.g. think of a masterfile named "Country" and what if 1 database has '61' as country code but the other has 'AUS')

4) Replication Frequency
Does it need to be synchronous (real time or near real time), asynchronous or even batch mode.
How often must the databases be guaranteed to be "in synch"
Are the databases in use 24x7 or is it more a 9-5 office environment

5) What is the role of the central "master"
So new customers (by way of example) can be entered once (and other masterfiles, creditors, chart of accounts etc...) and then distributed to all other databases.
To collect data from all the other databases as a central repository of data (like a data warehouse) for ease of reporting.
If so, what are the periodic reports and the granularity of that reporting frequency (on demand, daily, weekly, monthly etc).
Is transactional data (e.g. orders, sales etc) only created an the regional office level and then uploaded to central master - or - does it have to be propagated throughout all DB's.

6) Do offices have regional controls and considerations
Such as language, time zone differences, regional nuances or autonomy due to their own niche markets.

So... Lots of questions.

Any one of the above half a dozen "topics to consider" could have pages and pages of background goals and possibilities attached. And there are more topics that could be considered as you start to tune your requirements.

The "easy" answer is to throw money at the problem and get everyone on the same DB and use that DB's proprietary replication (and that replication in itself will not be without challenges). But that does assume that the application is the same across all offices with essentially the same DB structure. And if it is the same then the relative cost of extra user licenses accessing a central database is worth considering.

If replication can be "end of day" (ie overnight), then you may need to consider HA and DR during the day, but, then the replication part can be batched. That is when the datetime approach can work.

I think I have shared with you before a rather nasty environment that we had where some of the DB's where untethered (mobile or not on the same network). We had two date time columns. Date Last Modified and Date Sent. Any Add or Update cleared Date Sent which was the primary basis for extract. When successfully sent, it was updated. Date Last Modified was primarily for conflict resolution, and only applied to updates. We did have a third flag which was Delete Requested and used in place of physical deletes. The downside of that is your app needs to understand that delete requested items needs to be bypassed when displaying  data.

That can work, but is a big job and really only identifies data that needs to be distributed. There is still a potentially big job of transformation (such as the country code example above). And really, only works properly when you control the application space.

You can use triggers to populate an "audit" style database which records all the add, change, deleted items (much the same as SQL Server CDC strategies) and then just use that audit db as a basis for replication, but that is normally one-way replication (ie changes in remote office getting transferred to HO).

So, where to from here ?

Do an audit of what systems are being used in which location.
Network and hardware support in each location and back to Head Office.
Head counts in each office and the licenses they own.
Transaction volumes in each office.
Map "data ownership" showing which office is responsible for different data elements.
Frequency for replication.
Any Disaster Recovery or High Availability plans.
Of the existing databases, table by table reconciliation (highlight different elements)

Then start eliminating some techniques - such as synchronous replication if networked machines cannot talk to each other in real time.

Just to give you an idea of the possibilities have a look at some of the ETL tools : http://www.etltool.com/list-of-etl-tools/ and that is just an example, not a complete list.

Then consider some tools that can be used to help to identify differences : http://www.altova.com/diffdog.html

and multi platform SQL tools : http://www.razorsql.com/features.html

and some others that support real time data synch : http://www.jaspersoft.com/data-integration and http://integration.pervasive.com/products/data-integration-software.aspx and http://www.informatica.com/au/products/enterprise-data-integration/powercenter/powercenter-real-time-edition/ the last one (as with some of the others) have very interesting whitepapers.

and most importantly, talk to your current DB vendors and application suppliers and see what they would support or recommend (and if the app suppliers allow changes to their DB's). After all, you will need their input to help formulate your approach and proposal to management.

There is no "easy" answer, there are possibilities though, and it has been done before. The trick is finding out what suits your circumstances best and for that we need to know a lot more about your environment and requirements.
0
 
LVL 34

Expert Comment

by:Gary Patterson
ID: 39608795
2) Write custom applications to synchronize data between the various databases.  I don't recommend this except for the simplest of replication chores.

Why ? some one propose the MySQL cluster and let all MYSLQ stay inside, so data repliate to the rest of DB, I don't know how !

Why?  Just a few of the many, many reasons:

1) It is technically difficult.

It is hard to build a good, robust cross-platform data replication tool.  I've ripped more homegrown tools out than I can recall (after the consultant of staffer that build them left or was fired...)  Real-time, or near-real-time replication is tricky.  Much more so in a cross-platform environment.  Companies that build these tools have years (or decades) of experience dealing with locking, security, data compression, system performance, data type conversion, conflict resolution, and more - all built into the tools.

Thing about it from your organization's perspective:  who would you prefer to have build a mission critical data replication tool - a contractor or employee who has never built one before, or a company that specializes in data replication, and has dealt with all of the attendant complexities for many years.

2) Requirements change.

Applications tend to grow more complex over time.  Replication needs tend to get more complicated, not less, and to incorporate more and more systems as time goes on.  Build your own, and you run the risk of creating a tool that requires constant maintenance and support.

3) DBMSs change.

Vendors that make replication tools have to stay on top of the latest DBMS versions and changes, and are in competition to support more platforms, the latest DBMS versions, and to improve performance to remain competitive.

4) Support

In my experience, even the best data replication tools occasionally encounter problems.   Whose shoulders do you want fixing those problems to be on - your vendor, or you?

- Gary Patterson
0
 
LVL 34

Expert Comment

by:Gary Patterson
ID: 39608853
Mark_willis covered the complexities of the decision-making process nicely.

Over the years, I've evaluated and implemented a data replication strategies (including homegrown solutions), in both homogenous and heterogeneous environments.  Various projects required different strategies, depending on platforms, skill sets, existing tools, and the reasons for replication (high availability, data warehousing/reporting server implementation, synchronization of data between independent business units, performance and load balancing, or some combination of the above.)

I am primarily a DB2 on IBM i specialist, so I tend to be more aware of products that support that platform.  Personally, I like Vision Solutions products.  I've used other Vision replication products (Mimix, OMS/ODS, iTera) in very large environments with great success (I have no affiliation with Vision whatsoever).

http://www.visionsolutions.com/products/Systems-Data-Management-dtshare.aspx

That said, there are a lot of products in this space, and cost varies widely.  

If you decide to purchase a replication product, I encourage you to get quotes from multiple vendors, and negotiate, negotiate, negotiate - or hire a consultant that is familiar with evaluating and purchasing these products to do it for you.  

The cost savings often can pay for the expert help.

- Gary Patterson
0
 
LVL 1

Author Comment

by:marrowyung
ID: 39610928
Gary_The_IT_Pro

http://www.visionsolutions.com/products/Systems-Data-Management-dtshare.aspx

This one only export from DB2 to MS SQL ?
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 39611088
Well, it looks pretty good, and the whitepaper shows Teradata, Oracle, Sybase, as well as SQL Server. Cannot seem to find a complete list of supported DB's though...
0
 
LVL 34

Expert Comment

by:Gary Patterson
ID: 39611680
No idea if this particular product is a potential fit for you or not.  

I'm just pointing this type of product out as a possible solution for your technical challenge.  

There are a lot of these types of products out there - I'd need to know a lot more about your specific environment, applications, budget, and requirements before I'd even think about making specific product recommendations.

I'm sure a quick email or phone call to Vision (or other vendors) will get you all the specific product answers you could want.
0
 
LVL 34

Expert Comment

by:gr8gonzo
ID: 39612898
Oops. Started replying this morning and got sidetracked.

Something to STRONGLY consider is that all the experts here have had the same recommendation for you:

Get all offices onto the same database and use native replication.

Yeah, all of the same DB, but they already using their own DB, they don't prefer change. or a lot of $$ is need /ony use MYSQL as it is cheap/free.

1. You are not going to be able to find a reliable AND cheap solution for this. No matter which way you go, you are going to be spending a lot of money, whether you are spending money on salary/labor of employees to fix this problem or spending money on buying a product.

As others have mentioned, there are some established applications that do heterogeneous database replication, like Oracle's GoldenGate:
http://www.oracle.com/technetwork/middleware/goldengate/overview/index.html

...but you are going to spend LOTS of money on any application that can do it correctly and has been developed by experienced people so that it is relatively secure and reliable.

2. There are a LOT of business advantages to switching to a single type of database:
- It will have lower support costs (you don't need to find someone who knows all of the databases or multiple people just for DB support - you can focus on specific DB expertise)
- You'll have more reliable replication.
- You won't have to worry about lossy data transformations due to incompatible data types.
- You'll have lower barriers for introducing any new technology that will be common to all offices.
- Maintenance and management will be overall easier (smaller number of tools with more potential for multi-database administration).

I understand it's not always an option, but if the only limitation is cost and "preference" of the office, then that should not overshadow the long-run advantages. Sometimes you can't help it when you're dealing with legacy apps, but if you're dealing with the same data in multiple database types, then it sounds like it's not a specific legacy application that is holding you back.

Remember, there's no reason you can't just use a single database (e.g. all MySQL) for the specific type of data you're trying to sync up and leave the existing database platforms in place to keep supporting whatever they're supporting today.

3. If cost is a limiting factor for you and you cannot use a single type of database, then your best bet, in my opinion, is to have an experienced developer create secure web service front-ends for each of the offices, and then pass along the data to the front-ends and allow the front-ends to pass the data back to the specific database type. That way, the transport of the data is all common / abstract, and each front-end can be responsible for putting it into its own database the correct way.

Oh, and PDO is a database abstraction extension in PHP.
0
 
LVL 1

Author Comment

by:marrowyung
ID: 39621621
mark_wills,

“And that assumes that networking would support nearly "real time" otherwise you must go to an asynchronous model due to latency and holding all replication paths to ransom based on the slowest connection”

Basically all network in-house should be real time, right?  I don’t see any CISCO network is not real time.

“) Network.
Is there sufficient speed and bandwidth between offices, or, assuming internet speeds are easier to increase, should a cloud based solution be considered”

Network should be no problem as we are the best here.

“2) Is the application suite in each office the same”

This means we have to change application, they are not the same.

“To collect data from all the other databases as a central repository of data (like a data warehouse) for ease of reporting”

In this case, all DB will be seen as the central repository of data.

“Is transactional data (e.g. orders, sales etc) only created an the regional office level and then uploaded to central master - or - does it have to be propagated throughout all DB's."

It has to be propagate throughout all DBs.

“If replication can be "end of day" (ie overnight), then you may need to consider HA and DR during the day, but, then the replication part can be batched. That is when the datetime approach can work."

In DR and HA, the target DB should be not accessible, not applicable in this case, right?

“The downside of that is your app needs to understand that delete requested items needs to be bypassed when displaying  data”

Very hard, any logic code can be reference?

“And really, only works properly when you control the application space”

Change application any way, right? Seems mandatory!

“talk to your current DB vendors and application suppliers and see what they would support or recommend (and if the app suppliers allow changes to their DB's).”

Yeah, one of the option.
0
 
LVL 1

Author Comment

by:marrowyung
ID: 39637850
Mark_wills and other expertist, I have some MySQL quetsion need your help, please help:

  http://www.experts-exchange.com/Database/MySQL/Q_28277011.html
    http://www.experts-exchange.com/Database/MySQL/Q_28279742.html
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 39846879
I've requested that this question be closed as follows:

Accepted answer: 168 points for mark_wills's comment #a39608567
Assisted answer: 166 points for Gary_The_IT_Pro's comment #a39608853
Assisted answer: 166 points for gr8gonzo's comment #a39612898

for the following reason:

This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.
0
 
LVL 1

Author Comment

by:marrowyung
ID: 39846876
let me award them first.
0
 
LVL 1

Author Comment

by:marrowyung
ID: 39846881
sorry, this task is accessing by a third party vendor. Thanks for you all's input.
0

Featured Post

What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

Join & Write a Comment

Suggested Solutions

Title # Comments Views Activity
JDeveloper 12c for 32 bit 4 35
How to find duplicates in SQL Server 3 22
SQL Server memory Issue 7 76
select over clause 1 0
Does the idea of dealing with bits scare or confuse you? Does it seem like a waste of time in an age where we all have terabytes of storage? If so, you're missing out on one of the core tools in every professional programmer's toolbox. Learn how to …
For both online and offline retail, the cross-channel business is the most recent pattern in the B2C trade space.
Via a live example, show how to set up a backup for SQL Server using a Maintenance Plan and how to schedule the job into SQL Server Agent.
Via a live example, show how to setup several different housekeeping processes for a SQL Server.

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

26 Experts available now in Live!

Get 1:1 Help Now