Solved

Large Volume Replication

Posted on 2012-04-05
1,885 Views
Last Modified: 2014-10-06
Due to Symantec dropping the CPS feature from their Backup Exec product, I'm forced to look into another real-time replication solution. This is file/folder data only; we are not concerned with application or database integration.

We are looking at possibly virtualizing the file servers and having the disks as physical RDMs, and/or potentially breaking the volumes into smaller chunks. Our average daily (8-hour workday) data change (whole file, not block/bit) is ~150GB (~50GB on the critical 4TB volume and ~100GB on the important 4TB volume). Our storage is either 8 or 4Gbps Fibre Channel, and to the offsite location we have a 1Gbps connection (which I can upgrade if needed).
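As a quick back-of-envelope check (my own rough numbers; the 50% effective link utilization is just a guess for protocol and replication overhead), the daily change rate looks comfortably within what the 1Gbps offsite link can carry:

```python
# Back-of-envelope check (my own assumptions, not vendor numbers): can a
# 1 Gbps offsite link keep up with ~150 GB of whole-file changes per
# 8-hour workday?

CHANGE_GB_PER_DAY = 150        # ~50 GB critical + ~100 GB important volume
LINK_GBPS = 1.0                # offsite link speed
EFFECTIVE_UTILIZATION = 0.5    # assumed overhead factor, not a measured number

effective_gbps = LINK_GBPS * EFFECTIVE_UTILIZATION
change_gigabits = CHANGE_GB_PER_DAY * 8            # GB -> gigabits
hours_of_link_time = change_gigabits / effective_gbps / 3600

print(f"~{hours_of_link_time:.1f} hours of link time per day")   # ~0.7 hours
# Steady-state deltas look easy; the initial seed or any full resync is the
# part that actually stresses the link.
```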

What I really want is a one-way sync to both a local and an offsite volume, syncing changes in real time along with NTFS ACLs. That way, if the server goes down I can redirect the DFS link to the second server that hosts the onsite replica. If the onsite storage goes down (i.e. both the primary and the onsite replica are unavailable), I can redirect DFS to the offsite server hosting its replica, so there is very little downtime.

I'm looking into DFSR on 2K8R2 and DPM, as they would be the cheapest solution, but right off the bat DPM looks to 'hide' its replica and isn't really intended for resharing, although it's great for snapshots and end-user recovery via the VSS Previous Versions option. DFSR looks to be a solution, but the 8TB total and 8 million file limits concern me, particularly with future expansion. Of our 8TB, roughly 6TB is in use. We also have a few smaller 500GB-1TB volumes which are about 80% full. Both seem to require pretty heavy storage at the replicated site, and there seems to be a pretty heavy CPU/memory requirement, I'm assuming for the indexing and delta-change calculations.

Does anyone have experience with a similar setup and DFSR/DPM, or with a 3rd-party product like Double-Take, FileReplication Pro, PeerSync, etc.? Any major pitfalls with my scenario? I'm in the process of setting up a test network, but working with 2-3TB data chunks is long and laborious, so I'm looking for pitfalls, 'don't go there' warnings, or 'hey, this worked for us' stories, so I spend as little time spinning my wheels or re-inventing them as possible.

Thanks,
Question by:jiriki
9 Comments
 
LVL 42

Accepted Solution

by:kevinhsieh (earned 500 total points)
ID: 37817103
My experience running Double-Take on a 32-bit VM under Microsoft Virtual Server wasn't too good for large volumes. The fundamental problem is that if the volumes get out of sync for whatever reason, Double-Take needs to read everything and calculate checksums to compare them with the replication partner. My guess is that it can easily take over a day to read in the 6 TB of data and do the checks. I have been using DFS-R, but I have had some problems with it where the replication somehow gets lost. I am in the process of deleting a replication group with hundreds of GB of data and several shares and starting over; I will break it up into smaller groups.
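To illustrate why that resync pass is so painful, here is a rough sketch of the idea, not Double-Take's actual implementation; the block size, read speed, and helper names are assumptions of mine:

```python
import hashlib

# Rough sketch of a block-checksum resync (NOT Double-Take's actual code):
# both sides hash every block and compare, so the cost is dominated by
# reading the whole volume even when almost nothing has changed.

BLOCK_SIZE = 64 * 1024  # assumed block size

def block_checksums(path):
    """Yield (offset, digest) for every block of a file."""
    with open(path, "rb") as f:
        offset = 0
        while block := f.read(BLOCK_SIZE):
            yield offset, hashlib.md5(block).hexdigest()
            offset += len(block)

def blocks_to_resend(path, partner_sums):
    """Offsets whose local checksum differs from the replication partner's."""
    return [off for off, digest in block_checksums(path)
            if partner_sums.get(off) != digest]

# Why 6 TB takes so long: at an assumed 200 MB/s sustained read rate,
# a single verification pass is roughly a working day per side, before
# any checksum exchange over the wire.
hours = 6 * 1024 * 1024 / 200 / 3600
print(f"~{hours:.0f} hours just to read 6 TB")   # ~9 hours
```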

My suggestion is to break up your single file server into multiple VMs. For local redundancy you can use Windows failover clustering, and then use DFS-R to replicate to the remote servers. By reducing the amount of data in each replication set, you make it easier to get directories synced and backed up. My file server has 2.3 TB on it, and I should probably look at taking some of that and sticking it on a second server.

Using a DFS namespace to point to another server works well. I use that when branch servers fail, and for scheduled maintenance on my main file server. I don't back up branch servers; I just use DFS-R to replicate files back to my main file server.
 

Author Comment

by:jiriki
ID: 37840335
I just read a few posts clarifying that MS doesn't recommend more than a total of 1TB of replicated data per server. The hard limit of 8TB is based on the Jet database limits, but they have only tested and support up to 1TB.

So do you successfully replicate all 2.3TB of your current data, and is it on a single volume or spread across multiple? If the latter, what's the largest volume?

We also had one of our central IT personnel mention that MS is going to deprecate DFS-R, which is not encouraging if true; I'm asking for a source on that :(
 
LVL 42

Expert Comment

by:kevinhsieh
ID: 37840364
None of my volumes are 1 TB; my largest is 905 GB. To say that I get it all replicated is a little bit of a stretch. I have a replication group that isn't healthy, and I am going to break it up into smaller groups, but I haven't done that yet. I do have multiple replication groups on a volume, and some groups include multiple volumes (at least for now, probably not after I break it up).

I have heard nothing about deprecating DFS-R, and I doubt that it is true, since Microsoft is trying to get people to move from FRS to DFS-R for SYSVOL replication. To ditch DFS-R and FRS would require all DCs to be Windows Server 8+. I have not heard about any replacement for DFS-R, though there could be something in Windows Server 8, since it has capabilities for replicating large files like VMs.
 

Author Comment

by:jiriki
ID: 37840719
Thanks for the details, very helpful. That gives me info to go on when I'm finally able to get some large-scale testing done once my storage arrives.

I also heard back from the ViceVersa rep, and they are concerned that the amount of data and daily changes I have are too much for their product, which backs up what you initially mentioned as your experience.

Regarding DFS-R deprecation, I've confirmed that the central IT person didn't know what they were talking about and had confused FRS with DFS-R as one and the same; that doesn't breed much confidence in anything else stated there, since it was a meeting where we had specifically requested their expertise on the subject.
 
LVL 42

Expert Comment

by:kevinhsieh
ID: 37841682
With a 1 Gbps WAN link, you should have plenty of bandwidth for replication, especially considering that that's the link speed on my servers and SAN... heck, I can move over 100 GB overnight on an 8 Mb connection with Riverbed.

Can your storage do replication?

If your storage is fast enough, Double-Take may work for you. As I said, rate of change shouldn't be a problem at all. The challenge is checking all of the data after a reboot of the primary server.
 

Author Comment

by:jiriki
ID: 37844918
Yeah, the storage can, but the price tag... Currently seeing if it's worth it as a sunk cost, since it's tied to that specific hardware (Dell MD3600f), versus investing in a hardware-agnostic software solution, which is my long-term cost-savings thinking; but our volume sizes may prevent it.
 
LVL 42

Expert Comment

by:kevinhsieh
ID: 37844937
Can you break up your data sets across multiple VMs? That might increase your replication licensing cost a little, but it makes replication a lot easier, because instead of trying to read and compare 800 GB before things are in sync, you might only need to read in 200 GB if you rebooted just a single VM, which is a lot more manageable. When I bought Double-Take 5 years ago, the price for 5 virtual licenses was about the price of a single physical server running Windows Enterprise. You can tie all of your main and DR servers together via DFS namespaces.
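To put rough numbers on it (the 150 MB/s sustained read rate below is an arbitrary assumption of mine, just to show the shape of the difference):

```python
# Illustration only: how long a read-and-compare pass takes as a function of
# how much data sits behind the server that was rebooted.
READ_MB_PER_SEC = 150   # assumed sustained read rate

for label, gb in [("one 800 GB server", 800), ("one 200 GB VM", 200)]:
    hours = gb * 1024 / READ_MB_PER_SEC / 3600
    print(f"{label}: ~{hours:.1f} h to read and compare after a reboot")
# one 800 GB server: ~1.5 h
# one 200 GB VM:     ~0.4 h
```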
 

Author Comment

by:jiriki
ID: 37854264
FYI, after finding a few posts referencing 10TB, adding that to my Google search started to pull up a lot more info. It seems that almost all the previous limitations are still in effect, but if both the source and target DFS-R servers are 2008 R2, then the total 'supported' replicated data per server is 10TB, and that is a support limit, not a technical one.

The book Data Protection for Virtual Data Centers by Jason Buffington seems to have some good info on DFS-R, DPM, and other tech, with some added real-world tips, raising concerns such as the replication cache and connection limits rather than just volume size.

I also found a post on the Taiwan TechNet by the DFSR product team, stating that the 10TB limit was based on MS storage team testing on a randomly generated volume (i.e. no compression): 4.4 million files in 798 directories totaling 9.97 TB, replicated over a local LAN, although no topology or disk info was given. It took ~8.5 days to replicate, but no info was given on the rate of data change or whether this was a static volume.
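Doing the arithmetic on those figures (my own math, not from the post), the effective rate works out to be fairly modest:

```python
# Effective throughput implied by the TechNet test: 9.97 TB in ~8.5 days.
tb, days = 9.97, 8.5
mb_per_sec = tb * 1024 * 1024 / (days * 86400)
print(f"~{mb_per_sec:.0f} MB/s effective initial-sync rate")   # ~14 MB/s
# i.e. the initial sync is the slow part even on a LAN; the steady-state
# delta replication afterwards is a much smaller workload.
```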

A few other posts mention 12TB in use and one at 26TB, but again no specific details or references.

I'm leaving off with the primary software candidates being DFSR, the SureSync premium package, PeerSync with options, and ViceVersa with the VVEngine add-on. Since we are using the Dell PowerVault MD3600, I'm also looking into the add-on features it offers, but I would rather keep it hardware-agnostic if possible. Other software I've discounted due to a lack of confidence after communicating with their reps and tech engineers: either a sales pitch with no backing, or apprehension on their part when data-size specifics were delved into.
 

Author Comment

by:jiriki
ID: 40364393
Happened across my own post here and thought I would update with where we actually went. We ended up going with CA's replication software. We are syncing ~15TB spread across 4 volumes, 2 of the volumes twice... All 4 are being replicated in real time, split across 2 VMs, each mounting its respective volumes as raw. We then replicate one server's volumes on a scheduled basis overnight.

The software is functioning adequately and meeting our needs.

We backed off DPM as the information on setup and performance was a bit sketchy (at the time), and I had concerns about server upgrades and the cross-breeding of versions over time. Having a separate product (CA) handle this makes me feel better about future growth and management. The downside to CA RA is that it has NO versioning capability like DPM or CPS had; it is block-level, but does not keep 'snapshots' of those blocks with any method for the user or the admin to peruse them.
