Large Volume Replication

Because Symantec dropped the CPS feature from their Backup Exec product, I'm forced to look into another real-time replication solution.  This is file/folder data only; we are not concerned with application or database integration.

We are looking at possibly virtualizing the file servers and having the disks as physical RDMs, and/or potentially breaking the volumes into smaller chunks.  Our average daily (8-hour workday) data change (whole file, not block/bit) is ~150GB (~50GB on the critical 4TB volume and ~100GB on the important 4TB volume).  Our storage is on either 8 or 4 Gbps Fibre Channel, and to the offsite location we have a 1 Gbps connection (which I can upgrade if needed).
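As a sanity check on those numbers, here is a quick back-of-envelope calculation (illustrative only, using the figures above) showing that the average change rate is a small fraction of a 1 Gbps link:

```python
# Back-of-envelope check using the figures from the question (not a sizing tool):
# ~150 GB of whole-file changes over an 8-hour workday vs. a 1 Gbps offsite link.

def avg_change_rate_mbps(gb_per_day: float, hours: float) -> float:
    """Average change rate in megabits per second."""
    bits = gb_per_day * 1024**3 * 8        # GB -> bits
    return bits / (hours * 3600) / 1e6     # spread over the workday, in Mbit/s

rate = avg_change_rate_mbps(150, 8)
print(f"Average change rate: {rate:.0f} Mbit/s")    # roughly 45 Mbit/s
link_mbps = 1000                                    # 1 Gbps WAN link
print(f"Link utilization: {rate / link_mbps:.1%}")  # well under 5%
```

Peak bursts will of course be higher than the 8-hour average, but on raw bandwidth alone the link has plenty of headroom.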

I really want a one-way sync to both a local and an offsite volume, syncing changes in real time along with NTFS ACLs.  That way, if a server goes down, I can redirect the DFS link to the second server that hosts the onsite replica.  If the onsite storage goes down (i.e., both the primary and the onsite replica are unavailable), I can redirect DFS to the offsite server hosting its replica... this way there is very little downtime.

I'm looking into DFSR on 2008 R2 and DPM, as they would be the cheapest solution, but right off the bat, DPM looks to 'hide' its replica and isn't really intended for re-sharing, although it's great for snapshots and end-user recovery via the VSS Previous Versions option.  DFSR looks to be a solution, but the 8TB total and 8 million file limits concern me, particularly with future expansion.  Of our 8TB, roughly 6TB is in use.  We also have a few smaller 500GB-1TB volumes which are about 80% full.  Both seem to require pretty heavy storage at the replicated site, and there seems to be a pretty heavy CPU/memory requirement, I'm assuming for the indexing and delta-change calculations.

Does anyone have experience with a similar setup and DFSR/DPM, or with a 3rd-party product like Double-Take, FileReplication Pro, PeerSync, etc.?  What are the major pitfalls with my scenario?  I'm in the process of setting up a test network, but working with 2-3TB data chunks is long and laborious, so I'm looking for pitfalls, 'don't go there' warnings, or 'hey, this worked for us' tips so I spend as little time spinning my wheels or re-inventing them as possible.


My experience running Double-Take on a 32-bit VM under Microsoft Virtual Server wasn't too good for large volumes. The fundamental problem is that if the volumes get out of sync for whatever reason, Double-Take needs to read everything and calculate checksums to compare with the replication partner. My guess is that it can easily take over a day to read in 6 TB of data and do the checks. I have been using DFS-R, but I have had some problems with it where the replication somehow gets lost. I am in the process of deleting a replication group with hundreds of GB of data and several shares, and starting over; I will break it up into smaller groups.
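To put a rough number on that re-read cost, here is a hedged estimate (the sustained throughput figure is an assumption; substitute your array's real scan rate):

```python
# Rough estimate of how long a full re-sync verification takes: reading N TB of
# data at some sustained disk/array throughput. The 100 MB/s figure below is an
# assumption for illustration, not a measured number.

def full_scan_hours(tb: float, mb_per_sec: float) -> float:
    """Hours to read `tb` terabytes at `mb_per_sec` sustained throughput."""
    return (tb * 1024**2) / mb_per_sec / 3600   # TB -> MB, seconds -> hours

# ~17.5 hours at 100 MB/s; at 50 MB/s it's well over a day.
print(f"{full_scan_hours(6, 100):.1f} hours to scan 6 TB at 100 MB/s")
```

This is read time alone; checksum exchange and any concurrent production load will only stretch it further.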

My suggestion is to break up your single file server into multiple VMs. For local redundancy, you can use Windows failover clustering, and then use DFS-R to replicate to the remote servers. By reducing the amount of data you have in each replication set, you make it easier to get directories synced and backed up. My file server has 2.3 TB on it, and I should probably look at taking some of that and sticking it on a second server.
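If you do go the DFS-R route, one sizing detail worth checking up front is the staging quota: Microsoft's guidance for Server 2008/2008 R2, as I understand it, is that each replicated folder's staging quota should be at least the combined size of its 32 largest files. A quick sketch for estimating that (the path in the example is hypothetical):

```python
# Sketch: estimate a DFS-R staging quota for a replicated folder, following the
# "32 largest files" guidance for Server 2008/2008 R2 (assumed; verify against
# Microsoft's current documentation before relying on it).
import heapq
import os

def staging_quota_bytes(root: str, n: int = 32) -> int:
    """Sum of the n largest file sizes under root."""
    sizes = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            try:
                sizes.append(os.path.getsize(os.path.join(dirpath, name)))
            except OSError:
                pass  # skip files that vanish or are unreadable mid-scan
    return sum(heapq.nlargest(n, sizes))

# Hypothetical usage:
# print(staging_quota_bytes(r"D:\Shares\Critical") / 1024**2, "MB minimum")
```

An undersized staging area is a common cause of DFS-R backlogs on volumes with large files, so it's worth running something like this before picking quota numbers.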

Using a DFS namespace to point to another server works well. I use that when branch servers fail, and for scheduled maintenance on my main file server. I don't back up branch servers... I just use DFS-R to replicate files back to my main file server.

jirikiAuthor Commented:
I just read a few posts clarifying that MS doesn't recommend more than a total of 1TB of replicated data per server.  The hard limit of 8TB is based on the Jet database limits, but they have only tested and support up to 1TB.

So do you successfully replicate all 2.3TB of your current data, and is it in a single volume or spread across multiple? If the latter, what's the largest volume?

We also had one of our central IT personnel mention that MS is going to deprecate DFS-R, which is not encouraging if true; I'm asking for sourcing on that :(
None of my volumes are 1 TB; my largest is 905 GB. To say that I get it all replicated is a little bit of a stretch. I have a replication group that isn't healthy, and I am going to break it up into smaller groups, but I haven't done that yet. I do have multiple replication groups on a volume, and some groups include multiple volumes (at least for now, probably not after I break it up).

I have heard nothing about deprecating DFS-R, and I doubt that it is true, since Microsoft is trying to get people to move from FRS to DFS-R for SYSVOL replication. To ditch both DFS-R and FRS would require all DCs to be Windows Server 8+. I have not heard about any replacement for DFS-R, though there could be something in Windows Server 8, since it has capabilities for replicating large files like VMs.

jirikiAuthor Commented:
Thanks for the details, very helpful.  This gives me info to go on when I'm finally able to get some large-scale testing done once my storage comes in.

I also heard back from the ViceVersa rep, and they are concerned that the amount of data and daily changes I have are too much for their product; backing up what you initially mentioned as your experience.

Regarding the DFS-R deprecation, I've confirmed that the central IT person didn't know what they were talking about and had confused FRS and DFS-R as one and the same; which doesn't breed much confidence in everything else stated there, since it was a meeting we had specifically requested their expertise on for this subject.
With a 1 Gbps WAN link, you should have plenty of bandwidth for replication, especially considering that it's the link speed on my servers and SAN... heck, I can move over 100 GB overnight on an 8 Mbps connection with Riverbed.

Can your storage do replication?

If your storage is fast enough, Double-Take may work for you. As I said, rate of change shouldn't be a problem at all. The challenge is checking all of the data after a reboot of the primary server.
jirikiAuthor Commented:
Yeah, the storage can, but the price tag... Currently seeing if it's worth it as a sunk cost, since it's tied to that specific hardware (Dell MD3600f), vs. investing in a hardware-agnostic software solution, which is my long-term cost-savings thinking; but our volume sizes may prevent it.
Can you break up your data sets to multiple VMs? That might increase your replication licensing cost a little, but it makes replication a lot easier because instead of trying to read and compare 800 GB before things are in sync, you might only need to read in 200 GB if you rebooted just a single VM, which is a lot more manageable. When I bought Double-Take 5 years ago, the price for 5 virtual licenses was about the price of a single physical server running Windows Enterprise. You can tie all of your main and DR servers together via DFS namespaces.
jirikiAuthor Commented:
FYI, after finding a few posts referencing 10TB, adding that to the Google search started to pull up a lot more info.  It seems that almost all the previous limitations are still in effect, but if both the source and target DFS-R servers are 2008 R2, then the total 'supported' replicated data per server is 10TB; that is a supported limit, not a technical one.

The book Data Protection for Virtual Data Centers by Jason Buffington seems to have some good info on DFS-R, DPM, and other tech, with some added real-world tips, raising concerns like replication cache and connection limits over volume size.

I also found a post on the Taiwan TechNet by the DFSR product team, stating that the 10TB limit was based on Microsoft storage team testing on a randomly generated volume (i.e., no compression): 4.4 million files in 798 directories, totaling 9.97 TB, replicated over a local LAN, although no topology or disk info was given.  It took ~8.5 days to replicate, but no info was given on the rate of data change or whether this was a static volume.
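Working backwards from those figures gives the implied sustained rate of that test (a rough check on the numbers quoted, nothing more):

```python
# Implied throughput of the Microsoft test quoted above: 9.97 TB in ~8.5 days.
# Purely arithmetic on the posted figures; no topology or disk info was given.
tb, days = 9.97, 8.5
mb = tb * 1024**2                  # TB -> MB
mb_per_sec = mb / (days * 86400)   # days -> seconds
print(f"~{mb_per_sec:.1f} MB/s sustained")   # about 14 MB/s
```

That's a fairly modest rate for a local LAN, which suggests the bottleneck in their test was DFS-R processing rather than the wire.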

A few other posts mention 12TB in use, and one mentions 26TB, but again no specific details or references.

I'm leaving off with the primary software candidates being DFSR, the SureSync premium package, PeerSync with options, and ViceVersa with the VVengine add-on.  Since we are using a Dell PV 3600, I'm also looking into the add-on features it offers, but would rather keep things hardware-agnostic if possible.  Other software I've discounted due to lack of confidence in communication with reps and tech engineers... either a sales pitch with no backing, or apprehension on their part when data-size specifics are delved into.
jirikiAuthor Commented:
Happened across my own post here and thought I would update on where we actually went.  We ended up going with CA's replication software.  We are syncing ~15TB spread across 4 volumes, 2 of the volumes twice... All 4 are being replicated in real time, split across 2 VMs, each mounting its respective volumes as raw disks.  We then replicate one server's volumes on a scheduled basis overnight.

The software is functioning adequately and meeting our needs.

We backed off of DPM as the information on setup and performance was a bit sketchy (at the time), and I had concerns about server upgrades and cross-breeding of versions over time.  Having a separate product (CA) handle this makes me feel better about future growth and management.  The downside to CA RA is that it has NO versioning capability like DPM or CPS had; it is block level, but it does not keep 'snapshots' of those blocks with any method for the user or the admin to peruse them.