jiriki asked:

Large Volume Replication

Due to Symantec dropping the CPS feature from their Backup Exec product, I'm forced to look into another real-time replication solution. This is file/folder data only; we are not concerned with application or database integration.

We are looking at possibly virtualizing the file servers and having the disks as physical RDMs, and/or potentially breaking the volumes into smaller chunks. Our average daily (8-hour workday) data change (whole file, not block/bit) is ~150 GB (~50 GB on the Critical 4 TB volume and ~100 GB on the Important 4 TB volume). Our storage is on either 8 or 4 Gbps Fibre Channel, and to the offsite location we have a 1 Gbps connection (which I can increase if needed).
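To put rough numbers on that change rate (a quick sketch only; the 1.3x protocol/metadata overhead factor is just an assumption, not a measurement):

    # Rough replication-bandwidth estimate for the change rate above.
    # Assumes changes are spread evenly over the 8-hour workday and a
    # 1.3x overhead factor (an assumption, not a measurement).
    daily_change_gb = 150        # ~50 GB Critical + ~100 GB Important
    workday_hours = 8
    link_mbps = 1000             # 1 Gbps offsite link
    overhead = 1.3

    avg_mbps = daily_change_gb * 8 * 1024 / (workday_hours * 3600) * overhead
    print(f"Average replication load: ~{avg_mbps:.0f} Mbps of {link_mbps} Mbps")
    # -> roughly 55 Mbps on average, well under 10% of the 1 Gbps link,
    #    though bursts from large file saves will momentarily need far more.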

I really want a one-way sync to both a local and an offsite volume, syncing changes in real time along with NTFS ACLs. This way, if the server goes down, I can redirect the DFS link to the second server that hosts the onsite replica. If the onsite storage goes down (i.e. both the primary and the onsite replica are unavailable), I can redirect DFS to the offsite server hosting its replica... this way there is very little downtime.
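The failover order I have in mind is really just an ordered list of DFS folder targets. A minimal sketch of the selection logic, with hypothetical server names (in practice this ordering is set via DFS namespace target priority, not custom code):

    # Sketch of the intended failover order only; server names are hypothetical.
    targets = [
        r"\\FS1\share",      # primary file server
        r"\\FS2\share",      # onsite replica
        r"\\FS-DR\share",    # offsite replica
    ]

    def pick_target(reachable):
        """Return the first reachable target in priority order."""
        for t in targets:
            if t in reachable:
                return t
        raise RuntimeError("no replica reachable")

    # Example: primary and onsite storage are both down.
    print(pick_target({r"\\FS-DR\share"}))   # -> \\FS-DR\share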

I'm looking into DFSR on 2K8R2 and DPM, as they would be the cheapest solution, but right off the bat, DPM looks to 'hide' its replica and isn't really intended to be re-shared, although it's great for snapshots and end-user recovery via the VSS Previous Versions option. DFSR looks to be a solution, but the 8 TB total and 8 million file limits concern me, particularly with future expansion. Of our 8 TB, roughly 6 TB is in use. We also have a few smaller 500 GB - 1 TB volumes which are about 80% full. Both seem to require pretty heavy storage at the replicated site, and there seems to be a pretty heavy CPU/memory requirement, I'm assuming for the indexing and delta-change calculations.
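As a rough sanity check against those limits (the data sizes are the approximate figures above; the file count is purely an assumption, since I haven't counted them):

    # Headroom check against the DFSR figures above (8 TB replicated data,
    # 8 million files).  Sizes are rough numbers from this post; the file
    # count is an assumption for illustration only.
    limit_tb, limit_files = 8.0, 8_000_000
    in_use_tb = 6.0 + 1.2        # ~6 TB on the big volumes + smaller volumes at ~80% full
    est_files = 5_000_000        # assumed, not counted

    print(f"Data:  {in_use_tb:.1f} of {limit_tb:.0f} TB "
          f"({limit_tb - in_use_tb:.1f} TB headroom)")
    print(f"Files: {est_files:,} of {limit_files:,} "
          f"({limit_files - est_files:,} headroom)")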

Does anyone have experience with a similar setup using DFSR/DPM or a third-party product like Double-Take, FileReplication Pro, PeerSync, etc.? Any major pitfalls with my scenario? I'm in the process of setting up a test network, but working with 2-3 TB data chunks is long and laborious, so I'm looking for pitfalls, 'don't go there' warnings, or 'hey, this worked for us' stories, so I spend as little time spinning or re-inventing the wheels as possible.

Thanks,
ASKER CERTIFIED SOLUTION
kevinhsieh (United States)
(solution text available to members only)

jiriki (ASKER)

I just read a few posts clarifying that MS doesn't recommend more than a total of 1 TB of replicated data per server. The hard limit of 8 TB is based on the Jet database limits, but they have only tested and support up to 1 TB.

So do you successfully replicate all 2.3 TB of your current data, and is it on a single volume or spread across multiple? If the latter, what's the largest volume?

We also had one of our central IT personnel mention that MS is going to deprecate DFS-R, which is not encouraging if true; I'm asking for a source on that :(
None of my volumes are 1 TB; my largest is 905 GB. To say that I get it all replicated is a little bit of a stretch. I have a replication group that isn't healthy, and I am going to break it up into smaller groups, but I haven't done that yet. I do have multiple replication groups on a volume, and some groups include multiple volumes (at least for now, probably not after I break it up).

I have heard nothing about deprecating DFS-R, and I doubt that it is true, since Microsoft is trying to get people to move from FRS to DFS-R for SYSVOL replication. To ditch DFS-R and FRS would require all DCs to be Windows Server 8+. I have not heard about any replacement for DFS-R, though there could be something in Windows Server 8, since it has capabilities for replicating large files like VMs.
jiriki (ASKER)

Thanks for the details, very helpful. That gives me info to go on when I'm finally able to get some large-scale testing done once my storage arrives.

I also heard back from the ViceVersa rep, and they are concerned that the amount of data and daily changes I have are too much for their product, backing up what you initially mentioned as your experience.

Regarding DFS-R deprecation, I've confirmed that the central IT person didn't know what they were talking about and had confused FRS with DFS-R as one and the same; that doesn't breed much confidence in everything else stated there, since it was a meeting we had specifically requested for their expertise on the subject.
With a 1 Gbps WAN link, you should have plenty of bandwidth for replication, especially considering that it's the link speed on my servers and SAN... heck, I can move over 100 GB overnight on an 8 Mb connection with Riverbed.
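To put numbers on that Riverbed example (a rough sketch; the 12-hour "overnight" window is an assumption):

    # Why 100 GB overnight on an 8 Mbps link only works with WAN optimization.
    # The 12-hour "overnight" window is an assumption.
    data_gb = 100
    link_mbps = 8
    window_hours = 12

    raw_hours = data_gb * 8 * 1024 / link_mbps / 3600
    reduction = raw_hours / window_hours
    print(f"Raw transfer: ~{raw_hours:.0f} h; needs ~{reduction:.1f}x "
          f"dedup/compression to fit in {window_hours} h")
    # -> ~28 h raw, so roughly 2.4x effective data reduction to finish overnight.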

Can your storage do replication?

If your storage is fast enough, Double-Take may work for you. As I said, the rate of change shouldn't be a problem at all. The challenge is checking all of the data after a reboot of the primary server.
jiriki (ASKER)

Yeah, the storage can, but the price tag... I'm currently seeing whether it's worth it as a sunk cost, since it's tied to that specific hardware (Dell MD3600f), versus investing in a hardware-agnostic software solution, which is my long-term cost-savings thinking; but our volume sizes may prevent it.
Can you break up your data sets across multiple VMs? That might increase your replication licensing cost a little, but it makes replication a lot easier, because instead of trying to read and compare 800 GB before things are in sync, you might only need to read in 200 GB if you rebooted just a single VM, which is a lot more manageable. When I bought Double-Take 5 years ago, the price for 5 virtual licenses was about the price of a single physical server running Windows Enterprise. You can tie all of your main and DR servers together via DFS namespaces.
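Rough numbers to illustrate the point (the 100 MB/s sequential read rate is an assumption, not a measured figure):

    # Rough re-verification time after a reboot, to show why smaller per-VM
    # data sets re-sync faster.  The 100 MB/s read rate is an assumption.
    read_mb_per_s = 100

    for size_gb in (800, 200):
        hours = size_gb * 1024 / read_mb_per_s / 3600
        print(f"{size_gb} GB re-check at {read_mb_per_s} MB/s: ~{hours:.1f} h")
    # -> ~2.3 h for 800 GB vs ~0.6 h for 200 GB per VM.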
jiriki (ASKER)

FYI, after finding a few posts referencing 10 TB, adding that to my Google search started to pull up a lot more info. It seems that almost all the previous limitations are still in effect, but if both the source and target DFS-R servers are 2008 R2, then the total 'supported' replicated data per server is 10 TB, and that is a supported limit, not a technical one.

The book Data Protection for Virtual Data Centers by Jason Buffington seems to have some good info on DFS-R, DPM, and other tech, with some added real-world tips, raising concerns like replication cache and connection limits over raw volume size.

I also found a post on the Taiwan TechNet by the DFSR product team, stating that the 10 TB limit was based on MS storage team testing on a randomly generated volume (i.e. no compression): actually 4.4 million files in 798 directories and 9.97 TB, on a local LAN, although no topology or disk info was given. It took ~8.5 days to replicate, but no info was given on the rate of data change or whether this was a static volume.
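Back-of-the-envelope from the numbers quoted in that post, the implied average throughput is only about 14 MB/s (nothing more than arithmetic on the figures given):

    # Average throughput implied by the test quoted above:
    # ~9.97 TB replicated in ~8.5 days on a local LAN.
    data_tb = 9.97
    days = 8.5

    mb_per_s = data_tb * 1024 * 1024 / (days * 86400)
    print(f"~{mb_per_s:.1f} MB/s (~{mb_per_s * 8:.0f} Mbps) average initial sync")
    # -> ~14 MB/s, well below gigabit line rate, so presumably staging/disk,
    #    not the network, was the limiting factor in that test.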

A few other posts mention 12 TB in use, and one mentions 26 TB, but again with no specific details or references.

I'm leaving off with the primary software candidates being DFSR, SureSync (premium package), PeerSync with options, and ViceVersa with the VVengine add-on. Since we are using a Dell PV 3600, I'm also looking into the add-on features it offers, but I would rather keep it hardware-agnostic if possible. Other software I've discounted due to a lack of confidence in communication with reps and tech engineers... either a sales pitch with no backing, or apprehension on their part when we delve into data-size specifics.
jiriki (ASKER)

Happened across my own post here and thought I would update on where we actually went. We ended up going with CA's replication software. We are syncing ~15 TB spread across 4 volumes, 2 volumes twice... all 4 are being replicated in real time, split across 2 VMs, each mounting its respective volumes as RAW. We then replicate one server's volume on a scheduled basis overnight.

The software is functioning adequately and meeting our needs.

We backed off of DPM, as the information on setup and performance was a bit sketchy (at the time), and I had concerns about server upgrades and cross-breeding of versions over time. Having a separate product (CA) handle this makes me feel better about future growth and management. The downside to CA RA is that it has NO versioning capability like DPM or CPS had; it is block-level, but it does not keep 'snapshots' of those blocks with any method for the user or admin to browse them.