cgeorgeisaac

asked on

Best technology to keep two Windows Servers in Sync

We have 2 media servers (with video, media files, and unstructured data). The server holds about 100 TB of data, spread across about 15 volumes. We would like to keep both of these servers in sync when new data is written, downloaded, or uploaded. What is the best technology to keep both of these Windows 2019 servers synced in a VMware environment? I will appreciate the experts' advice.
strivoli

I would consider the "official" way: DFS Replication

ASKER

Thank you for your prompt response, strivoli. I had tested DFS-R earlier in my test lab and found that files that are open/not yet saved do not get replicated, and at times it generates errors. If there is no better technology, I guess I will have to fall back on DFS-R.
Better than that, there is only Storage Replica, which works at a lower level than the OS: basically the two storages are synced regardless of the OS you have on top.
It depends a bit on the change volume. 100 TB sounds like a storage device such as a NetApp or something similar.
And it sounds a bit like a film, TV, or broadcasting studio.
So you have to be aware of the amount of data that changes per day.
The fastest way is usually to mirror it directly on the storage. As soon as a network adapter is involved, you have your first limitation. A 1 Gbit adapter has a net transfer rate of about 75 MB/s, under the precondition that there is no other traffic. A traditional hard disk manages around 120 MB/s, an SSD around 500 MB/s.
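To put very rough numbers on that, here is a back-of-the-envelope sketch (plain Python, purely illustrative; real throughput will be lower once protocol overhead and concurrent load are factored in) of how long an initial full copy of a 100 TB data set would take at the rates mentioned above:

```python
# Back-of-the-envelope only: time to push ~100 TB at a sustained rate.
TB = 10**12  # decimal terabyte, in bytes

def hours_to_copy(data_bytes, rate_mb_per_s):
    """Hours needed to move data_bytes at rate_mb_per_s megabytes per second."""
    return data_bytes / (rate_mb_per_s * 10**6) / 3600

dataset = 100 * TB
for label, rate in [("1 GbE, ~75 MB/s net", 75),
                    ("single HDD, ~120 MB/s", 120),
                    ("SSD, ~500 MB/s", 500),
                    ("10 GbE, ~750 MB/s net", 750)]:
    h = hours_to_copy(dataset, rate)
    print(f"{label:<22}: {h:7.1f} h  (~{h / 24:.1f} days)")
```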
So the second question would be: what is the current load (without the media servers)? That gives you a clue how much bandwidth you need for daily business.
The next question: what is the main idea behind the "sync" requirement, i.e. redundancy, data security, service availability, etc.?

It is usual to split this up into two parts: the service, which makes your data available to the users, and the data storage itself.
Both are usually redundant, meaning you have redundant servers as well as redundant storage. So whenever one part of the system fails, the other part can take over the service.

Modern storage systems are able to mirror their own content via a dedicated network adapter. But this also means double costs for the storage. This technique is also used for low-level backup.
You can also mirror the content inside one single storage system across two spindles, but this is not completely redundant: if the storage system has a problem, the service is down.
Direct mirroring also has the drawback that all errors on the source are replicated to the target.

I guess you may be thinking about an intelligent storage device/system like NetApp or similar. They are built for a lot of use cases, they can do a lot, and they do it fast.
Your VMware is more or less out of scope, or only responsible for the servers that offer the service.
Synchronizing the data via the VMware server may affect all your other services as well.


 
Your VMware is more or less out of scope, or only responsible for the servers that offer the service.
Synchronizing the data via the VMware server may affect all your other services as well.
+1
You mention, "thank you for your prompt response strivoli.    I had tested DFS-R earlier in my test lab and found files that are open/ not saved does not get replicated and at times generate errors.  If there is no better technology I guess I will have to fall back on DFS-R."

This is how backups work.

Anytime part of data lives in memory buffers + part on disk, only the disk part can be synced.

The way I do this is using rsync (because rsync is fast + works); a minimal sketch of the sequence follows the numbered steps below.

1) Do an rsync of disk.

2) Temporarily stop all services using memory buffers, which will mainly include file services + SQL services.

3) Do a 2nd rsync of disk.

4) Restart all services stopped in #2.

Since rsync transfers file deltas for changed files, the 1st rsync might take a long while to run, while the 2nd rsync usually completes in only a few seconds.

5) When I stop/pause services, I usually rewrite the MOTD (message of the day) or similar message files, to notify users that the system is undergoing maintenance.

Since this message will only occur during a few seconds of outage, building in user notification of outages might be overkill in your situation.
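For illustration only (this is not David Favor's actual tooling), a minimal Python sketch of that two-pass sequence could look like the following; the paths and service names are placeholders, and it assumes an rsync binary is available on the Windows hosts (e.g. via cwRsync or WSL):

```python
"""Sketch of the two-pass rsync approach described above.
Host names, paths, and service names are placeholders; adjust for your setup."""
import subprocess

SOURCE = "/mnt/media/"                      # hypothetical source path
TARGET = "backup-host:/mnt/media/"          # hypothetical rsync target
SERVICES = ["LanmanServer", "MSSQLSERVER"]  # example services to quiesce

def rsync():
    # -a preserves metadata, --delete mirrors deletions on the target
    subprocess.run(["rsync", "-a", "--delete", SOURCE, TARGET], check=True)

def set_services(action):
    # 'net stop' / 'net start' are the standard Windows service commands
    for svc in SERVICES:
        subprocess.run(["net", action, svc], check=True)

rsync()                     # 1) long first pass while services are still up
set_services("stop")        # 2) quiesce services that hold data in memory
try:
    rsync()                 # 3) short second pass picks up only the deltas
finally:
    set_services("start")   # 4) restart services even if the sync fails
```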
Many thanks for the great, in-depth, well-explained advice, strivoli, Bembi, David Favor, and ste5an.
Let me explain what I intend to do.

Currently I have only one server, which hosts both the IIS web application (200 GB) plus the media storage (100+ TB in 15 volumes). We are using a Nimble HF60. We have a Fibre Channel connection with excellent bandwidth.

The plan: separate the IIS web application and the media storage for full 100 percent FT/HA/redundancy. The priority is to focus on FT.

Front end: create 3 Windows 2019 servers to run only the IIS/web application. These servers have only 200 GB each, as mentioned earlier.
Question: is there any technology that could be used to direct users to any of these servers and at the same time keep the servers in sync?
If there is no load balancer technology from Windows, I may introduce an external load balancer, e.g. F5 or Fortinet, to create a VIP and direct the users to this front-end pool.

After the user is connected, they need to be directed to the back-end pool (comprising the 2 media storage servers explained below).
Can I use DFS-N to point the users to the folder links and folder targets? Or can the load balancer do this for me?

Back end: 2 media storage servers to hold the 100 TB of media files. Currently I have only 1 media storage server.
- The storage used is a Nimble HF60.
- The 100 TB of storage sits on about 15 volumes.
- 14 volumes hold read-only data.
- 1 volume (about 10 TB) is the only volume that will be written to.

Reason: for fault tolerance (not just HA), I need continuous availability. Both storage servers should be in active/active mode, so that in case one goes offline, the users/data deltas are seamlessly moved to the other one.

Question: is there any technology that can help me keep both of these media storage servers in sync? I am willing to use any storage or other on-premises technology. (I will be introducing Azure File Sync later.)

Thank you.
 


                   
                
Hello,
If you have more than one front-end server, you of course need a load balancer. You can use Microsoft NLB, but Fortinet or F5 would be the much better choice, especially if you already have one.
As for keeping the front ends identical, I guess that is more a patching-sequence question. They can work side by side.

I do not really see the need for back-end servers (with the exception of DFS), as you can attach the storage directly to your front ends.

To make it completely redundant, you may need a second HF60, and possibly two storage controllers (to be redundant) if the HF60 controllers cannot handle the redundancy themselves.
There is not really a lot written about the HF60 device on the HPE page (more advertising), but what I can see is that all components (HW/SW) inside the HF60 are redundant, there is a built-in storage controller, and the file system uses triple drive parity + intra-drive parity, which tolerates three simultaneous disk failures.
There is even an internal backup feature, which may be used to mirror the data.
So possibly the current HF60 can already fulfill your needs, with the exception of geo-clustering.
But I would address this question directly to HPE.

Rather than setting up two additional servers (which act more or less as storage controllers) and using Windows functionality (like DFS), you should find out what is built into your HF60 device. It doesn't make sense to me to put a storage controller in front of an existing storage controller; it is just another point of failure.

If what HPE writes is accurate, then the question is how much money you want to spend to raise the availability by a few decimal points. Maybe add a second backup option and store the backup at a different location, for the case of fire etc.


Let me add one point, as you asked.
If you attach your storage directly to several front ends, it doesn't matter which front end the users use, as they all see the same storage.
This is why I do not see the advantage of using DFS. You read and write to the same storage; this can be done directly by the HF60.
DFS is used more to keep two different file systems in sync, mainly hosted inside the servers. But you have only one storage system.
Excellent in-depth insight, Bembi. Many thanks.
 
You do have a valid point and question: why am I separating the IIS/web application from the storage?
Answer: the reason is to minimize the number of failure points. If I have both IIS + storage built into one server and there is a problem with the IIS/web application, users will not be able to use the storage part either.
Also, I have provision to create only 2 storage volumes, i.e. 100 TB x 2, NOT 100 TB x 3.
But technically, yes, both servers are connecting to the same volume!

Based on your advice I did contact HPE, and they do offer a snapshot replication & Peer Persistence/synchronous replication option at the volume/block level (NOT the file level); but as you rightly pointed out, it does require 2 Nimble HF60s. At the same time, it is possible we may purchase another HF60 in the future, deploy it in another local data center close by, and have synchronous replication or Peer Persistence with full redundancy.

However, in my case I want both storage servers in sync continuously, with 100 percent fault tolerance, today.
I hope this makes sense. Maybe I am mistaken.


Since HPE does not provide this type of solution, I am forced to think that DFS-R (file level) or Storage Replica (block level, but host-based rather than array-based) may be the better option just to keep the storage in sync and also provide complete fault tolerance for users. I did test out Storage Replica, and it does not have a very user-friendly interface like DFS-R does.

Thanks again. 

Hello,
just a picture...
1.) The first picture is a classic view of a fully redundant service/storage system, which can even be geo-redundant.
Every server and every device is connected to every other device. The storage controllers take over the job of load-balancing the traffic between the two storage devices and keeping them in sync. Some storage devices also have the option to exchange data over a separate dedicated network connection, i.e. to mirror from one storage system to the other.
This is the Mercedes of performance and redundancy.

2.) The second picture is your proposal. In general the right path, but I question the advantage.
Your HF60 device even has a redundant storage controller built in.
At least that is what is written on the HPE web site: everything (HW/SW) is redundant.
The question is just whether your device really has this by default, or whether you have to pay extra; they merely describe the capability.
So if the HF60 has a built-in redundant storage controller, the only reason to put additional controllers in front of it is that the built-in controller is missing functionality, or that you really want to use DFS.
Also keep in mind that the Windows storage servers should not be virtualized, as otherwise they consume at least part of the resources of your productive environment.

3.) The third picture is my proposal, assuming the storage controller inside the HF60 is redundant, and assuming you can realize what you intend to do using these controllers. My idea is based purely on the HPE description that you have redundant controllers; granted, DFS is missing.

In all cases, the front ends have nothing to do with storage management. In all three scenarios a separate storage controller is used. The front ends are just attached to the storage controller, which presents itself as an iSCSI device or something similar. So the front ends do not really see what is behind it; it just acts like a hard disk.

Redundancy has two aspects:
- failover (meaning half of all devices sit idle) = dead investment; you pay for hardware that does nothing until a failure.
- real load balancing (both nodes are used and active).

But even if the HF60 offers only active/passive failover, it fulfills the need for availability. As you currently have only one storage device, and the controller is fast enough (which I would assume), it is all you need.
If you really have redundant storage devices, two separate load-balanced storage controllers make sense.

Yes, Windows has the easier user interface, but storage vendors exist because Microsoft lets them. Microsoft offers basic functionality for the masses but leaves the special cases to the specialists. And these specialists develop solutions that sell because they have the functionality that Microsoft does not offer.

DFS follows the same argument. It was originally built for AD SYSVOL replication; later MS decided to offer it for general file replication as well. But...

DFS is file-based = overhead = slow, though of course this also depends on the underlying hardware.
DFS works in the background.
DFS is queue-based; that means any change is written to a local staging area (on disk) and then distributed to the targets.
The staging area is also used to maintain local copies and avoid unneeded replication traffic; the mechanism that decides whether a file needs replication is based on it. If the staging area is full, older files are purged.
That is not interesting for new files, as they have to be replicated anyway, but this is the mechanism by which it works.
So besides the real storage, you need additional space for the staging area, and this is sized from the largest files you have as well as from the daily change rate.
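To make that sizing concrete: a commonly cited minimum for the staging quota of a read/write DFS-R folder is the combined size of the 32 largest files in it. A small sketch of that calculation, with a hypothetical folder path:

```python
# Estimate a DFS-R staging quota as the combined size of the N largest files
# in the replicated folder (32 is the commonly cited minimum for read/write).
# The folder path below is a placeholder for your own replicated folder.
from pathlib import Path

def staging_quota_gib(folder, n_largest=32):
    sizes = sorted((p.stat().st_size for p in Path(folder).rglob("*") if p.is_file()),
                   reverse=True)
    return sum(sizes[:n_largest]) / 1024**3

folder = r"D:\Media\Active"   # hypothetical replicated folder
print(f"Suggested minimum staging quota: {staging_quota_gib(folder):.1f} GiB")
```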
 
In my mind it is a solution for smaller loads and especially smaller files.

Block-level data transfer needs more space, as it copies the allocated disk blocks from one side to the other. It is more or less a physical/binary copy mechanism; it reads/writes directly from/to the disk.
File-based transfer is more efficient in terms of the space used, but due to the overhead it does not have the best performance.
The operating system's file system API has to be used.
 
My DFS setup has a replicated volume of roughly 1.2 TB, of which 900 GB is static stuff like pictures or installation files.
A full synchronization takes several days (5-7). OK, traditional hard disks in a RAID array on one side and 1 Gbit between the servers, but it still takes much more time than you would expect from the raw numbers. I use it just as a kind of online backup, so performance is not my issue.
Another possible factor: the file is always written twice, first into the staging area and then to its final location.

One word about HPE: they are also a generic supplier, and most of the stuff they sell has only the label in common with HPE. You just pay a little bit more for the "original" spare parts. Some companies simply stick with HPE, which is why they offer everything.

Summary: I do not want to say that your idea is wrong; I just want to put some additional aspects into your mind.
For example, that your idea of the Windows storage controllers is mainly DFS-driven.
In the end, you have to prioritize all of these against your needs, and that includes ease of configuration and maintenance.
 
It is possibly worth setting up a test system with two servers and a DFS replication agreement. Then move some typical files to the first (I mean some of your larger files) and see what happens on the second. If you are happy with the performance, everything is fine, but I would expect otherwise.
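If you run such a test, a tiny script along these lines can time how long a file takes to appear on the second member; the source file and the two UNC paths are placeholders for your own test shares:

```python
# Minimal timing sketch for such a DFS-R test: drop a file into the replicated
# folder on server A, then poll server B until a file of the same size appears.
# The file and both UNC paths are placeholders.
import os, shutil, time

SRC_FILE = r"C:\Temp\sample_video.mp4"     # a typical large test file
FOLDER_A = r"\\SERVER-A\MediaTest"          # replicated folder, member A
FOLDER_B = r"\\SERVER-B\MediaTest"          # replicated folder, member B

name = os.path.basename(SRC_FILE)
size = os.path.getsize(SRC_FILE)

start = time.time()
shutil.copy2(SRC_FILE, os.path.join(FOLDER_A, name))       # write to member A

target = os.path.join(FOLDER_B, name)
while not (os.path.exists(target) and os.path.getsize(target) == size):
    time.sleep(5)                                            # poll member B

print(f"Replicated {size / 1024**2:.0f} MB in {time.time() - start:.0f} s")
```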

 
Fantastic, excellent analysis, Bembi. Many thanks!
I agree with your viewpoints and basically everything you mentioned. I enjoyed reading it a few times. Very professional approach; I really appreciate you taking the time to put that together.
However, I just wanted to present my architecture and if I may, would request your views when convenient.
(architecture diagram attached)
Hi,
Following your picture, you are reading the data from the storage and writing it back to the same storage.
The limiting factor is the network link. With a 10 Gbit link, the throughput is around 750 MB/s. That is the speed of a spindle with 7 traditional hard disks in an array. SSDs would not give an advantage there, as they are much faster than the network link.
I mean, you have two spindles with 7 drives anyway (as you said, 15 is the maximum). If SSDs are used, block-level replication can write about 3,500 MB/s; with traditional disks the throughput is more or less the same as the link (750 MB/s).
What are your concerns about the block-level replication?
Why do you need triple the space?

For the other points, give me some time, just to put them into a logical order. 
 
Many thanks again, Bembi!
True, technically we are writing to the same storage. We have a 40 Gb FC pipe, so thankfully we have the advantage of screaming-fast network connections. The HF60 is SSD.
 
1. In the front-end pool, 3 servers are provided to give adequate room for user connections with full FT, although technically it all lands on the same volume/LUN.

2. Concerns about block level, in my architecture:
    (a) I can replicate/provide full redundancy only up to the volume/LUN level from the Nimble HF60 (since 2 storage arrays are required).
    (b) How can I keep the 2 storage servers (back-end pool) in sync at the file level? The HPE Nimble HF60 does not offer any technology for that.
    (c) The only solution to keep the folders/files on the storage servers in sync is DFS-R or Storage Replica.
    (d) I understand that both of these storage servers sit on the same volume/LUN, connecting to the same storage controllers (in failover, active/passive mode). But it helps to ease or balance the load, to say the least, and provides full FT in case one of the storage servers goes down.

With that being said, I will seriously consider your solution, i.e. #3 in your diagram, till I get a new HF60 to provide full redundancy.
Will wait for your feedback please, whenever convenient.  
Kind regards.

Have you looked at Windows Storage Replica? Though it seems to require the Datacenter edition of Windows Server 2019.
ASKER CERTIFIED SOLUTION
Bembi
I am positive I am closer to a very reliable solution based on your inputs, Bembi. I will get back after absorbing your valuable thoughts, and once again I highly appreciate your detailed analysis and feedback. Thank you.
Bembi: thank you so much for your out-of-the-box analysis and thoughts. It was truly helpful. We are re-architecting the infrastructure and taking a good deal from your valuable advice and thoughts. I am sorry I took a rather long time to respond, because I tested a few other technologies in the interim period: DFS-N, DFS-R, Storage Replica, Azure File Sync, VMware FT, load balancing, etc., and then had to come to a final decision. For now we will focus on the 2 storage arrays (Peer Persistence/replication/DR), with an eye on the Azure cloud (Azure File Sync) as well. We planned our CE and we are going ahead. Best wishes.