maliksephur

asked on

Best method for redundancy

We have purchased 6 servers...

3 are for the local network.
3 are for a remote segment of the network (redundant - an exact duplicate of the 3 local ones).

This arrangement is designed for a super-redundant system.

The 3 remote servers are planned to be identical 'live' servers in case the building we are in falls down, burns, etc.

THE question is: what is the best method for creating the redundant servers at the remote location, and for maintaining that exact duplicate state?

Thanks
giltjr

How much data are you talking about?  What type of disk subsystem are you planning to use: hard drives in the server, NAS, or SAN?  How far away are the two sites?  What is the "data": flat files or databases?  If databases, what DBMS are you running?
gbirkemeier

I would look into using DFS (Distributed File System).
http://www.microsoft.com/windowsserver2003/techinfo/overview/dfsfaq.mspx
This will let you change the location of a share in an instant should you need to.

Then, mirror the data to the other servers using a program such as this:
http://www.filereplicationpro.com/

Configure the remote servers as BDCs for authentication backup.
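
If you'd rather script the mirroring yourself than buy a product, here is a minimal one-way mirror sketch in Python; the share paths are placeholders, and unlike a real replication product it only copies new and changed files (it never deletes orphans on the target):

import filecmp
import shutil
from pathlib import Path

# Placeholder paths: the local share and a UNC path to its remote twin.
SOURCE = Path(r"D:\shares\data")
TARGET = Path(r"\\REMOTE1\shares\data")

def mirror(src: Path, dst: Path) -> None:
    """One-way mirror: copy files that are new or changed on the source."""
    dst.mkdir(parents=True, exist_ok=True)
    for item in src.iterdir():
        target = dst / item.name
        if item.is_dir():
            mirror(item, target)
        elif not target.exists() or not filecmp.cmp(item, target, shallow=True):
            shutil.copy2(item, target)  # copy2 preserves timestamps

if __name__ == "__main__":
    mirror(SOURCE, TARGET)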
There is a difference between redundancy and synchronization.

Your scenario looks like synchronization of data...
What exactly are these servers, and what are they going to do? Web servers, file servers, DNS servers, etc.?

Let us know.

Regards,
Naren

Actually, DFS will handle replication and failover on its own in case of a system failure.
There is a really GREAT IP trick that you can employ to make this work exactly as you want.
First, do what you are doing -- i.e. make a total clone of one server to the other, or 3 active servers to 3 backup servers -- as long as the hardware is identical from 1-2-3 active to 1-2-3 backup, Windows will never know the difference.  And keep these updated daily, or at whatever interval you want.

Now here is the problem: if the backup servers boot with the same IP domain and network name as the real servers, then Windows will alert you that there is an IP and name conflict on that domain.  If you just change the name, and these backups are something like AD servers, they will still interfere with the main servers, possibly creating conflicts that can break down the network server setup.

BUT !!!!   If you simply put the 1-2-3 backup servers on a different IP domain, even one class C off, then you will almost certainly get no conflict (unless the routers are configured to specifically look across class C IP domains).  So here is an example.

Say your main IP domain is 100.10.1.x and all servers are on this domain.
Then set the backup servers to 100.10.2.x (with X being the same numbers as the 1-2-3 active servers).
Now you can run these backup servers with absolutely no interference with the main ones.  Then when the main 1-2-3 servers fail, you simply change the IP address on the backups to the 1.x domain, and in less than 30 seconds, wow!  They replace the original servers.

Say the main servers are designed to see across class C from 1.x to 128.x -- in that case, you set the backups to be 200.x or something like that (or use the subnet mask) and you are still covered.

This is THE BEST way to have a total duplicate of 3 active servers, ready within 30 seconds notice of being able to replace the original servers.  I have not seen a better system developed, where you do not have to change network name, SID, or anything, all you do is put the IP address out of the current IP domain range, and pull it back to the original when you need to do it.  It is foolproof !!
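
For illustration only, here is a minimal Python sketch of that IP swap, shelling out to the stock Windows netsh utility; the adapter name and all addresses are placeholders, and it has to run with administrative rights on the backup server:

import subprocess

# Placeholder values for backup server 1.
INTERFACE = "Local Area Connection"
PRODUCTION_IP = "100.10.1.11"   # the failed primary's address
NETMASK = "255.255.255.0"
GATEWAY = "100.10.1.1"

def take_over() -> None:
    """Re-address this backup server onto the production subnet."""
    subprocess.run(
        ["netsh", "interface", "ip", "set", "address",
         f"name={INTERFACE}", "static", PRODUCTION_IP, NETMASK, GATEWAY],
        check=True,
    )

if __name__ == "__main__":
    take_over()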
ASKER CERTIFIED SOLUTION
DireOrbAnt
(This solution is only available to Experts Exchange members.)
>Then when the main 1-2-3 servers fail, you simply change the IP address on the backups to the 1.x domain, and in less
>than 30 seconds, wow!  They replace the original servers.

30 seconds in a fantasy world - your suggestion completely ignores relevant issues (i.e. grouchy Windows DNS caches, the assumption of an entirely host [rather than IP] based network configuration, etc.).  Do you actually configure networks professionally?  And for whom?  I'm seriously curious.

>I have not seen a better system developed

Yes, I'm sure this is the way google does it...

>It is foolproof !!

If you say so.

Cheers,
-Jon
Hmm, I wonder how you keep a server that can't communicate with the LAN up to date and synchronised to act as a DR server?

For AD controllers, DHCP servers, DNS servers, etc., just keep both running and your DR is the fact that the other server is already working; you might just need to transfer FSMO roles, etc.

For file servers you could replicate data using DFS etc. or a hardware array solution. In the past I have set up a domain DFS root, for instance, and pointed users to shares under it which point to each server, with the 'backup' one disabled.  In the event of a DR switch, swap the DFS settings around.  At one site I deal with, we put in a poor man's DR using a nightly xcopy script from the live servers to the DR servers covering the essential data people would need from a file server, such as redirected desktops, Lotus Notes and AS/400 session config files, etc. Then, if the login script could not connect a drive to the user's home server, it would map drives to the DR server to get them working and give them an empty home drive to save data in.
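
A sketch of that login-script fallback in Python rather than a batch file (the server and share names are made up), using the stock Windows net use command:

import subprocess

# Hypothetical home-drive shares: live server first, DR copy second.
PRIMARY = r"\\LIVE-FS1\home\jsmith"
FALLBACK = r"\\DR-FS1\home\jsmith"

def map_home_drive(letter: str = "H:") -> str:
    """Map the home drive from the live server, falling back to DR."""
    for share in (PRIMARY, FALLBACK):
        result = subprocess.run(["net", "use", letter, share],
                                capture_output=True)
        if result.returncode == 0:
            return share
    raise RuntimeError(f"could not map {letter} to any server")

if __name__ == "__main__":
    print("mapped to", map_home_drive())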

For Exchange, Notes, etc. you might need to look into clustering or real-time replication of data across the servers.  If incoming mail etc. is involved, you'd have to deal with potentially changing MX records or your firewall; sometimes you are best off having both your 'live' and 'backup' servers operational and accepting mail.

For web servers with static content, just replicate the data periodically and set DNS to round-robin between them, or put a load balancer in front.
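
As a quick sanity check that the round robin is in place, you can list every A record the shared name resolves to (www.example.com stands in for your own name):

import socket

# Each address returned is one mirror behind the round-robin name.
addrs = {info[4][0] for info in
         socket.getaddrinfo("www.example.com", 80, proto=socket.IPPROTO_TCP)}
print("mirrors behind the name:", sorted(addrs))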

For SQL etc. you can have the live server copy its log files to the backup server using a scheduled task every hour or so, together with a nightly DB backup again copied to the DR server.  In a DR situation you can then restore from the SQL backup and replay the log files up to the last hour or so.
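
A minimal sketch of that hourly copy job, assuming SQL Server is already writing .trn log backups into a local folder (both paths are hypothetical); schedule it with the Windows task scheduler, and in a DR event restore the nightly full backup and replay these logs:

import shutil
from pathlib import Path

# Hypothetical folders: local log backups and the DR server's landing area.
LOG_DIR = Path(r"D:\MSSQL\LogBackups")
DR_DIR = Path(r"\\DR-SQL1\LogBackups")

def ship_logs() -> int:
    """Copy any transaction-log backups the DR server does not have yet."""
    shipped = 0
    for trn in sorted(LOG_DIR.glob("*.trn")):
        target = DR_DIR / trn.name
        if not target.exists():
            shutil.copy2(trn, target)
            shipped += 1
    return shipped

if __name__ == "__main__":
    print(f"shipped {ship_logs()} log backup(s)")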

Anyway, I'll shut up now.   Feel free to respond to any of the comments and give some info.

Steve
From my handy dandy Cisco "Designing Content Switching Solutions" book, ...

"In order to minimize the downtime of these mission-critical applications, companies are looking for solutions that can gurantee that these applications stay online, regardless of any situation (natural disaster or link, hardware, or software failures).  One such solution available to customers today is global server load balancing (GSLB).  GSLB can make intelligent decisions by inspecting the IP packet and Domain Name Service (DNS) information and direct traffic to the best-available, least-loaded sites and servers that will provide the fastest and best response."

You asked for the BEST method for redundancy.  Using professional network switching gear, capable of making Layer 7 and down switching decisions, is the BEST method.

We're implementing Cisco CSS 11500s for this purpose: http://www.cisco.com/en/US/products/hw/contnetw/ps792/index.html.  We looked at several vendors and really liked F5's solution, but went with Cisco based on price (when people play in the big leagues, Cisco is actually cheaper than the others).  We looked at Cisco, Radware, F5, and Foundry.

Hope this helps.
There are a few issues that need to be addressed.  What are the servers doing?  What type of failures are you planning for?  Where are the users in relation to the servers (both primary and backup)?

Server failure is actually fairly easy to handle.  The tough part is data replication.  The type of data, the volume of data, and how closely you want the data in sync (1 day behind or "exactly the same") make a big difference.  There are software packages, like the Double-Take product DireOrbAnt mentioned.  Some SANs allow for remote copy of data, where they send updates to the remote box as they do the update.

---> I have not seen a better system developed, ...

IBM z/OS systems running in GDPS and having PPRC can take over "instantly" on failure of an OS, box, or SAN.  I am sure there are other similar methods.  In most systems that are set up for GDPS, 30 seconds is WAY too long to wait for system recovery.  Your suggested solution is what I would expect to see in a small Windows shop that did not care about data being out of sync by a day, or hours.  I concur with The--Captain: manipulating DNS entries is NOT a workable solution; some caching DNS servers will cache results for 24-72 hours no matter what the TTL is for the record.  That is a lot longer than 30 seconds.


--->   It is foolproof !!

I can't remember who said it, but there is a quote that goes something like:

     "When designing foolproof systems , the intelligence of the fool is always underestimated."

I am really curious about what you really do, also.  You indicate that you do "Big IT, HW / SW, ..." but most of your comments seem to be based on small Windows-only environments.
Try making a barebones backup to an external drive that is transported off-site each night.  This way, if something happens, you can just install the image over the respective server (as long as it is the same hardware), and it will be an exact duplicate.

FNBGPPL
EMC® RepliStor® SMB is a software solution to replicate data from one server to another, for redundancy.  I do not think it handles automatic failover (anyone miss the days of Novell and SFT?).  It isn't that much more than backup software.  I haven't used this software myself, but I was researching it when looking at EMC's Retrospect (formerly Dantz).  EMC has some experience in storage management, I would say.