backup plan


Hello there,

I have a machine on which I have set up my ERP application. The application uses Java and a SQL Server 2005 database, and the OS is Windows Server 2003. Now I want to set up another machine to act as a backup, so that if the live machine providing the service breaks down for some reason, the second machine takes over. What do I need, and how can I set up such a backup? Please help.

cheers zolf
 
SelfGovern Commented:
There are several ways to do this, depending on what your needs are for *recovery time*, and *recovery point*.   Recovery time is essentially, "how long can I afford to be down?" and could be anywhere from zero to minutes to days.  Recovery point is "how much data can I afford to lose?" and could be none, only the last transaction, some number of transactions, an hour's worth, a day's worth, etc.

The other question to ask is, "What level of disaster do I have to prepare for?"   Is it enough to plan for the server going down?  Do I have to plan for the server room being under water or losing power?  Is theft an issue?  How will I protect against application corruption, user error, viruses, or malicious users?

Typically, the quicker the recovery time, the more expensive the solution.  Likewise, a solution with a more recent recovery point will cost more than one that allows more data to be lost.
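
To make the trade-off concrete, here is a minimal Python sketch that checks a candidate plan against a recovery point and recovery time target. The plan names, intervals, and restore times are illustrative assumptions, not measurements; plug in your own numbers.

```python
from dataclasses import dataclass


@dataclass
class RecoveryPlan:
    name: str
    backup_interval_hours: float  # time between backups or images (assumed)
    restore_hours: float          # estimated time to get service back (assumed)


def worst_case_data_loss_hours(plan: RecoveryPlan) -> float:
    # A failure just before the next backup loses one full interval of work.
    return plan.backup_interval_hours


def meets_objectives(plan: RecoveryPlan, rpo_hours: float, rto_hours: float) -> bool:
    # The plan is acceptable only if both the data-loss and downtime limits hold.
    return (worst_case_data_loss_hours(plan) <= rpo_hours
            and plan.restore_hours <= rto_hours)


if __name__ == "__main__":
    plans = [
        RecoveryPlan("nightly tape, stored off-site", 24, 48),
        RecoveryPlan("hourly image to second machine", 1, 2),
    ]
    for plan in plans:
        print(plan.name, meets_objectives(plan, rpo_hours=4, rto_hours=8))
```

Writing the targets down as numbers first usually makes it obvious which of the options are even worth pricing.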

We could fix you up with a solution that would guarantee you'd lose no data and stay online even if your building blew up, through a combination of clustering and mirroring... but your 'stuff' might not be that critical -- and you might not be able to afford it.

From simplest and typically cheapest to more complex and expensive, your options will look something like this:

1. Just run a tape backup, store the tapes off-site.  Inexpensive, risks no more than a day of lost data, and some amount of down time (hours to days, depending on how long it takes to get a replacement server). Good against most of the possible disasters listed.

2. Use Acronis or a similar program to copy a restorable image to a second machine.  Pretty inexpensive (especially if the other machine was already in use and is capable of handling the additional load).  This protects against the failure of a single machine, but not against theft or natural disaster.  It may or may not protect against viruses, mistakes, and malice, depending on how many generations of images you keep (a virus might do damage for several days before you see its devastating effects; if you only have one day's image to restore from, you're out of luck).  Because of this, a tape backup is still important.  Recovery point depends on how often you take images -- the more often, the less data lost, but the more space you use, and the more you might affect your system performance.  Recovery time is usually pretty quick -- but TEST to make sure you can do it when you have to (true with any of these solutions!).

3. Use a product like HP's Storage Mirroring that will continuously copy byte-level changes from one machine to another.  The second machine may be located a considerable distance away, and can be configured to automatically take over when the first system fails or goes offline.   This solution has a great recovery point, a very good recovery time, and a high degree of automation.  But it does not protect against viruses or malicious acts, since a malicious change or deletion will also be mirrored to the other site, so a tape backup is still important.  http://h18006.www1.hp.com/products/storage/software/sm/index.html?jumpid=reg_R1002_USEN

4. As Discusfish says above, there are some good solutions that can be crafted around VMware, Hyper-V, and similar products.  You can tailor a solution to meet your recovery point and recovery time objectives, with the cost increasing as the downtime and lost transactions decrease. These solutions are not backup, but they protect against HW failure of the primary server.

5. To protect against a single machine failure with no or almost no lost transactions, you can use a cluster -- two servers that act as one, either of which is able to take over for the other.  Microsoft Clustering and HP's PolyServe are two examples.  Like the Storage Mirroring solution, this doesn't protect against a virus or deletion.  It also typically requires the machines to be close to each other, so they would be subject to simultaneous outage through fire, flood, theft, power outage, etc.   Traditional backup with remote storage of backup media is still very important.

6. Past that, you get into metro clusters, continuous access, and other solutions that protect from almost any single-site disaster... but these solutions get to be quite expensive, and I won't go into them unless you have the budget to consider them.
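
The point in option 2 about keeping several generations of images can be sketched in a few lines of Python. This is only an illustration: the `*.img` pattern and the directory layout are assumptions, and an imaging product will normally have its own retention settings that you should prefer.

```python
from pathlib import Path


def prune_generations(image_dir: Path, keep: int, pattern: str = "*.img") -> list[Path]:
    """Delete all but the `keep` newest image files; return the deleted paths."""
    if keep < 1:
        raise ValueError("keep at least one generation")
    # Newest first, judged by file modification time.
    images = sorted(image_dir.glob(pattern),
                    key=lambda p: p.stat().st_mtime,
                    reverse=True)
    stale = images[keep:]
    for path in stale:
        path.unlink()
    return stale
```

Keeping, say, 7 or 14 generations instead of 1 is what lets you go back to an image taken before a virus or a bad change crept in.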

Final notes:
1) RAID is not the answer to your question.  RAID only protects from a disk failure.  If the machine goes down, you're down.   If something happens and your data gets corrupted or deleted, it's all gone.  RAID is a solution to one possible HW failure, it is not a high-availability solution in itself, and it is not backup.

2) Few of the solutions above -- until you get into clustering, perhaps -- require Fibre Channel SANs.  FC SANs are great for some things: sharing storage at a single location, making backup much easier in some ways, high throughput, low latency, and facilitating clustering.  But they're also expensive, and can be pretty complex until you get the necessary knowledge and experience to implement them correctly.
 
Discusfish Commented:
If you want a very speedy recovery, your best bet will be to learn about and deploy a virtual machine architecture.
Otherwise, you should be able to restore from a normal tape/disk backup to a new machine.

To make it really fault-tolerant, you're going to want to look into SANs too.
 
senad Commented:
What you are talking about is called RAID.
It does not need another machine, just a RAID controller in the current machine and another disk (preferably the same size, and no smaller than the main one).
You can learn all about RAID here:
http://en.wikipedia.org/wiki/RAID

 
Discusfish Commented:
RAID only applies to the disk subsystem, not the entire machine going down.
 
zolf (Author) Commented:

>>to learn about and deploy a virtual machine architecture.
Can you please provide more details on this?

>>To make it really fault-tolerant, you're going to want to look into SANs too.
If you can provide more details on this as well, that would be good.
 
Discusfish Commented:
VMs allow you to create a computer-within-a-computer - and, being virtual, you can move them around between different hosts - have a look at VMware (ESXi) as an example vendor.

A SAN (storage area network) is a networked and very redundant storage solution - in concert, VMs (which should be configured to store their files on the SAN) and SANs will enable you to build a high-availability solution - whether you have the budget to implement this, however, is a different matter! A well-respected vendor here is EMC.

There are solutions from VMware that allow you to fail over machines in the time it takes to transmit a single packet. They're pricey!

If you want to do something like this more modestly, you could look at running a machine with a hypervisor on it (the "container" for a virtual machine) with a RAID array, which provides some fault-tolerance in the disks, and regularly copying the virtual machine files to another machine - if the first fails, you can start up a hypervisor on the other machine and then load the backup and be up and running once again very quickly.
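
As a rough illustration of the "regularly copy the virtual machine files to another machine" step, here is a minimal Python sketch that copies a directory of VM files and verifies each copy with a checksum. It assumes the VM is shut down (or you are copying a snapshot) so the files are consistent, and that the backup location is reachable as an ordinary path, for example a mounted network share; the function names are made up for this example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large disk images do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_vm_files(source_dir: Path, backup_dir: Path) -> list[Path]:
    """Copy every file under source_dir to backup_dir and verify each copy."""
    copied = []
    for source in sorted(source_dir.rglob("*")):
        if not source.is_file():
            continue
        target = backup_dir / source.relative_to(source_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copies contents and timestamps
        if sha256_of(source) != sha256_of(target):
            raise IOError(f"checksum mismatch for {target}")
        copied.append(target)
    return copied
```

A copy that has never been verified or booted on the second machine is only a hope, so it is worth starting the VM from the backup now and then as a test.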

You need to assess 1) how much downtime you can afford and 2) how much money you can spend.
 
PaperTiger Commented:
I do this all the time. There are two options and I use both.

Option A: Have a spare machine available on site, and create a clone of your existing hard drive using Acronis or any disk imaging software of your choice.
 
Option B: Use Acronis' Universal Restore function to restore to different hardware.

 
zolf (Author) Commented:

>>and create clone copy of your existing hard drive using Acronis or any disk imaging software of your choice.
 
Can you please give me a guide on how to do this?
 
PaperTiger Commented:
Go to www.acronis.com and buy the software. For any sysadmin, this is a godsend and a must-have tool.

This software does three important things:

1. Hot backup
If you enable Windows 2003's VSS (Volume Shadow Copy Service), you can schedule Acronis to back up your whole machine while it's running, with no interruption to your business.

Do this daily.

2. Cold backup
Using your Acronis installation, create a bootable CD. Boot your computer from this CD and perform a cold backup.

Do this quarterly or monthly.

3. Restore to different hardware
Boot your computer from the CD and perform a restore. If you purchase the Universal Restore option, you can restore your backup to a machine with different hardware.

Do this in advance if you have spare hardware; if not, you can do it when the system breaks.

I do not see the need to go any fancier than this.
 
mottakutty Commented:
Server clustering is worth a look. Implementing server clustering requires a domain controller and 2 physical boxes. They share a drive outside the boxes, in most cases on a SAN. The drive is connected to one physical box at any time and is transferred to the other server when the primary server fails.
This technology provides fault tolerance and continuous data availability.

Cheers
Mottakutty
 
hshao Commented:
Agree with Mottakutty.

Using a cluster will give your application high availability. If one node goes down, it will fail over to the other one. Users will still be able to use the app while you fix the failed node.

http://technet.microsoft.com/en-us/library/cc917693.aspx
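
To show the general idea behind failover (not how Microsoft Clustering is actually implemented), here is a small Python sketch of a heartbeat check: probe the primary, and after several consecutive failures treat the standby as active. The class name, the threshold of three failures, and the host name in the usage note are assumptions made up for this example.

```python
import socket
from typing import Callable


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class FailoverMonitor:
    """Switch to the standby after `max_failures` consecutive failed probes."""

    def __init__(self, probe: Callable[[], bool], max_failures: int = 3):
        self.probe = probe
        self.max_failures = max_failures
        self.failures = 0
        self.active = "primary"

    def check(self) -> str:
        if self.active == "standby":
            # Failing back is left as a manual decision in this sketch.
            return self.active
        if self.probe():
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.active = "standby"
        return self.active
```

You could wire it up as `FailoverMonitor(lambda: tcp_probe("erp-primary.example", 1433))`, where the host name is hypothetical and 1433 is SQL Server's default port, and call `check()` on a timer. Requiring several consecutive failures avoids failing over because of a single dropped packet.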
