VMware Data Recovery 2.0 and the powers of Deduplication
This article was originally written for our internal staff members, who are mostly software engineers.
One of my biggest concerns when taking over as the Network Admin at Pyramid was the poor backup plan in place. File-level backups via tape were incomplete (I have since resolved this), but worse were the backups of the VM environment. Backing up the VM environment is becoming more critical as production servers are moved to the virtual hosts.
vRanger – I inherited 6 CPU licenses for vRanger, the legacy system for backing up the virtual environment. The problems with it are countless. The first issue is that we have 10 CPUs in our ESX environment, meaning I’ve been playing musical licenses for the last 4 months. The cost to add 4 more CPU licenses is quoted below, plus roughly $350 x 10 = $3500 in annual maintenance.
4 - MNT VRANGER BACKUP AND REPLICATION /CPU = $2,365.00
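Putting those two figures together gives a rough first-year cost of staying on vRanger. This is only a back-of-the-envelope sketch: it assumes the $2,365.00 quote covers all 4 additional CPU licenses (the quote line does not say whether the price is per CPU or for the whole line item), and the variable names are made up for illustration.

```python
# Figures taken from the quote and maintenance estimate above.
# ASSUMPTION: $2,365.00 is the total for all 4 extra CPU licenses.
NEW_LICENSE_COST = 2365.00
MAINTENANCE_PER_CPU = 350.00   # "roughly $350" per CPU per year
CPU_COUNT = 10                 # CPUs in the ESX environment

annual_maintenance = MAINTENANCE_PER_CPU * CPU_COUNT
first_year_cost = NEW_LICENSE_COST + annual_maintenance

print(f"Annual maintenance: ${annual_maintenance:,.2f}")
print(f"First-year cost:    ${first_year_cost:,.2f}")
```

Under that assumption the first year comes to a little under $6000, with $3500 recurring every year after.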
I also get countless errors like this:
VM Name: PYRAMEDIA
Start Time: 3/28/2012 11:32:14 PM
End Time: 3/28/2012 11:32:45 PM
Duration: 0 minute(s)
Archive Size: 0 GB
Message: An internal error occurred during execution, please contact Vizioncore support if the error persists. Error Message: Index was outside the bounds of the array.
Message: vmlab1.pyramid-solutions.com is not licensed for the Backup feature.
Message: An internal error occurred during execution, please contact Vizioncore support if the error persists. Error Message: 2254 – Unable to read or write to the repository. Please ensure permissions are correct
To top it off, when the backups fail they leave dummy files in the datastore that do not restore correctly.
This results in the following on restore.
Uh oh! So yes, basically vRanger’s reliability bites, and we pay for it!!
VMware Data Recovery 2.0 – While looking for a new solution I wanted something more reliable, faster, and more cost-effective. Enter VDR 2.0 for vCenter 5. It is free with our ESXi 5 licensing and is intended for small to midsize environments. The features are amazing, including: Microsoft Windows Volume Shadow Copy Service (VSS) quiescing (point-in-time copies); 8 simultaneous backups; full disaster recovery even when no vCenter server is available (because the backup structure is standalone); and deduplication! See: http://www.ivobeerens.nl/2011/09/06/things-to-know-before-implementing-vmware-data-recovery-vdr-2-0/
The limits are 100 virtual machines and a 1 TB backup store per VDR instance. However, you can have up to 10 VDR instances in your vCenter, meaning we could support 1000 VMs!!!
The GUI is built into vCenter (it does require a client install to access) and is very intuitive.
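For capacity planning, the limits quoted above can be turned into a quick check. This is a minimal sketch that takes the article's numbers at face value (they are not verified against VMware's documentation here); the function name and the sample workload are hypothetical.

```python
import math

# Limits as quoted in this article, not checked against VMware's docs.
MAX_VMS_PER_INSTANCE = 100
MAX_STORE_TB_PER_INSTANCE = 1.0
MAX_INSTANCES_PER_VCENTER = 10


def instances_needed(vm_count: int, backup_tb: float) -> int:
    """Return how many VDR instances a workload needs under the quoted limits."""
    by_vms = math.ceil(vm_count / MAX_VMS_PER_INSTANCE)
    by_storage = math.ceil(backup_tb / MAX_STORE_TB_PER_INSTANCE)
    needed = max(by_vms, by_storage, 1)  # whichever limit is hit first decides
    if needed > MAX_INSTANCES_PER_VCENTER:
        raise ValueError("workload exceeds what a single vCenter can host")
    return needed


max_vms = MAX_VMS_PER_INSTANCE * MAX_INSTANCES_PER_VCENTER
print(max_vms)                       # upper bound on VMs per vCenter
print(instances_needed(250, 0.5))    # hypothetical: 250 VMs, 0.5 TB of backups
```

Note that either limit can be the binding one: a handful of very large VMs can need a second instance for storage long before the 100-VM cap matters.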
The impact of Deduplication – Data deduplication (often called “intelligent compression” or “single-instance storage”) is a method of reducing storage needs by eliminating redundant data. Only one unique instance of the data is actually retained on storage media, such as disk or tape. Redundant data is replaced with a pointer to the unique data copy. Got that? More here: http://searchstorage.techtarget.com/definition/data-deduplication
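For the engineers in the room, the idea is easier to see in code. The following is a toy illustration of block-level deduplication, not a description of how VDR works internally: the fixed 4-byte block size, the use of SHA-256 as the block fingerprint, and the `DedupStore` class are all assumptions chosen to keep the example small.

```python
import hashlib

BLOCK_SIZE = 4  # tiny on purpose so the demo data is easy to read


class DedupStore:
    """Keeps one copy of each unique block, keyed by its SHA-256 digest."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes

    def backup(self, data: bytes) -> list:
        """Store data; return its recipe (ordered list of block digests)."""
        recipe = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            # A block that is already stored is not written again;
            # the recipe just points at the existing copy.
            self.blocks.setdefault(digest, block)
            recipe.append(digest)
        return recipe

    def restore(self, recipe: list) -> bytes:
        """Rebuild the original data by following the pointers in a recipe."""
        return b"".join(self.blocks[d] for d in recipe)

    def stored_bytes(self) -> int:
        return sum(len(b) for b in self.blocks.values())


store = DedupStore()
monday = b"AAAABBBBCCCCDDDD"
tuesday = b"AAAABBBBXXXXDDDD"  # only the third block changed overnight

monday_recipe = store.backup(monday)
tuesday_recipe = store.backup(tuesday)

print(store.stored_bytes())                      # 20 bytes kept for 32 bytes of backups
print(store.restore(monday_recipe) == monday)    # True
print(store.restore(tuesday_recipe) == tuesday)  # True
```

Two full "backups" are kept, yet only the one changed block costs extra space, and either day can still be restored in full from its recipe.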
Essentially, it takes data that would individually consume large amounts of storage and stores it more efficiently. For example:
Only 91 GB for a week’s worth of daily backups for 10 servers. Wowzers!!
Our current vRanger backups take
for only about 50 single-instance backups. There are no daily, weekly, or monthly archives of backups for the same server.
The key is that deduplication is a smart backup that only writes block-level changes and keeps a log of those changes so a point in time can be reconstructed. Using this our new backups will include:
The last major factor is the time required. vRanger essentially rewrites the entire VMDK file every time it takes a backup. VDR is a smart backup. vRanger only allows 2 simultaneous backups; VDR does 8.
The initial backup with VDR is time-consuming: about 12 GB/hour.
But after that it only logs the differences, so a backup can then be restored to make a perfectly working snapshot in time. And it backs up at a whopping 150 GB/hour.
vRanger, on the other hand, runs at the initial slow rate (12 GB/hour) every single time. Plus, it overwrites its previous backup.
VM Name: PYRAMEDIA
Start Time: 3/22/2012 7:00:49 PM
End Time: 3/22/2012 8:44:45 PM
Duration: 104 minute(s)
Archive Size: 19.542 GB
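The report above can be used to sanity-check the quoted rate, and the two rates can then be compared over a week of daily backups. This is a rough sketch: it computes throughput from the archive size (which may be smaller than the VM's disk if the archive is compressed), it treats 150 GB/hour as the effective rate over the whole VM for every backup after the first, and the 30 GB VM size is just an illustrative figure.

```python
def gb_per_hour(archive_gb: float, duration_minutes: float) -> float:
    """Throughput implied by a backup report."""
    return archive_gb / (duration_minutes / 60)


# Values from the vRanger report above.
report_rate = gb_per_hour(19.542, 104)
print(round(report_rate, 1))  # 11.3

# Rates quoted in this article (GB per hour).
FULL_RATE = 12.0
INCREMENTAL_RATE = 150.0


def weekly_hours_full_only(vm_gb: float, days: int = 7) -> float:
    """Every backup is a full rewrite at the slow rate."""
    return days * vm_gb / FULL_RATE


def weekly_hours_incremental(vm_gb: float, days: int = 7) -> float:
    """One slow full backup, then faster incremental passes."""
    return vm_gb / FULL_RATE + (days - 1) * vm_gb / INCREMENTAL_RATE


vm_gb = 30  # illustrative VM size
print(weekly_hours_full_only(vm_gb))                # 17.5
print(round(weekly_hours_incremental(vm_gb), 1))    # 3.7
```

So the report works out to about 11.3 GB/hour, in line with the roughly 12 GB/hour figure, and under these assumptions a week of daily backups for one 30 GB VM drops from about 17.5 hours to under 4.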
One last thing for reference – a full restore of a 30 GB machine took about 2 hours. And it actually booted…
What does this mean?! Engineers can stop retaining snapshots from the last 2 years. Not only do old snapshots corrupt and degrade the VM and host and waste tons of space, but there will now be a system in place to go back to a machine from a year earlier. See: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1025279
Pyramid is saving roughly $6000 this year and $3500 annually.
Time and space for backups have been reduced significantly.
There will be a true disaster recovery solution in place once NAS1 is replicated to NAS2 (offsite). The VDR can be restored on any ESXi 5 box from scratch.
We need to upgrade our remaining ESX 3.5 boxes ASAP. They are outdated, and their backups from last month aren’t even reliable.
*Currently 2 ESXi 5 hosts are using VDR 2.0.
Thank you for reading. Let me know if you have any comments or concerns.