LICOMPGUY
 asked on

vSphere Essentials 5.1.x, single host: USB 4TB drive for VMDK backup?

Hey all

I have a small client running ESX Essentials 5.1. One drive in the RAID 5 array is reporting a pre-emptive failure, and there is no hot spare, so I need to take the drive offline. Just to cover myself, I was thinking of adding a 4TB USB drive to the host. What would I need to do for ESX to see it temporarily (until I fail/offline the drive and the rebuild finishes), and to make this 4TB external USB drive available and formatted as a VMFS partition, so I can hot clone the guests to it in case all hell breaks loose?
Or, for the cost of the drive, I could just leave it attached and use it in addition to backup.
Is this doable? If so, how? Thanks!!!!
VMware, Virtualization, Storage Software, Acronis

Last Comment
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

8/22/2022 - Mon
SOLUTION
jmcg

ASKER CERTIFIED SOLUTION
Alessandro Scafaria

SOLUTION
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

LICOMPGUY

ASKER
Thanks for your response. 5 hosts, 1 appliance for Acronis backup. The data/VMFS partition (2.73 TB) is the one with the drive on the verge. ESX is on a mirrored pair of 300GB 15k drives. RAM is just over 36GB.

I use Acronis, so all guests are backed up to TIBs. God forbid I lose the guests, I would probably have to restore the backup server before I could pull the TIBs off the disk cartridges. The backup server currently resides on the ESX partition, not the one with the drive in pre-emptive failure.

They want to replace the raid battery too - was going to put off until Friday - just in case.
Thoughts/ideas?

Thanks!!!!
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

Get the disk replaced ASAP!

then worry about the battery!
LICOMPGUY

ASKER
What if I do a hot clone of each machine to the VMFS partition (I have nowhere else to put it), then use WinSCP to copy it to an external USB drive hanging off a workstation?
I have a DC (so really nothing will be changing)
app server running AV
app server running a .NET app (so this can change)
F&P - 500GB data partition, changes constantly (daily)
vCenter server
backup server (on the ESX partition)

Just a thought..
LICOMPGUY

ASKER
So you are suggesting NOT to replace the battery at the same time? I was thinking that.
I was going to try to hold out on the disk replacement until Friday, just in case there is an issue and I need the weekend window.
What is your experience with taking a RAIDed drive offline in a Dell box? Should it generally rebuild just fine?
Alessandro Scafaria

I'm sorry but I'm afraid I didn't understand your scenario.....

1. What's on the host with the failing HDD? Production VMs?
2. Where do you put your VM backups? All on tape? Or do you also have a NAS?
3. You said that you have 5 hosts in total.....

Here is what I would suggest.....

If the failing HDD is in a host with production VMs, check the free disk space on your other hosts so you can move the VMs there....

Example
Say you have to replace the hard drive in a host running 4 VMs, each using 300GB of space....
You could replicate the VMs temporarily to your other hosts during your "maintenance window" to keep 100% availability for your end users.....and then move them back to the original host when all is done!

How? (I would suggest using Veeam software instead of Acronis, for a lot of reasons)

1. Create a Windows Server based VM on a 100% healthy host and download a 30-day trial version of Veeam Backup and Replication from here:

http://www.veeam.com/data-center-availability-suite.html

2. Add all your hosts to the backup infrastructure (I would suggest entering your hosts manually rather than the IP of your vCenter server, so that you are 100% free from any vCenter dependencies)......

3. Perform your VM replicas following this step-by-step guide:

http://helpcenter.veeam.com/backup/80/vsphere/replica_job.html

Also read these useful articles....

http://helpcenter.veeam.com/backup/80/vsphere/replication.html

http://helpcenter.veeam.com/backup/80/vsphere/failover_failback.html

http://helpcenter.veeam.com/backup/80/vsphere/permanent_failover.html

4. Test your replica before doing anything in your production environment (with Veeam you can do this safely!!):

http://helpcenter.veeam.com/backup/80/vsphere/recovery_verification_surereplica.html

5. Once you're satisfied with the results in the Veeam SureBackup Virtual Lab environment, run the replica in the production environment before your maintenance window....

6. Once the work on your "broken server" is done, you can "Failback" and return to "normal"....

http://helpcenter.veeam.com/backup/80/vsphere/performing_failover_and_failback.html

7. Try to seriously evaluate Veeam instead of Acronis for VM protection against damages.....

Let me know your thoughts.....
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

A hot clone is better than no backup!

No, don't change the battery at the same time.

The disks are hot-plug: remove the drive, insert a new disk, and the rebuild should start.
LICOMPGUY

ASKER
Hey Alessandro - thanks for the response.
Small environment - local storage single host - that is the problem.  
5 prod VMs on this host.
Once this is behind me - I would like to pick your brain on Veeam vs Acronis.

Thanks!
LICOMPGUY

ASKER
Andrew

So you are thinking there should basically be no problem taking one of the drives offline in the RAID 5 array and just letting it rebuild, correct? I have done this before on other servers, I just want to be prepared JUST in case.

Thank you for your time and efforts in helping me!
LICOMPGUY

ASKER
The VMFS partition in question has six 600GB SAS 15k drives. ESX sees 2.73TB, so the rest of the space is being used by parity. That being said, taking the one drive with the pre-emptive error (drive 7) offline wouldn't blow out the whole RAID 5 array, agreed? Then, once the drive is replaced, it would rebuild from the parity on the other 5 drives. Agreed?
Just need to be as careful as possible here.
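
As a quick sanity check on the 2.73TB figure: RAID 5 gives up one drive's worth of space to parity, so six 600GB drives leave roughly five drives of usable capacity. The sketch below assumes the drives are labelled in decimal gigabytes and that the reported "TB" is really binary (TiB); both are assumptions rather than something stated in this thread, and real drives and VMFS overhead will shift the number slightly.

```python
def raid5_usable_bytes(drive_count: int, drive_bytes: int) -> int:
    """Usable capacity of a RAID 5 set: one drive's worth of space holds parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_bytes


GB = 10**9    # decimal gigabyte, as printed on drive labels (assumption)
TIB = 2**40   # binary terabyte, assumed to be what the client reports as "TB"

usable = raid5_usable_bytes(6, 600 * GB)
usable_tib = usable / TIB
print(f"{usable_tib:.2f} TiB usable")  # prints "2.73 TiB usable"
```

So the numbers are consistent with a six-drive RAID 5 where any single drive can be lost without losing the array.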
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

no contest here - Veeam vs Acronis - Veeam is the clear winner!

So you are thinking there should basically be no problem taking one of the drives offline in the RAID 5 array and just letting it rebuild, correct? I have done this before on other servers, I just want to be prepared JUST in case.

Correct.

If RAID 5 is working correctly, it can sustain 1 disk failure.
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

You can also visit this EE Question, where there is a discussion about

Veeam vs Acronis
Alessandro Scafaria

LICOMPGUY, Andrew made his point!

You may perform your replacement accordingly without any maintenance window (RAID 5 is able to sustain 1 drive fault)....

My comments were about avoiding future damage with Acronis ;-)
LICOMPGUY

ASKER
Alessandro  

Thanks for your response. Since the drive is in pre-emptive failure, I would need to take it offline, and since this is an ESX host, I would have to do that from the RAID controller config, which means bringing the host down. At least, I don't know of another way. It is a Dell T710. I would think it would be dangerous to just pull the drive while it hasn't failed, without taking it offline first.

You will have to update me on why you don't like Acronis. I have recovered VMs from Acronis 11.5 before. Their virtual edition seems better, but I would love to know what you dislike about it.

What I really need to do is see if I can bring up their old server with ESX and replicate the VMs to it, with a delta diff, for high availability. (Different topic - I may want to pick your brain on that going forward.)

Am I missing something about being able to fail the drive without going into the raid config and therefore bringing the host down?

Thanks!!!!
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

What is the difference between the disk dying and not responding, and the disk being removed (and not present)?

A Good RAID Storage Controller, under RAID 5 will continue.

Acronis is a low-budget, poorly performing desktop application compared to an enterprise solution like Veeam Backup and Replication, which has a pedigree of supporting virtual environments for many years.

Acronis just want to confuse, and to sell an inferior product in the virtual space, to gain some market share!
LICOMPGUY

ASKER
I like what I see in Veeam a lot. Acronis has version 11.5 and also their VMware version, which is web-based; I haven't had an opportunity to work with it. 11.5 I have, it is "ok". I have to say I like what I see in Veeam a lot: twice the cost, but a great product.

I didn't want to risk just pulling the drive because it is still online, just in pre-emptive failure mode yet fully functional. I would rather take it offline and replace it, to play it safe - you agree?

Thanks Andrew
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

I didn't want to risk just pulling the drive because it is still online, just in pre-emptive failure mode yet fully functional. I would rather take it offline and replace it, to play it safe - you agree?
No, not really. Get a spare and replace it,

and ensure you have full backups, and tested backups and restores, just in case of complete failure.

Do you have a support agreement for the server? If it is HP, Dell or IBM and you called their support department, they would have dispatched a replacement disk.

Remember, if all the disks in the server are of a similar age or batch, there is a risk they could all fail together.

Especially during a RAID 5 rebuild, which is the most stressful, disk-intensive time: often, if one disk fails, disks of the same age or batch can also fail during the rebuild.
Dawid Fusek

mate,

1. RAID 5 is generally a risky solution even in the ONLINE state (only one disk of redundancy, and a very long and intensive rebuild after a disk fails). In the DEGRADED state it is really risky, because the risk is much higher than for a single disk: it scales with the number of disks left in the array, so a 6-disk RAID 5 in the DEGRADED state is 5x riskier than a single disk, plus additional risk because of the long, intensive rebuild. In summary, the risk when rebuilding a 6-disk RAID 5 from the DEGRADED state is probably 10x higher than for a single disk!
2. You have 15k SAS disks, which is very good: they are fast, and much more stable and less risky than SATA disks, so your risk is much lower (probably up to 3 times) than with the same array on SATA disks.
3. Some RAID controllers can do "some magic", like copying (cloning) a single pre-failure disk to a clean disk, but I don't think that is your case.
4. You have to rebuild this RAID 5 (switch from the pre-failure disk to the new one) during server operation, i.e. while the array is online. Most RAID controllers will then replace the failed disk with a new one inserted in the same slot (and may not do that automatically if you do the same operation with the server powered off). Some controllers need configuration to do so, and some may be configured not to do it automatically, but in general 95% will do it automatically. What you really have to be sure of is that you take the correct drive offline; then wait 1 min and put the new drive in the same slot. (But remember, it should be a new or clean drive, not a drive from another server that may contain part of another array, because then some array controllers may not rebuild the current array, since they see a new array on that disk!!!)
5. What should you do in this situation? Wait for a weekend and make 2 different good backups, one with Acronis (or better, Veeam) and a second by online cloning the VMs somewhere else, so you are sure you have 2 different backups. Then take out the pre-failed disk, wait up to 1 min, put in the new disk... and wait. A rebuild takes from 6h to sometimes even 72h, depending on the number, capacity and speed of the disks; in your situation it should probably take close to 6-12 hours.
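
To put rough numbers on points 1 and 5, here is a minimal sketch. The rebuild rates (15 to 30 MB/s while the array is in use), the 2% annual failure rate and the assumption that drives fail independently at a constant rate are all illustrative guesses, not figures from this thread or from any vendor. Drives from the same batch tend to fail together, so the independence assumption is optimistic.

```python
import math


def rebuild_hours(drive_bytes: float, rate_bytes_per_s: float) -> float:
    """Time to rewrite one replacement drive at a steady rebuild rate."""
    return drive_bytes / rate_bytes_per_s / 3600


def second_failure_probability(surviving_drives: int,
                               annual_failure_rate: float,
                               window_hours: float) -> float:
    """Chance that at least one surviving drive fails within the window,
    assuming independent failures at a constant rate (an assumption)."""
    hourly_rate = -math.log(1.0 - annual_failure_rate) / (365 * 24)
    return 1.0 - math.exp(-surviving_drives * hourly_rate * window_hours)


DRIVE_BYTES = 600 * 10**9  # one 600GB drive, decimal units

# Hypothetical rebuild rates; actual speed depends on controller and load.
fast = rebuild_hours(DRIVE_BYTES, 30 * 10**6)
slow = rebuild_hours(DRIVE_BYTES, 15 * 10**6)
print(f"rebuild window: about {fast:.1f} to {slow:.1f} hours")

# Five surviving drives versus a single drive over the same 12-hour window.
p_array = second_failure_probability(5, 0.02, 12)
p_single = second_failure_probability(1, 0.02, 12)
print(f"degraded array is about {p_array / p_single:.1f}x a single drive")
```

Under these assumptions the window lands in the 6-12 hour range mentioned above, and the degraded six-drive array comes out close to 5x the exposure of a single drive, before counting the extra stress of the rebuild itself.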

regards
NTShad0w
LICOMPGUY

ASKER
Hey

Thanks for your help. We are good here. I did a clone of each VM (only one host) to the VMFS partition, then used WinSCP to copy them to a 4TB drive, which gave me a second means of recovering the entire VM. I also did two full backups of all VMs using Acronis. Replaced the drive, it rebuilt, all is good.

Thanks!!!
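
For anyone repeating the clone-then-copy approach, it may be worth confirming that the copied files really match the originals before pulling the drive. The sketch below is one way to do that, assuming both the clone folder and the copy are reachable as ordinary paths from the machine running the script (for example via a second download or a mounted share). That is an assumption, and the function names and example paths are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks so that
    large VMDK files never have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_dir, copy_dir):
    """Compare every file under source_dir with the same relative path
    under copy_dir. Returns the relative paths that are missing from the
    copy or whose contents differ, in sorted order."""
    source_dir, copy_dir = Path(source_dir), Path(copy_dir)
    bad = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        dst = copy_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            bad.append(rel.as_posix())
    return bad
```

Usage would look something like `find_mismatches("/mnt/clones", "/mnt/usb4tb/clones")` (hypothetical paths); an empty list means every file in the clone folder has an identical counterpart on the external drive.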
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

well done!
LICOMPGUY

ASKER
Andrew

Thank you very much for everything, you were very helpful.
Best to you!
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

do you require any additional help to close this question?
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

Final time, is there a solution here ?