PenguinN

asked on

VMware: how to rescan the storage hardware acceleration status

I’ve set up a VM cluster with 3 hosts and Lefthand nodes. Everything worked well (including Veeam backup over the network) until I let Veeam do a backup over iSCSI. Some VMs crashed completely, some VMs (the small ones) did back up, and a number of VMs were left with undeleted snapshots that are not shown in Snapshot Manager (the disks still point to snap-000001.vmdk).

From the moment this happened I have seen a performance drop in the cluster. After investigating and doing a lot of work trying to fix the snapshot issues, I noticed that all my published volumes show a Hardware Acceleration status of Unknown. There is also one volume whose Hardware Acceleration status is Supported. I’ve rescanned the HBAs, datastores etc., but the status stays Unknown. VAAI is supported by Lefthand and the software should detect it automatically. Creating a VM or copying a file larger than 4 MB should trigger the detection, but my status does not change.

Is there a way to re-enable the hardware acceleration level for my volumes?
[Screenshot attached showing the datastores' Hardware Acceleration status]
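For reference, a minimal sketch of how such a rescan can be forced from the ESX(i) console (assuming console/SSH access; the adapter name vmhba33 is only an example and will differ per host):

    # rescan one iSCSI adapter for new or changed LUNs (adapter name is an example)
    esxcfg-rescan vmhba33
    # refresh the VMFS volumes this host knows about
    vmkfstools -V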
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

Check page 96 onwards in this document; it will show you how to enable this function, if it is supported and configured on your Lefthand SAN.

http://www.vmware.com/pdf/vsphere4/r41/vsp_41_iscsi_san_cfg.pdf
You do need the vendor-supplied plugin for the storage to be recognized as VAAI-capable. A good discussion starts on page 129 of the ESX configuration guide (http://www.vmware.com/pdf/vsphere4/r41/vsp_41_esx_server_config.pdf).
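As a rough check (a sketch only, assuming the ESX(i) 4.1 console and the esxcli namespace of that release), you can list the SCSI devices and the VAAI claim rules to see whether a VAAI plugin has claimed the Lefthand LUNs:

    # list SCSI devices and which datastores/LUNs the host sees
    esxcfg-scsidevs -l
    # list the VAAI claim rules (4.1-style esxcli syntax); shows which VAAI plugin, if any, claims the devices
    esxcli corestorage claimrule list --claimrule-class=VAAI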

I am guessing from the context of your question that you are using hardware iSCSI with an HBA rather than the built-in software iSCSI. Is it dependent or independent? What vendor and model of HBA are you using? What model of Lefthand storage are you using?

You also indicated that you have at least one volume showing acceleration as supported. What is different about that volume?

PenguinN

ASKER

We are using P4300 nodes in combination with DL380 G7 dual six-core machines. The difference between the problem volumes and the supported volume is a mystery to me; the volumes are all published from the same Lefthand cluster. The only difference is that the hardware-accelerated volume was not accessed during the Veeam backup over iSCSI. The trouble started after that Veeam backup.

Do you know if you ever had Hardware Accelerated LUNs?
Do other hosts in the cluster see the acceleration as supported? Does the accelerated status come back after a restart of the ESX(i) host?

Take a look at http://www.vmware.com/resources/compatibility/detail.php?device_cat=san&device_id=10488. Make sure your configuration is a supported one for VAAI - it appears only certain SAN/iQ levels and iSCSI configurations support it.
After the backup with Veeam I saw a huge queue depth. I restarted the ESXi machines and vCenter, which helped speed up the hosts and VMs. But when I want to create a snapshot or power on a machine it takes minutes instead of seconds like it did earlier. Migrating storage also takes very long; this used to take minutes. I know for sure the volumes were supported before. All hosts see the datastores as Unknown except for the one shown in the screenshot.
SAN/iQ 9 is running on the nodes, the latest version.
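In case it helps to quantify that, a hedged sketch of watching the queue depth and latency on a host with the interactive esxtop utility on the ESX(i) console:

    # start the interactive performance monitor on the host
    esxtop
    # inside esxtop: press d for the disk adapter view or u for the disk device view,
    # then watch the QUED, DAVG/cmd and KAVG/cmd columns for queuing and latency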
Take a look at http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1021976

The article indicates the status will change on a datastore once you perform some basic VAAI requests:

"What do I need to know about the Hardware Acceleration Support Status?
If you go to Host > Configuration > Storage, you can see the Hardware Acceleration Status on the right side of the right panel.
 
For each storage device and datastore, the vSphere Client displays the hardware acceleration support status in the Hardware Acceleration column of the Devices view and the Datastores view.
 
The status values are Unknown, Supported, and Not Supported. The initial value is Unknown. The status changes to Supported after the host successfully performs the offload basic operations. If the offload operation fails, the status changes to Not Supported.
 
To determine if your storage device supports VAAI, you need to test the basic operations.
 
To easily test the basic operations, use vSphere Client and browse the datastore. Copy and paste a virtual disk of at least 4 MB (that is not in use). The status of Hardware acceleration changes to supported or not supported.
 
Creating a virtual machine with at least one vDisk or cloning a virtual machine also tests the basic operations."

Give that a shot and see if the status changes to supported. If it does, then I guess I wouldn't worry about it, because in the course of normal operations - if ESX(i) needs a VAAI primitive to work, and determines that it does indeed work - the status will update automatically from Unknown to either Supported or Not Supported.

That article also shows how to verify VAAI is enabled on your host.
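For completeness, a minimal sketch of that check from the ESX(i) console; these are the three VAAI-related advanced settings the KB refers to, and a value of 1 means enabled:

    esxcfg-advcfg -g /DataMover/HardwareAcceleratedMove    # full-copy (XCOPY) offload
    esxcfg-advcfg -g /DataMover/HardwareAcceleratedInit    # block-zeroing offload
    esxcfg-advcfg -g /VMFS3/HardwareAcceleratedLocking     # ATS (hardware-assisted locking) offload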
I created a virtual machine with no success, and also moved virtual disks with no change; the status sticks to Unknown. I also removed the assignment and reassigned it to a host. Maybe it needs more time, but we had the iSCSI backup running on Monday and that's when it all started.

For Veeam to do a SAN backup it needs to have the volumes assigned to the backup host. Once the backup started, things got ugly. I have since removed the Veeam host from all volumes, but I am sure that this is what caused the degradation. For now I have disabled the Veeam backups because the snapshots taken via the API instructions time out.

I think I need to force a hardware check on the VAAI plugin to make it functional again. Veeam support is pointing to VMware, but first I need to get things back to normal. Like you pointed out, maybe time is the issue, but the status has been Unknown for almost a week now. The only update not installed on the hosts is the one released a few days ago.
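One way to force such a check, as a sketch (the datastore and file names below are examples only), is to trigger one of the basic offload operations yourself by cloning a spare VMDK on the affected datastore with vmkfstools; per the KB quoted above, a successful offload should flip the status from Unknown to Supported:

    # clone an unused virtual disk on the affected datastore (paths are examples)
    vmkfstools -i /vmfs/volumes/datastore1/testvm/testvm.vmdk /vmfs/volumes/datastore1/vaai-test.vmdk
    # remove the test clone afterwards
    vmkfstools -U /vmfs/volumes/datastore1/vaai-test.vmdk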

I really appreciate the input, by the way.
SOLUTION
bgoering

I will run some more checks on the VAAI via the VMware CLI. Monday morning I will open a case with VMware. I already spoke to Veeam, to no avail. I'll keep you posted on the outcome. Thanks for the help so far. If there are any changes in the status I will post. I have never had this issue before and hope VMware can shine a light on the case.
ASKER CERTIFIED SOLUTION
I did a lot of research on this and traced it down to the automount function somehow being enabled again. Normally with a Veeam installation automount should be disabled. I had disabled it, but somehow over time it got enabled again. Windows then initializes the disks and your VMFS stores get corrupted. The workaround you found was the one that saved my day in the end as well: recreating all datastores and moving the virtual machines to supported LUNs.
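For anyone hitting the same thing, a minimal sketch of how automount is disabled on the Windows backup/proxy server (run diskpart from an elevated command prompt; this is the standard diskpart procedure rather than anything Veeam-specific):

    C:\> diskpart
    DISKPART> automount disable    (stop Windows from automatically mounting newly seen volumes)
    DISKPART> automount scrub      (clean up mount points of volumes that are no longer present)
    DISKPART> exit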

I have also seen VMFS volumes jump back to Supported again after switching to thin provisioning on the Lefthand, but only for a few volumes. I ended up recreating all volumes and moving the machines. The moving and copying was pretty nasty because of the IO generated by the other VMs, so I ended up leaving only a minimal number of machines running on a store before starting the cloning and moving actions. Some machines had to be moved by installing the Converter inside the VM, exporting them to external storage and then importing them again. I have now also disabled write access to the LUNs for the backup server running Windows. The downside is that whenever I run a restore it goes over the production network, but I wanted to be sure this can never happen again.
This was a really nasty problem to handle. There is no real solution up to this point, only a workaround. HP/VMware/Veeam support did not have an accurate answer to fix this problem. VMware was the most responsive and helpful, but could not find the solution to the problem from the log files.