Rio_10
asked on
MSA P2000 G3 - Scrub-vdisk
All,
I was notified of an error on our SAN.
Event:
A scrub-vdisk job completed. Errors were found. (number of mirror-verify errors found: 140414, number of media errors found: 0) (vdisk: CYHQSAN01VD02, SN: 00c0ff1a41360000865d8b5200 000000)
EVENT ID:#B1340
EVENT CODE:207
EVENT SEVERITY:Error
I logged into the SAN and noticed that the system was in a healthy condition and thought nothing of the error.
A few days later I had to reboot a VM (running on vSphere 6) and the server wouldn’t reboot. After hours of trying to repair Windows, I restored the VM from a backup, but this also wouldn’t boot. I restored once again, but this time to another vdisk, and all is OK.
Again today I had to restart a VM and the same thing happened. It appears that everything on this vdisk is corrupt.
I can’t believe this has happened and the MSA does not report a hardware issue.
Has anyone come across this before? What steps can I perform to keep the data, or is it a matter of restoring to another vdisk and then recreating the failed vdisk?
How can I prevent this in future? Are SAN snapshots an option?
Thanks in advance
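For anyone who receives these events by email: a minimal sketch, assuming the message text always has the same shape as the event quoted above, of pulling the error counts out of the text so that a non-zero mirror-verify count is flagged even when the array reports itself healthy. The function names and dictionary keys are hypothetical and not part of any HP tool, and other firmware versions may word the message differently.

```python
import re

# Matches the scrub-vdisk event text quoted in this thread (assumed format).
SCRUB_RE = re.compile(
    r"number of mirror-verify errors found:\s*(?P<mirror>\d+),\s*"
    r"number of media errors found:\s*(?P<media>\d+)\)\s*"
    r"\(vdisk:\s*(?P<vdisk>[^,]+),"
)

def parse_scrub_event(message):
    """Return the vdisk name and error counts from a scrub event, or None."""
    m = SCRUB_RE.search(message)
    if m is None:
        return None
    return {
        "vdisk": m.group("vdisk").strip(),
        "mirror_verify_errors": int(m.group("mirror")),
        "media_errors": int(m.group("media")),
    }

def needs_attention(parsed):
    """Treat any non-zero error count as actionable, whatever the health status says."""
    return parsed["mirror_verify_errors"] > 0 or parsed["media_errors"] > 0

# Event text copied from the post above.
event = (
    "A scrub-vdisk job completed. Errors were found. "
    "(number of mirror-verify errors found: 140414, "
    "number of media errors found: 0) "
    "(vdisk: CYHQSAN01VD02, SN: 00c0ff1a41360000865d8b5200 000000)"
)
parsed = parse_scrub_event(event)
print(parsed, needs_attention(parsed))
```

The point of the check is simply that the error count, not the overall health indicator, decides whether someone looks at the vdisk.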
Any disk or controller replaced recently?
ASKER
Thanks for the reply. HP have the logs and are investigating. What a day!
Had to restore 9 VMs from the weekend’s backup, obviously to a different SAN.
Nothing has changed.
Has anyone come across SAN snapshots? They are not configured. I’m wondering if they would be beneficial to use; I could have restored to a previous snapshot to save the current state of the VM rather than losing 2 days’ worth of data.
They are of no benefit if your SAN fails; you need another SAN to send/replicate/mirror the snapshots to.
We feel your pain.
We have been there with at least 5 clients using HP MSA, each needing a total restore of VMs. Our clients no longer use HP.
ASKER
Are you unable to snapshot to another vdisk on the same SAN?
It’s the first time that I have had an issue with HP, but you invest in redundant controllers and RAID and then this happens. How the SAN can report normal health is beyond me.
Snapshots exist in the volume/vdisk (the same location as the original data).
You can ship the snapshots to a second remote mirror of the vdisk, but obviously you would have to have set that up beforehand.
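To illustrate why a snapshot on the same vdisk would not have helped here, below is a toy pointer-based snapshot model (redirect-on-write style). It is only a sketch under stated assumptions: the `BackingStore` and `Volume` classes are hypothetical, and this is not a description of how the MSA firmware actually implements snapshots. It shows the general idea that a snapshot shares physical blocks with its source, so damage to the underlying storage hits both.

```python
class BackingStore:
    """Hypothetical stand-in for the physical blocks of one vdisk."""

    def __init__(self):
        self.blocks = {}   # physical block id -> bytes
        self._next_id = 0

    def allocate(self, data):
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        return block_id


class Volume:
    """Maps logical block addresses to physical blocks in a BackingStore."""

    def __init__(self, store):
        self.store = store
        self.block_map = {}   # logical address -> physical block id

    def write(self, lba, data):
        # New data goes to a fresh physical block, so snapshots keep the old one.
        self.block_map[lba] = self.store.allocate(data)

    def read(self, lba):
        return self.store.blocks[self.block_map[lba]]

    def snapshot(self):
        # A snapshot copies only the map; the data blocks stay shared.
        snap = Volume(self.store)
        snap.block_map = dict(self.block_map)
        return snap


store = BackingStore()
vol = Volume(store)
vol.write(0, b"boot sector")
snap = vol.snapshot()

# Normal change after the snapshot: the snapshot still sees the old data.
vol.write(0, b"new boot sector")
assert snap.read(0) == b"boot sector"

# Simulate damage inside the shared backing store (a failing vdisk).
for block_id in store.blocks:
    store.blocks[block_id] = b"garbage"

print(vol.read(0), snap.read(0))  # both reads now return the damaged data
```

A snapshot protects against logical changes made after it was taken, not against corruption of the storage it lives on, which is why the copy needs to sit on separate hardware.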
ASKER
I have been working with HP on this issue.
I performed a firmware update over the weekend but the error is still there.
I now have a CompactFlash error, and I will be shutting down and reseating the controllers as per their request to see if it clears the error.
ASKER
Replaced the CF cards and the vdisk is OK now; however, we had to restore from backups as all the VMs on that vdisk would not boot.