Amr Sayed
asked on
RAID 10 "Logical Drive has failed and cannot be used. All data on this logical drive has been lost" with one failed disk only
Hi,
I have an HP ML350 G9 server with a failed logical drive. The logical drive was set up as RAID 10, and from the log (see below) I can see only one failed disk. Why am I getting "Logical Drive 3 has failed and cannot be used. All data on this logical drive has been lost"? Shouldn't RAID 10 survive up to 2 failed disk drives?
Can you please help?
Details:
Smart Array P440ar
Critical Status Message(s)
274 0 GB SAS HDD at Port 2I : Box 6 : Bay 6 is bad or missing. To correct this problem, check the data and power connections to the physical drive. For more information, generate a diagnostics report.
298 Array C - 1 Logical Drive(s) contains a failed physical drive. To correct this problem, check the data and power connections to the physical drives or replace the failed drive. For more information, generate a diagnostics report.
271 Logical Drive 3 has failed and cannot be used. All data on this logical drive has been lost. Configuration changes to this logical drive are not allowed until this problem is corrected. Also, if your controller supports Expansion, Extension, or Migration, these operations will not be available for any logical drives in the array until the problem is corrected. Replace any failed physical drives and re-enable the failed logical drive. For more information, generate a diagnostics report.
Warning(s)
822 The cache for Smart Array P440ar in Embedded Slot has been disabled because there is no battery/capacitor attached to the cache module.
341 300 GB SAS HDD at Port 2I : Box 6 : Bay 5 is predicted to fail soon.
341 300 GB SAS HDD at Port 2I : Box 6 : Bay 8 is predicted to fail soon.
ADUReport.txt
"You can lose two disks in a RAID 10 with 4 drives"
Actually, a RAID 1+0 with 4 physical disks consists of 2 mirror groups with 2 disks in each group (in your case the groups are bay 5 + 6, and bay 7 + 8).
Raid 1+0
4 Disks:
Disk1 Disk2 Disk3 Disk4
----- ----- ----- -----
| a | | a | | b | | b |
| c | | c | | d | | d |
----- ----- ----- -----
G1 = {D1, D2}
G2 = {D3, D4}
All data is lost if both disks of the same group fail.
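The rule above can be sketched in a few lines of Python. This is only an illustration of the mirror-group logic from the diagram, not anything the controller runs; the function name `raid10_survives` and the disk labels are made up for the example.

```python
from itertools import combinations

def raid10_survives(mirror_groups, failed):
    """Return True if every mirror group still has at least one working disk."""
    failed = set(failed)
    # An empty set is falsy, so a group with no surviving disk fails the check.
    return all(set(group) - failed for group in mirror_groups)

# Groups from the diagram above: G1 = {D1, D2}, G2 = {D3, D4}
groups = [{"D1", "D2"}, {"D3", "D4"}]
disks = sorted(set().union(*groups))

# Try every possible two-disk failure.
for pair in combinations(disks, 2):
    status = "survives" if raid10_survives(groups, pair) else "LOST"
    print(pair, status)
```

With two mirror groups, four of the six possible two-disk failures are survivable. The two that are not are exactly the ones that take out both members of one group.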
The whole array was disabled because the risk of data loss is too high. You should be able to activate it again after the disk in bay 6 has been replaced. However, you may want to clone the weak disks in bay 5 and bay 8 first, so that you have a backup in case one of them fails.
So I would do the following:
- Clone the weak disks from bay 5 and bay 8, for example by using a desktop system and a cloning tool.
- Put the weak disks back into the RAID and replace the bay 6 disk with an empty new disk.
- Activate the array and let the system rebuild onto the empty disk.
- If this fails because of one of the weak disks, try exchanging it with its clone.
- After you have recovered, exchange the weak disks with new disks or with the clones.
- (Alternatively, you may consider removing all disks from the array and using the two cloned disks plus two new disks to repair the array.)
Sara
If you upload an ADU report I'll look through it for you. Turn XML off if possible; it's nearly impossible to read with that option turned on.
ASKER
Thank you all for your help here ...
"If you upload an ADU report I'll look through it for you"
Attached.
"the whole array was disabled because the risk of lost data is too high"
If it's only disabled, what does this log message mean? "271 Logical Drive 3 has failed and cannot be used. All data on this logical drive has been lost."
ADUReport.txt
2I:6:6 has recently failed, but 2I:6:8 had previously failed, or at least wasn't being used although it was meant to be.
4 Physical Drive (300 GB SAS HDD) 2I:6:5 Physical Drive (300 GB SAS HDD) 2I:6:7 Informational
5 Physical Drive (0 GB SAS HDD) 2I:6:6 Physical Drive (300 GB SAS HDD) 2I:6:8 Informational <------ failed
6 Physical Drive (300 GB SAS HDD) 2I:6:7 Physical Drive (300 GB SAS HDD) 2I:6:5 Informational
7 Physical Drive (300 GB SAS HDD) 2I:6:8 Physical Drive (0 GB SAS HDD) 2I:6:6 Informational <------- 0GB used on this disk !
Perhaps 2I:6:8 had been previously replaced and failed to rebuild.
So two disks in the same mirror are dead. Do you have a backup?
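To make the pairing in the excerpt easier to see, here is a small Python sketch that pulls the mirror pairs out of those four lines. It assumes the line layout is exactly as quoted above (drive, then its mirror partner); the regular expression and the `mirror_pairs` helper are made up for this illustration and are not part of any HPE tool.

```python
import re

# Lines copied from the ADU excerpt above (arrow annotations omitted).
ADU_LINES = """\
4 Physical Drive (300 GB SAS HDD) 2I:6:5 Physical Drive (300 GB SAS HDD) 2I:6:7 Informational
5 Physical Drive (0 GB SAS HDD) 2I:6:6 Physical Drive (300 GB SAS HDD) 2I:6:8 Informational
6 Physical Drive (300 GB SAS HDD) 2I:6:7 Physical Drive (300 GB SAS HDD) 2I:6:5 Informational
7 Physical Drive (300 GB SAS HDD) 2I:6:8 Physical Drive (0 GB SAS HDD) 2I:6:6 Informational
"""

# Captures the reported size in GB and the port:box:bay location of each drive.
DRIVE = re.compile(r"Physical Drive \((\d+) GB SAS HDD\) (\S+)")

def mirror_pairs(text):
    """Return (set of mirror pairs, set of drives reported as 0 GB)."""
    pairs = set()
    zero_size = set()
    for line in text.splitlines():
        drives = DRIVE.findall(line)
        if len(drives) != 2:
            continue  # not a "drive / mirror partner" line
        pairs.add(frozenset(loc for _, loc in drives))
        zero_size.update(loc for size, loc in drives if size == "0")
    return pairs, zero_size

pairs, zero_size = mirror_pairs(ADU_LINES)
for pair in sorted(sorted(p) for p in pairs):
    print(" <-> ".join(pair))
print("reported as 0 GB:", sorted(zero_size))
```

On this excerpt the pairs come out as bay 5 with bay 7 and bay 6 with bay 8, so the missing drive in bay 6 and the suspect drive in bay 8 sit in the same mirror.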
ASKER
"The big RED/Port Wine "do not remove" light is often misinterpreted as the fault LED on these caddies"
What if that was the cause of this issue? How can I rebuild the array? I don't have a backup because I was in the middle of a migration process.
Note: this must be done cold, with the server off, as opposed to hot-plugging a replacement. The reason is the same in both cases: with hot-swap the controller treats the disk as a new drive, whereas with the power off it reads the metadata on all the disks to make sense of the configuration.
Did it come up OK when you put the good disk back in and rebooted?
Smart Array controllers do not disable arrays "because the risk of data loss is too great"; the array is either online or dead. In this case the wrong disk was swapped out, causing a double fault.
I don't think your data is lost; it's just inaccessible.