Puncture and Bad Block unrecoverable medium error - Dell 2950 Array

Hi All,

We have a Dell 2950 with the array in a degraded state. History: two drives failed, and with the help of Experts Exchange we were able to force one of the drives back online and rebuild the other, which brought the server up. We have a good backup. One of the two drives failed, and the global spare kicked in but failed with puncture and bad block medium errors. Attached is a pic of the array and drive state. We need some assistance getting the array back to a healthy state (we want to use a utility such as disk2vhd, but that fails because of the drive errors).

We have a spare drive, but it seems like it's an array issue, not a drive issue. We have been reading about re-creating the array and running a consistency check. Booting up reports that a foreign config was detected and offers to import it, which we did not do. There are array firmware updates available, but we didn't want to do any of the above without getting some insight. Please let us know your suggestions.

DellRaidfail.JPG
jaya31 Asked:

Seth Simmons (Sr. Systems Administrator) Commented:
"We have a spare drive, but it seems like it's an array issue, not a drive issue."

Correct.
With a good backup (hopefully there is no data corruption from the puncture and media errors), you should replace the problem disk (at least disk 3, which is in a failed state), destroy the array, create a new one, and then restore from the backup.

"There are array firmware updates..."

You won't be able to do any controller firmware update while the array is degraded.
If you get the failed physical disks replaced and build a new array, then once it's healthy you can do the PERC firmware update.

On the other hand, I would also look at replacing the server. I was buying 2950s 10 years ago.
Who knows what else might fail...
Mal Osborne (Alpha Geek) Commented:
I have looked after a dozen or so 2950s. Out of those, I have seen 4 with failed RAID controller memory, and very few other problems. The memory is just a standard SIMM, but it has to be exactly the correct type. Failed memory causes all sorts of weird errors.
andyalder (Saggar maker's framemaker) Commented:
You already have a spare disk in the enclosure. You can use OMSA to delete the metadata on the foreign disk and then, if needed, set it as a global spare, and the array will use it. Right-click on PERC 6/i; clear foreign should be listed under available tasks. You can also get the controller log from there to see if you really do have punctured stripes.
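
For anyone who prefers the OMSA command line to the GUI, the same steps can be sketched roughly as below. This is only a sketch: it builds the `omreport`/`omconfig` command lines and prints them without running anything. The controller ID `0` and the physical disk ID `0:0:5` are placeholders, not values from this server, and the exact syntax should be checked against the CLI guide for the installed OMSA version.

```python
def omsa_commands(controller_id, spare_pdisk=None):
    """Build OMSA CLI command lines for the steps above (nothing is executed).

    controller_id and spare_pdisk are placeholders; read the real values
    from the `omreport` output on the server itself.
    """
    ctrl = f"controller={controller_id}"
    cmds = [
        # List the physical disks and their states first.
        ["omreport", "storage", "pdisk", ctrl],
        # Export the controller log to look for punctured stripes.
        ["omconfig", "storage", "controller", "action=exportlog", ctrl],
        # Clear the foreign configuration metadata from the foreign disk.
        ["omconfig", "storage", "controller", "action=clearforeignconfig", ctrl],
    ]
    if spare_pdisk is not None:
        # Optionally assign the cleared disk as a global hot spare.
        cmds.append(["omconfig", "storage", "pdisk", "action=assignglobalhotspare",
                     ctrl, f"pdisk={spare_pdisk}", "assign=yes"])
    return cmds


if __name__ == "__main__":
    # Dry run: print the commands so they can be reviewed before use.
    for cmd in omsa_commands(0, spare_pdisk="0:0:5"):
        print(" ".join(cmd))
```

Printing rather than executing is deliberate, since clearing a foreign configuration is destructive and should be reviewed by hand first.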

Punctured stripes and bad blocks can only be cleared by overwriting those sectors from the OS, though, and since chkdsk will have marked them as bad there's no easy way to do that. The only real way out of the problem is to wipe the array, shuffle the disks and restore from backup.

Please note that https://www.dell.com/support/article/us/en/04/sln111497/double-faults-and-punctures-in-raid-arrays?lang=en#3 says that good disks in the array will get marked as predictive failure even though there is nothing wrong with them.
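
To illustrate why a puncture cannot be repaired by the controller, here is a minimal sketch of RAID 5 parity arithmetic. It is a toy model, not the PERC firmware's actual logic: the block contents and the function names `xor_blocks` and `rebuild_block` are made up for the example. With one unreadable block per stripe the data can be rebuilt by XOR; with a second unreadable block in the same stripe (a double fault) there is nothing to rebuild from, and that stripe is what gets marked as punctured.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID 5 parity arithmetic)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_block(stripe, missing):
    """Rebuild the block at index `missing` from the rest of the stripe.

    `stripe` holds the data blocks plus the parity block; an unreadable
    block is represented by None. Returns None when another block in the
    stripe is also unreadable, i.e. the double-fault case.
    """
    survivors = [b for i, b in enumerate(stripe) if i != missing]
    if any(b is None for b in survivors):
        return None  # two losses in one stripe: cannot be reconstructed
    return xor_blocks(survivors)


# Hypothetical stripe: three data blocks plus their parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = data + [xor_blocks(data)]

# Single failure: disk 1 is gone, the stripe is still recoverable.
degraded = list(stripe)
degraded[1] = None
recovered = rebuild_block(degraded, 1)

# Double fault: a bad block on disk 2 turns up during the rebuild.
double_fault = list(degraded)
double_fault[2] = None
punctured = rebuild_block(double_fault, 1)

print(recovered, punctured)
```

This is also why the only clean fix is to rewrite the affected sectors with known-good data, which in practice means rebuilding the array and restoring from backup.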
jaya31 (Author) Commented:
Hi All,

Thank you very much for your direction and help in understanding how to proceed in a situation like this. Our goal was to get a virtual snapshot of the server to preserve an old unsupported application, which we couldn't get from a restore of a file-structure-only backup. We were able to do that, and now the server can be retired. I am going to play around with the array to see whether we can get it back to some working order, but we don't need the server anymore other than for experimenting.
raid5