NetApp predictive disk failure


I have a 2246 SAS shelf, and I'm tired of waiting for disks with predictive failures to actually fail before hot swapping them. How do I properly fail a good disk that is only predicted to fail? Here are some of the commands, but I'm not sure which ones to use.

The disk fail command forces a file system disk to fail; the disk is selected for Rapid RAID Recovery and copied to a spare. You must use the disk swap command afterwards when using SCSI disks. (We use SAS; does this not apply?)


The disk replace command can be used to replace a file system disk with a more appropriate spare disk.
Is that followed by a disk swap?

The man pages for disk don't make it clear which one to use in a predictive failure situation.

Thanks in advance


Paul Solovyovsky (Senior IT Advisor) commented:
You can use either one. Disk replace is less intrusive, as it copies the valid data to a spare before taking the original offline. Disk fail degrades the RAID-DP group because the data has to be rebuilt; not a big deal, but a little more work for the system.
"disk replace" is definitely better. If you are manually replacing disks to avoid having them fail, the last thing you want to do is deliberately degrade RAID to replace the disk. The effect would be the same as waiting for the disk to fail.

Having said that, ONTAP is very good at managing disks and disk failures. Its Maintenance Center feature can predictively replace disks, take them offline for testing, and return them to service if they pass. It is unlikely that you can do a better job manually. Also, it is unlikely that NetApp Support will replace disks you remove manually. So my recommendation is to let ONTAP manage disk failures.
Paul Solovyovsky (Senior IT Advisor) commented:
If you call NetApp Support and send them the predictive failure error for the drives, they will send a replacement. It's sort of like having a slow leak in a tire: you don't want to wait until it goes flat, even though you have a spare.

snyderkv (Author) commented:
The replace command does not work. It says the disk is not a file system disk. The disk is listed as a predictive failure, but it shows up under broken disks. For some reason it's predictive but no longer a file system disk or part of an aggregate or RAID group, so I cannot use the replace command.

I guess I would just blink the drive and hot swap it as if it were a failed drive? Since it did not fail, why did ONTAP take it out of the RAID group and essentially fail the drive?
If the disk is listed under broken disks, it has already been failed and replaced. It seems that ONTAP used the predictive mechanism to proactively replace the drive before it failed. If you are interested in exactly what happened with that disk, search the messages files for the name of the disk and you will see events describing the exact sequence.

Also, chances are the orange light is on on that disk, so no need to blink.
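
The advice above to search the messages files for the disk name can be done with any text search tool. As a minimal sketch in Python, assuming you have a copy of the messages file as plain text: the function name `events_for_disk`, the disk name `0a.10.4`, and the sample lines are all hypothetical placeholders, not real ONTAP output. The pattern matches the disk name as a whole token so that a search for one disk does not also return a disk whose name merely starts with the same characters.

```python
import re


def events_for_disk(lines, disk_name):
    """Return the lines that mention disk_name as a whole token.

    The lookarounds keep "0a.10.4" from matching inside "0a.10.14",
    while still allowing a trailing sentence period after the name.
    """
    pattern = re.compile(
        r"(?<![\w.])" + re.escape(disk_name) + r"(?!\.?\w)"
    )
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


# Placeholder lines only; real messages entries will look different.
sample = [
    "placeholder event A mentioning disk 0a.10.4",
    "placeholder event B mentioning disk 0a.10.14",
    "placeholder event C with no disk name",
    "placeholder event D: copy from 0a.10.4 finished.",
]

for line in events_for_disk(sample, "0a.10.4"):
    print(line)
```

To run it against a real file, pass an open file object instead of the sample list, for example `events_for_disk(open(path), name)`, where `path` points to wherever you copied the messages file. Reading the matching lines in order should show when the disk was flagged and when it was moved to the broken pool.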


Question has a verified solution.
