NetApp predictive disk failure

Posted on 2014-08-30
Last Modified: 2014-08-31

I have a 2246 SAS shelf and I'm tired of waiting for disks with predictive failures to fail before hot swapping them. How do I properly fail a good disk that is only predicted to fail? Here are some of the commands, but I'm not sure which ones to use.

The disk fail command forces a file system disk to fail; the disk is then selected for Rapid RAID Recovery and copied to a spare. You must use the disk swap command afterwards when using SCSI disks. (We use SAS; does this not apply?)


The disk replace command can be used to replace a file system disk with a more appropriate spare disk.
Is that followed by a disk swap?

The disk man page does not make clear which one to use in a predictive failure situation.

Thanks in advance

Question by:snyderkv
    LVL 42

    Assisted Solution

    You can use either one. disk replace is less intrusive because it copies valid data to the spare before taking the original disk offline. disk fail degrades the RAID-DP group, which then has to rebuild the data. That is not a big deal, just a little more work for the system.
    LVL 9

    Expert Comment

    "disk replace" is definitely better. If you are manually replacing disks to avoid having them fail, the last thing you want to do is deliberately degrade RAID in order to replace the disk. The effect would be the same as waiting for the disk to fail.

    Having said that, ONTAP is very good at managing disks and disk failures. Its Maintenance Center feature can predictively replace disks, take them offline for testing, and return them to service if they pass. It is unlikely that you can do a better job manually. Also, it is unlikely NetApp Support will replace disks you remove manually. So my recommendation is to let ONTAP manage disk failures.
    LVL 42

    Expert Comment

    If you call NetApp Support and send them the predictive failure error for the drives, they will send a replacement. It's sort of like having a slow leak in a tire: you don't want to wait until it goes flat, even though you have a spare.

    Author Comment

    The replace command does not work. It says the disk is not a file system disk. The disk is listed as a predictive failure; however, it shows up under broken disks. For some reason it's predictive but no longer a file system disk or part of an aggregate or RAID group, so I cannot use the replace command.

    I guess I would just blink the drive and hot swap it as if it were a failed drive? Since it did not fail, why did ONTAP take it out of the RAID group and essentially fail the drive?
    LVL 9

    Accepted Solution

    If the disk is listed under broken disks, it has already been failed and replaced. It seems that ONTAP used the predictive mechanism to proactively replace the drive before it failed. If you are interested in exactly what happened with that disk, search the messages files for the name of the disk and you will see events describing the exact sequence.

    Also, chances are the orange light is on on that disk, so no need to blink.
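For anyone who wants to follow the "search the messages files" suggestion offline, here is a minimal Python sketch. It assumes you have copied the messages file (on 7-Mode systems it normally lives under /etc on the root volume) to a local path. The function names and the disk name "0a.00.5" are made up for illustration, and the matching rule is my own guess at a sensible filter, not anything ONTAP provides.

```python
import re
from typing import Iterable, List


def lines_mentioning_disk(lines: Iterable[str], disk_name: str) -> List[str]:
    """Return the lines that name disk_name exactly, in original order.

    A plain substring test would let "0a.00.1" also match "0a.00.12",
    so the name must not be glued to further name characters.
    """
    pattern = re.compile(
        r"(?<![\w.])"            # not preceded by a letter, digit or dot
        + re.escape(disk_name)   # the disk name, taken literally
        + r"(?!\.?\w)"           # not continued by more name characters
    )
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


def search_messages_file(path: str, disk_name: str) -> List[str]:
    """Read a local copy of a messages file and keep lines naming the disk."""
    with open(path, "r", errors="replace") as handle:
        return lines_mentioning_disk(handle, disk_name)


if __name__ == "__main__":
    # Hypothetical local copy and disk name; substitute your own.
    for entry in search_messages_file("messages", "0a.00.5"):
        print(entry)
```

Reading the matching lines in order should show when the disk was flagged, when its data was copied to a spare, and when it was marked failed.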
