How to Diagnose an Issue with a Disk on a StorEdge A1000 Array

Posted on 2008-11-04
Last Modified: 2013-11-14
Hello, apologies for being vague on this, but I have limited experience with Sun hardware. I have a Sun Fire V440 machine with a StorEdge A1000 storage array connected to it. There is an amber light on one of the disks in the array. I'm trying to figure out where to look on the machine for any diagnostics that can be run against the storage array (I'm actually not sure there is such a thing), and what steps to take next to resolve the problem. I'm sure I could just replace the disk with a similar one and hope for the best, but I want to know more about it. Any information leading to a resolution would be appreciated.
Thanks in advance!
Question by:smi-admin

    Expert Comment


    Run the rm6 admin utility:


    There's an option to perform a system healthcheck.


    Author Comment

    I ran it, and it returned nothing:

    [oracle@dbsperf2 oracle]$ /usr/lib/osa/bin/rm6
    bash: /usr/lib/osa/bin/rm6: bad interpreter: Permission denied
    [oracle@dbsperf2 oracle]$ su
    # /usr/lib/osa/bin/rm6

    Expert Comment


    Did you start this in an X Window session?

    Author Comment

    Yes, I started this in an X Window session, and I do get the admin console for managing the array. There are no apparent entries in the rmlog.log file. When I click the 'Module Profile' button, I can see the 'Drives', 'Luns', and 'Controllers' buttons; however, only 'Luns' yields any result.
    I'm not able to get any further details about the drives. Am I missing something?


    Expert Comment


    From the rm6 app, choose "Recovery", then click on the "Recovery guru and health check" icon.


    Author Comment

    I receive the following result: "Unable to Scan Module".

    I checked the following per the help items mentioned:
    Detailed Status

     Host: dbsperf2

     Module: E6500_001

     Affected Tray: Controller Tray

     Affected Components: Unable To Scan Module

     Controller A:  1T10305474 ( c3t5d0 )

     Probable Cause

     A failure occurred while attempting to obtain information about the RAID


     The Recovery Guru cannot detect any problems and your storage management
    applications will not function properly while this condition persists.

     Recovery Steps

     1. If a firmware download operation is in progress or the controller(s) in the
    RAID Modules are initializing after a reset (for example, the RAID Module was
    turned off and then on again), wait for these operations to complete and then
    re-run the Recovery Guru. If the problem persists, go to step 2.

     2. Try the following actions:
     - Run the command line utility "lad" to verify the controller(s) are visible
    to the storage management software.

    c3t5d0 1T10305474 LUNS: 0 1 2

     -  Run the appropriate 'clean' utility or reboot the operating system to fix
    any temporary problem with the storage management software.
    # ./clean
    Stopping array monitor daemon and all osa applications
    Array monitor daemon and all symsm applications stopped
    Cleaning the osa locks
    Locks successfully cleaned
    Restarting array monitor daemon

     - Check the 'System_MaxLunsPerController' parameter in the rmparams file to
    make sure that there are not too many LUNs existing on the controller(s) in
    this RAID Module (that is, more than your operating system can handle).

    <where is this located?>

    - Check and fix any unterminated cables (SCSI connection only).

    looks good.

    - Check for any problems with the adapter or drivers in your host system.

    <where can I look for this?> I'm not noticing anything in the bin directory.
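On the open question of where to look for adapter or driver problems: on Solaris, kernel and driver warnings are normally written to /var/adm/messages. The following is only a minimal sketch, assuming that default log location. `scsi_warnings` is a hypothetical helper name and is not part of the rm6 software.

```shell
# Hypothetical helper: list warning lines that mention SCSI in a
# syslog-style file. Assumes the Solaris default log location
# /var/adm/messages when no file argument is given.
scsi_warnings() {
    file=${1:-/var/adm/messages}
    # Case-insensitive match on both "warning" and "scsi".
    grep -i 'warning' "$file" | grep -i 'scsi'
}

# Only run against the real log if it exists and is readable.
if [ -r /var/adm/messages ]; then
    scsi_warnings
fi
```

Any lines it prints that mention the array's controller path would be a reasonable starting point. An empty result only means nothing matched these two keywords, not that the adapter is healthy.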

    Expert Comment


    Perhaps you should power down both the server and the array, then see whether you can run the diagnostics after powering back up.


    Author Comment

    I already tried rebooting the machine :( The same issue persists after it comes back up.

    Accepted Solution


    Did you actually power down the A1000 and the server for at least a minute or so? There's a difference between rebooting and powering down.

    >Check the 'System_MaxLunsPerController' parameter in the rmparams file

    This is located at /usr/lib/osa/rmparams

    - What's the value for System_MaxLunsPerController in that file?

    - Also check if you've got /usr/lib/osa/bin/

    - Is this a production system? Do you have a maintenance contract?
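Reading the System_MaxLunsPerController value asked about above could be sketched as follows. This assumes rmparams stores its settings as `Key=value` lines, which is worth confirming by opening the file first. `rmparams_value` is a hypothetical helper name, not an rm6 command.

```shell
# Hypothetical helper: print the value of one key from an rmparams-style file.
# Assumes entries look like Key=value, one per line.
rmparams_value() {
    key=$1
    file=${2:-/usr/lib/osa/rmparams}
    # Print only what follows "Key=" on lines that start with that key.
    sed -n "s/^${key}=//p" "$file"
}

# Only query the real file if it exists and is readable.
if [ -r /usr/lib/osa/rmparams ]; then
    rmparams_value System_MaxLunsPerController
fi
```

If nothing is printed, the key is either absent or written in a different form than this sketch assumes.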

