setting a hot spare with arcconf

Hi,

I have a RAID 1E array on an Adaptec 5805 controller underneath a VMware 4.1 server.  I got an error copying a file on the array this evening and, after running arcconf, found the following:

Logical device number 2
   Logical device name                      : BigVol
   RAID level                               : 1E
   Status of logical device                 : Degraded
   Size                                     : 1906676 MB
   Stripe-unit size                         : 256 KB
   Read-cache mode                          : Enabled
   MaxIQ preferred cache setting            : Enabled
   MaxIQ cache setting                      : Disabled
   Write-cache mode                         : Disabled (write-through)
   Write-cache setting                      : Disabled (write-through)
   Partitioned                              : Yes
   Protected by Hot-Spare                   : No
   Bootable                                 : No
   Failed stripes                           : No
   Power settings                           : Disabled
   --------------------------------------------------------
   Logical device segment information
   --------------------------------------------------------
   Segment 0                                : Present (0,1)             9VP5PGAE
   Segment 1                                : Inconsistent (0,2)             6VPAVNXX
   Segment 2                                : Present (0,3)      WD-WCAV5N519403
   Segment 3                                : Present (0,7)      WD-WCAV5P564765

I see this device isn't protected by a hot spare.  I have 2 spares installed on the controller (see attached configuration).  Can I use the arcconf utility to create a hot spare for this device?

Can I power off the drive with the server running and replace it?  'No' is an entirely acceptable answer. :)

Thanks!

--Ben
arcconfig.txt
Ben Conner, CTO, SAS developer, Asked:
 
Ben Conner, CTO, SAS developer, Author Commented:
Hi,

I initially went to iSCSI and then decided to build another box and migrate everything to VMware 5.1.  That project completed today, so I will break the old box down and do an extensive diagnostic on all the drives that were attached to it.

Had a long conversation with the tech folks at Adaptec and they were adept at using weasel words to explain why their controller didn't detect failing conditions on the drives.  I'll upgrade to the latest firmware, replace the drives, and build it back with 5.1.

Thanks for the help!  

--Ben
 
Eikroman Commented:
If the device is connected via backplane supporting hotswap, you can indeed replace the drive without shutting down the server.
If the drive is connected directly to the controller, I would not advise such action.


The command for setting up a hot spare for a specific logical drive looks like:
ARCCONF SETSTATE 1 DEVICE 0 5 HSP LOGICALDRIVE 2


(on controller 1 set state for the device 0 5 to be hotspare for logical drive 2)


And since the drive is in an inconsistent state but still online, you could force the bad drive into the failed state by issuing:
arcconf setstate 1 DEVICE 0 2 DDD
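
If you end up scripting these, here is a minimal Python sketch that assembles the two SETSTATE commands shown above as argument lists. The function names are made up for illustration, the keywords are copied from the commands in this comment, and the script only prints the commands rather than running them. It assumes `arcconf` is on the PATH if you later hand the lists to `subprocess.run`.

```python
def hot_spare_cmd(controller, channel, device, logical_drive):
    """Build the SETSTATE ... HSP LOGICALDRIVE command as an argument list.

    Hypothetical helper: mirrors the syntax quoted in the thread, nothing more.
    """
    return ["arcconf", "SETSTATE", str(controller),
            "DEVICE", str(channel), str(device),
            "HSP", "LOGICALDRIVE", str(logical_drive)]


def force_failed_cmd(controller, channel, device):
    """Build the SETSTATE ... DDD command that forces a drive into failed state."""
    return ["arcconf", "SETSTATE", str(controller),
            "DEVICE", str(channel), str(device), "DDD"]


if __name__ == "__main__":
    # Dry run: show the commands instead of executing them.
    print(" ".join(hot_spare_cmd(1, 0, 5, 2)))
    print(" ".join(force_failed_cmd(1, 0, 2)))
```

Keeping the commands as lists avoids shell quoting problems, and printing first gives you a chance to check the controller, channel and device numbers before anything touches the array.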

However, it seems to me that there is a configuration problem, so I would not recommend any action before you confirm the following:
First of all, both drives are already GLOBAL hot spares, so they need not be assigned anywhere. Also, to my knowledge, a hot spare drive usually cannot be part of a logical drive.
Yet both of your hot spares (0,5 and 0,6) are part of logical drives 1 and 2 respectively.
I can only speculate that two other drives in your system failed or were removed and those hot spares kicked in. So setting up 0 5 as a hot spare dedicated to LD2 will not work, as the drive is already in use.
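
One way to double-check whether a spare is already a member of a logical drive is to pull the (channel,device) pairs out of the segment lines. A rough Python sketch, assuming the segment lines look like the ones posted in the question; `segment_devices` and `spares_in_use` are hypothetical helper names. The sample is only the LD 2 excerpt from the question, not the full arcconfig.txt attachment, so its result says nothing about the complete configuration.

```python
import re

# Matches e.g. "Segment 1 : Inconsistent (0,2)   6VPAVNXX"
SEGMENT_RE = re.compile(r"^\s*Segment\s+(\d+)\s*:\s*(\S+)\s*\((\d+),(\d+)\)")


def segment_devices(ld_text):
    """Return {(channel, device): state} for every segment line in the text."""
    members = {}
    for line in ld_text.splitlines():
        m = SEGMENT_RE.match(line)
        if m:
            members[(int(m.group(3)), int(m.group(4)))] = m.group(2)
    return members


def spares_in_use(spares, ld_text):
    """Return the spares that also show up as segments of a logical drive."""
    members = segment_devices(ld_text)
    return sorted(s for s in spares if s in members)


SAMPLE = """\
   Segment 0                                : Present (0,1)             9VP5PGAE
   Segment 1                                : Inconsistent (0,2)             6VPAVNXX
   Segment 2                                : Present (0,3)      WD-WCAV5N519403
   Segment 3                                : Present (0,7)      WD-WCAV5P564765
"""

if __name__ == "__main__":
    print(segment_devices(SAMPLE))
    print(spares_in_use([(0, 5), (0, 6)], SAMPLE))
```

Feed it the full logical device section of the config rather than the excerpt to compare every logical drive against the spare list.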

According to the config, you have a total of 7 physical drives connected to your controller.
Is that correct?

As a side note, drive 0 5 looks like it is failing as well, since it reports this:
         S.M.A.R.T.                         : Yes
         S.M.A.R.T. warnings                : 1912
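
To pull that counter out of a saved config dump for every drive at once, something like this Python sketch could work. It assumes the dump uses the same `S.M.A.R.T. warnings : N` line shape as the two lines above; `smart_warning_counts` is a made-up helper name, and the sample text is just those two lines.

```python
import re

# Only the "warnings" line carries a number; the plain "S.M.A.R.T. : Yes"
# line does not match this pattern.
SMART_WARN_RE = re.compile(r"S\.M\.A\.R\.T\. warnings\s*:\s*(\d+)")


def smart_warning_counts(pd_text):
    """Return every S.M.A.R.T. warning count found, in order of appearance."""
    return [int(n) for n in SMART_WARN_RE.findall(pd_text)]


SAMPLE = """\
         S.M.A.R.T.                         : Yes
         S.M.A.R.T. warnings                : 1912
"""

if __name__ == "__main__":
    for count in smart_warning_counts(SAMPLE):
        if count > 0:
            print(f"drive reports {count} S.M.A.R.T. warnings")
```

Any non-zero count is worth a closer look at the drive before trusting it as a spare.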


 
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
I would change the hard drive whilst the server is running.

Please also make sure you have valid backups of your VMs (not VMware Snapshots) before you make any RAID changes.
 
Eikroman Commented:
I would change the hard drive whilst the server is running.

Even without hotswap backplane?
 
Ben Conner, CTO, SAS developer, Author Commented:
Ouch.  Device 0 on cable 0,0 isn't even registering.  It is attached to the controller.  Would the controller see it if I had never defined it?

The good news is I have a pile of new drives ready to be put in place.  I just need to make sure this is done properly so I don't step in it up to my neck.

--Ben
 
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
The controller should see the device, unless it's become faulty.
 
Ben Conner, CTO, SAS developer, Author Commented:
Footnote: I just issued a rescan and it didn't find device 0, so I will swap that out tomorrow.  I won't be back in the office until then.

--Ben
 
Ben Conner, CTO, SAS developer, Author Commented:
Before I got 20 minutes away from the office, the RAID arrays failed this morning.  Spent some time on the phone with Adaptec support, who couldn't tell me why the failed drives didn't trigger the alarm on the card.  What's even better, one of the arrays (RAID 1E configuration) still thinks it's intact, so I'm copying VMs off that box onto the mirrored drives, once those had been rebuilt.  It's going to be a long night.

--Ben
 
Eikroman Commented:
It's unfortunate. Although I was right about the previous failure, we didn't have time to resolve it.

Did all arrays fail?
And what kind of alert did you expect? We have a script that periodically monitors the state of the controller by analyzing arcconf's output for the getconfig and getlogs commands. If an inconsistency is found, an alert is raised in our management system (we use Oracle Enterprise Manager) and whoever is responsible is notified.
We also have a sound alert activated on the controller, but we found it useless, as 99% of the time no one hears it before it mutes itself.
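
A stripped-down idea of what such a check can look like in Python, for illustration only (this is a sketch, not our production script). It assumes the logical device output has the same layout as the dump in the question, that `arcconf getconfig 1 ld` is how the text is collected, and that a healthy logical drive reports `Optimal`, which does not appear anywhere in this thread. `read_ld_config` and `find_problems` are made-up names, and the hookup to a management system is left out.

```python
import re
import subprocess


def read_ld_config(controller=1):
    """Run 'arcconf getconfig <controller> ld' and return its text output.

    Assumes arcconf is installed and on the PATH.
    """
    result = subprocess.run(
        ["arcconf", "getconfig", str(controller), "ld"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def find_problems(config_text):
    """Return human-readable findings for degraded drives and bad segments."""
    problems = []
    current_ld = None
    for raw in config_text.splitlines():
        line = raw.strip()
        m = re.match(r"Logical device number\s+(\d+)", line)
        if m:
            current_ld = int(m.group(1))
            continue
        m = re.match(r"Status of logical device\s*:\s*(.+)", line)
        if m:
            status = m.group(1).strip()
            if status != "Optimal":  # assumption: healthy status string
                problems.append(f"LD {current_ld}: status {status}")
            continue
        m = re.match(r"Segment\s+(\d+)\s*:\s*(\S+)\s*\((\d+),(\d+)\)", line)
        if m and m.group(2) != "Present":
            problems.append(
                f"LD {current_ld}: segment {m.group(1)} on "
                f"({m.group(3)},{m.group(4)}) is {m.group(2)}"
            )
    return problems


SAMPLE = """\
Logical device number 2
   Status of logical device                 : Degraded
   Segment 0                                : Present (0,1)             9VP5PGAE
   Segment 1                                : Inconsistent (0,2)             6VPAVNXX
"""

if __name__ == "__main__":
    # Uses the excerpt from the question; swap in read_ld_config() on a real host.
    for finding in find_problems(SAMPLE):
        print(finding)
```

Run from cron or a scheduled task, anything `find_problems` returns can be mailed out or pushed to whatever monitoring you already have.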

Let me know if you require any additional help.
 
Ben Conner, CTO, SAS developer, Author Commented:
Everything's turning to crap in my hands about now.  I had the version of arcconf that can talk to the 5805 installed on one of the VMs, and that VM died.  And now I can't find the install package I used to install it.

Also, now when I define a new VM on a known good datastore, the drive isn't visible to the BIOS.  This isn't going to be pretty tomorrow.

Any suggestions?

--Ben
 
Eikroman Commented:
The ARCCONF utility is available with the CIM provider from Adaptec. While sorting out hardware-related problems, you may use a locally installed ARCCONF.

For example, http://www.adaptec.com/en-us/speed/raid/storage_manager/cim_vmware_v7_00_18781_zip.htm

The zip contains arcconf for Linux and Windows.

Regarding the BIOS...
Can you confirm that in the NEW VM, it doesn't see its virtual drive in the BIOS?
Or during the installation of the OS?

If a VM doesn't see its own virtual disk in the BIOS, I would think it is corrupted or the disk wasn't created. Recreate it with defaults and see.

If it doesn't see the disk during the install of the OS, the OS type and the virtual controller type are probably mismatched, and there are no drivers for that controller in the OS distribution. Change the controller to the one recommended for the specific OS.
 
Ben Conner, CTO, SAS developer, Author Commented:
Yes, the BIOS has no drives listed.  And when installing an OS, it can't see a drive, but I can see it in the configuration for that VM.  And I can see it was created when browsing the datastore.

Thanks for the link; was pulling the wrong executable down.  

Am running a verify-fix on the logical drive giving me fits.  Will see how that turns out.

--Ben
 
Eikroman Commented:
Yes, the bios has no drives listed.  And when installing an OS, it can't see a drive, but I can see it in the configuration for that VM.  And I can see it was created when browsing the datastore.

I'm puzzled. I wasn't able to reproduce this behavior. Does recreating the VM with typical settings help?
 
Ben Conner, CTO, SAS developer, Author Commented:
It didn't help using typical settings.  I created a new datastore from an iSCSI connection and was able to define the machine there.  I'm going to move off the rest of the VMs from that datastore and delete it, then take the drives out of circulation.

Thanks!

--Ben
 
Eikroman Commented:
I would not object, but the question evolved into a separate one that should have been asked in another thread.

Original question was about current configuration and the ability to assign a hot spare:

I see this device isn't protected by a hot spare.  I have 2 spares installed on the controller (see attached configuration).  Can I use the arcconf utility to create a hot spare for this device?

The requested arcconf command was provided. In addition, the configuration log was analysed, additional problems were pointed out, and a course of action was suggested. While the suggestion did not lead to a happy end, it did highlight important problems and so assisted the eventual solution of migrating to a new server.

While I do EE stuff for my own fun, from now on I'll be considering a user's history of deleted questions as a criterion for whether it is worth participating.
 
Ben Conner, CTO, SAS developer, Author Commented:
The solutions provided didn't work, and after consulting Adaptec they admitted the controller firmware I was using couldn't report failing drives as advertised.  I've since migrated to a new box with a new LSI RAID controller.
Question has a verified solution.