Solved

setting a hot spare with arcconf

Posted on 2013-01-28
Last Modified: 2013-03-21
Hi,

I have a RAID 1E array on an Adaptec 5805 controller underneath a VMware 4.1 server.  I got an error copying a file on the array this evening, and on running arcconf I found the following:

Logical device number 2
   Logical device name                      : BigVol
   RAID level                               : 1E
   Status of logical device                 : Degraded
   Size                                     : 1906676 MB
   Stripe-unit size                         : 256 KB
   Read-cache mode                          : Enabled
   MaxIQ preferred cache setting            : Enabled
   MaxIQ cache setting                      : Disabled
   Write-cache mode                         : Disabled (write-through)
   Write-cache setting                      : Disabled (write-through)
   Partitioned                              : Yes
   Protected by Hot-Spare                   : No
   Bootable                                 : No
   Failed stripes                           : No
   Power settings                           : Disabled
   --------------------------------------------------------
   Logical device segment information
   --------------------------------------------------------
   Segment 0                                : Present (0,1)             9VP5PGAE
   Segment 1                                : Inconsistent (0,2)             6VPAVNXX
   Segment 2                                : Present (0,3)      WD-WCAV5N519403
   Segment 3                                : Present (0,7)      WD-WCAV5P564765

I see this device isn't protected by a hot spare.  I have 2 spares installed on the controller (see attached configuration).  Can I use the arcconf utility to create a hot spare for this device?

Can I power off the drive with the server running and replace it?  'No' is an entirely acceptable answer. :)

Thanks!

--Ben
arcconfig.txt
Question by:Ben Conner
 
LVL 5

Expert Comment

by:Eikroman
ID: 38829966
If the device is connected via a backplane that supports hot swap, you can indeed replace the drive without shutting down the server.
If the drive is connected directly to the controller, I would not advise doing so.


The command for setting up a hot spare for a specific logical drive looks like:
ARCCONF SETSTATE 1 DEVICE 0 5 HSP LOGICALDRIVE 2


(on controller 1 set state for the device 0 5 to be hotspare for logical drive 2)
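For anyone scripting this, the pieces of that command can be assembled programmatically. What follows is a minimal Python sketch, not part of the original answer. The helper name `hot_spare_command` is hypothetical, and it only builds the argument list; it does not run arcconf. It assumes the argument order shown above: controller, channel, device, state, then optional LOGICALDRIVE numbers.

```python
def hot_spare_command(controller, channel, device, logical_drives=()):
    """Build the argument list for an arcconf SETSTATE ... HSP call.

    Hypothetical helper: it returns the arguments only and never
    executes arcconf, so it is safe to experiment with.
    """
    args = [
        "arcconf", "setstate", str(controller),
        "device", str(channel), str(device), "hsp",
    ]
    if logical_drives:
        # Dedicate the spare to the listed logical drives.
        args.append("logicaldrive")
        args.extend(str(ld) for ld in logical_drives)
    return args


# The command from this thread: controller 1, device 0 5, logical drive 2.
print(" ".join(hot_spare_command(1, 0, 5, [2])))
```

Leaving out the logical drive numbers drops the LOGICALDRIVE clause. My understanding is that arcconf then treats the drive as a global hot spare, but check the arcconf documentation for your version before relying on that.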


And since the drive is in an inconsistent state but still online, you could force the bad drive into a failed state by issuing:
arcconf setstate 1 DEVICE 0 2 DDD

However, it seems to me that there is a configuration problem, so I would not recommend any action before you confirm the following:
First of all, both drives are already GLOBAL hot spares, so they need not be assigned anywhere. Also, a hot spare drive usually cannot be part of a logical drive (to my knowledge).
Yet both of your hot spares (0,5 and 0,6) are part of logical drives 1 and 2 respectively.
I speculate that two other drives in your system have failed or been removed and those hot spares kicked in. So setting up 0 5 as a hot spare dedicated to LD2 will not work, as the drive is already in use.

According to the config, you have a total of 7 physical drives connected to your controller.
Is that correct?

And then, as a side note, it seems that drive 0 5 is failing as well, since it reports this:
         S.M.A.R.T.                         : Yes
         S.M.A.R.T. warnings                : 1912
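
A check like this can be automated. The following is a rough Python sketch, not taken from the attached configuration: it scans physical-device output for nonzero S.M.A.R.T. warning counts. The function name `smart_warnings` is hypothetical, and the `Device #N` header format is an assumption about how arcconf labels each drive in its physical-device listing; only the `S.M.A.R.T. warnings` line is taken from the output quoted above.

```python
import re


def smart_warnings(text):
    """Return {device_label: count} for drives with a nonzero warning count.

    Assumes each drive's block starts with a 'Device #N' line (an
    assumption about the output format) and contains a line such as
    'S.M.A.R.T. warnings : 1912'.
    """
    flagged = {}
    device = "unknown"
    for line in text.splitlines():
        stripped = line.strip()
        if re.fullmatch(r"Device #\d+", stripped):
            device = stripped
            continue
        match = re.fullmatch(r"S\.M\.A\.R\.T\. warnings\s*:\s*(\d+)", stripped)
        if match and int(match.group(1)) > 0:
            flagged[device] = int(match.group(1))
    return flagged
```

Feeding it the saved text of a physical-device listing returns only the drives worth a closer look, which makes it easy to wire into a periodic check.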


 
LVL 117
ID: 38830064
I would change the hard drive whilst the server is running.

Please also make sure you have valid backups of your VMs (not VMware Snapshots) before you make any RAID changes.
 
LVL 5

Expert Comment

by:Eikroman
ID: 38830099
I would change the hard drive whilst the server is running.

Even without a hot-swap backplane?
 

Author Comment

by:Ben Conner
ID: 38830811
Ouch.  Device 0 on cable 0,0 isn't even registering.  It is attached to the controller.  Would the controller see it if I had never defined it?

The good news is I have a pile of new drives ready to be put in place.  I just need to make sure this is done properly so I don't step in it up to my neck.

--Ben
 
LVL 117
ID: 38830856
the controller should see the device, unless it's become faulty.
 

Author Comment

by:Ben Conner
ID: 38830874
Footnote: I just issued a rescan and it didn't find device 0, so I will swap that out  tomorrow.  I won't be back in the office until then.

--Ben
 

Author Comment

by:Ben Conner
ID: 38833282
Before I got 20 minutes away from the office, the raid arrays failed this morning.  Spent some time on the phone with Adaptec support, who couldn't tell me why the failed drives didn't trigger the alarm on the card.  What's even better, one of the arrays (Raid 1E configuration) still thinks it's intact, so I'm copying VMs off that box onto the mirrored drives, once those had been rebuilt.  It's going to be a long night.

--Ben
 
LVL 5

Expert Comment

by:Eikroman
ID: 38833963
It's unfortunate. Although I was right about the previous failure, we didn't have time to resolve it.

Did all the arrays fail?
And what kind of alert did you expect? We have a script that periodically monitors the state of the controller by analyzing arcconf's output for the getconfig and getlogs commands. If an inconsistency is found, an alert is raised in our management system (we use Oracle Enterprise Manager) and whoever is responsible is notified.
We also have a sound alert activated on the controller, but we found it useless, as 99% of the time no one hears it before it mutes itself.
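
To illustrate the kind of check such a script might perform, here is a minimal Python sketch. It is not the script described above. It parses logical-device output in the format Ben posted and reports any logical device whose status is not `Optimal` or whose segments are not `Present`. The function names are hypothetical, and treating `Optimal` and `Present` as the only healthy values is an assumption.

```python
import re


def parse_logical_devices(text):
    """Parse 'Logical device number N' blocks from arcconf-style output."""
    devices = []
    current = None
    for line in text.splitlines():
        header = re.match(r"\s*Logical device number\s+(\d+)", line)
        if header:
            current = {"number": int(header.group(1)), "name": "",
                       "status": "", "segments": []}
            devices.append(current)
            continue
        if current is None or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Logical device name":
            current["name"] = value
        elif key == "Status of logical device":
            current["status"] = value
        elif re.fullmatch(r"Segment \d+", key):
            # Value looks like "Present (0,1)   9VP5PGAE"; keep the state word.
            state = value.split()[0] if value else ""
            current["segments"].append((key, state))
    return devices


def find_problems(text):
    """Return human-readable findings for anything that looks unhealthy."""
    problems = []
    for ld in parse_logical_devices(text):
        label = f"LD {ld['number']} ({ld['name']})"
        if ld["status"] != "Optimal":  # assumed healthy value
            problems.append(f"{label}: status {ld['status']}")
        for segment, state in ld["segments"]:
            if state != "Present":  # assumed healthy value
                problems.append(f"{label}: {segment} is {state}")
    return problems
```

Run against the output at the top of this thread, this would flag both the Degraded status and the Inconsistent segment. The returned list could then be handed to whatever alerting system is in use.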

Let me know if you require any additional help.

 

Author Comment

by:Ben Conner
ID: 38834274
Everything's turning to crap in my hands about now.  I had the version of arcconf that can talk to the 5805 installed on one of the VMs, and that VM died.  And now I can't find the install package I used to install it.

Also, now when I define a new VM on a known good datastore, the drive isn't visible to the BIOS.  This isn't going to be pretty tomorrow.

Any suggestions?

--Ben
 
LVL 5

Expert Comment

by:Eikroman
ID: 38834372
The ARCCONF utility is available with the CIM provider from Adaptec. While sorting out hardware-related problems, you may use a locally installed ARCCONF.

For example, http://www.adaptec.com/en-us/speed/raid/storage_manager/cim_vmware_v7_00_18781_zip.htm

The zip contains arcconf for linux and windows.

Regarding the BIOS...
Can you confirm that in the NEW VM, it doesn't see its virtual drive in the BIOS?
Or during the installation of the OS?

If a VM doesn't see its own virtual disk in the BIOS, I would think it is corrupted or... the disk wasn't created. Recreate with defaults and see.

If it doesn't see the disk during the install of the OS, the OS type and the virtual controller type are probably mismatched, and the OS distribution has no drivers for that controller. Change the controller to the one recommended for the specific OS.
 

Author Comment

by:Ben Conner
ID: 38834959
Yes, the bios has no drives listed.  And when installing an OS, it can't see a drive, but I can see it in the configuration for that VM.  And I can see it was created when browsing the datastore.

Thanks for the link; was pulling the wrong executable down.  

Am running a verify-fix on the logical drive giving me fits.  Will see how that turns out.

--Ben
 
LVL 5

Expert Comment

by:Eikroman
ID: 38835031
Yes, the bios has no drives listed.  And when installing an OS, it can't see a drive, but I can see it in the configuration for that VM.  And I can see it was created when browsing the datastore.

I'm puzzled. I wasn't able to reproduce this behavior. Does recreating the VM with typical settings help?
 

Author Comment

by:Ben Conner
ID: 38846656
It didn't help using typical settings.  I created a new datastore from an iSCSI connection and was able to define the machine there.  I'm going to move off the rest of the VMs from that datastore and delete it, then take the drives out of circulation.

Thanks!

--Ben
 

Accepted Solution

by:
Ben Conner earned 0 total points
ID: 38923477
Hi,

I initially went to iSCSI and then decided to build another box and migrate everything to VMWare 5.1.  That project completed today, so I will break the old box down and do an extensive diagnostic on all the drives that were attached to it.

Had a long conversation with the tech folks at Adaptec and they were adept at using weasel words to explain why their controller didn't detect failing conditions on the drives.  I'll upgrade to the latest firmware, replace the drives, and build it back with 5.1.

Thanks for the help!  

--Ben
 
LVL 5

Expert Comment

by:Eikroman
ID: 38992183
I would not object, but the question seemed to evolve into a separate question which should have been asked in another thread.

Original question was about current configuration and the ability to assign a hot spare:

I see this device isn't protected by a hot spare.  I have 2 spares installed on the controller(see attached configuration).  Can I use the arcconf utility to create a hot spare for this device?

The requested arcconf command was provided. In addition, the configuration log was analysed, additional problems were pointed out, and a course of action was suggested. While the suggestion did not lead to a happy ending, it did highlight important problems, and thus assisted the solution of migrating to a new server.

While I do EE stuff for my personal fun, from now on I'll be considering a user's history of deleted questions as a criterion for whether it is worth participating.
 

Author Closing Comment

by:Ben Conner
ID: 39006319
The solutions provided didn't work, and after consulting Adaptec they admitted the controller firmware I was using couldn't report failing drives as advertised.  I've since migrated to a new box with a new LSI raid controller.