Solved

setting a hot spare with arcconf

Posted on 2013-01-28
Last Modified: 2013-03-21
Hi,

I have a RAID 1E array on an Adaptec 5805 controller underneath a VMware 4.1 server. I got an error copying a file on the array this evening and, after running arcconf, found the following:

Logical device number 2
   Logical device name                      : BigVol
   RAID level                               : 1E
   Status of logical device                 : Degraded
   Size                                     : 1906676 MB
   Stripe-unit size                         : 256 KB
   Read-cache mode                          : Enabled
   MaxIQ preferred cache setting            : Enabled
   MaxIQ cache setting                      : Disabled
   Write-cache mode                         : Disabled (write-through)
   Write-cache setting                      : Disabled (write-through)
   Partitioned                              : Yes
   Protected by Hot-Spare                   : No
   Bootable                                 : No
   Failed stripes                           : No
   Power settings                           : Disabled
   --------------------------------------------------------
   Logical device segment information
   --------------------------------------------------------
   Segment 0                                : Present (0,1)             9VP5PGAE
   Segment 1                                : Inconsistent (0,2)             6VPAVNXX
   Segment 2                                : Present (0,3)      WD-WCAV5N519403
   Segment 3                                : Present (0,7)      WD-WCAV5P564765

I see this device isn't protected by a hot spare. I have 2 spares installed on the controller (see attached configuration). Can I use the arcconf utility to create a hot spare for this device?

Can I power off the drive with the server running and replace it?  'No' is an entirely acceptable answer. :)

Thanks!

--Ben
arcconfig.txt
Question by:Ben Conner
16 Comments
 
LVL 5

Expert Comment

by:Eikroman
ID: 38829966
If the device is connected via backplane supporting hotswap, you can indeed replace the drive without shutting down the server.
If the drive is connected directly to the controller, I would not advise such action.


The command for setting up a hot spare dedicated to a specific logical drive looks like:
ARCCONF SETSTATE 1 DEVICE 0 5 HSP LOGICALDRIVE 2

(On controller 1, set the state of device 0 5 to hot spare for logical drive 2.)
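If you want to script this, here is a minimal Python sketch that only builds the command line and prints it, so nothing touches the controller until you have reviewed it. The helper name `setstate_command` is made up for illustration, and it only accepts the two states mentioned in this thread (HSP and DDD); check the argument order against your arcconf version before running anything.

```python
import shlex

# States mentioned in this thread: HSP (hot spare) and DDD (force failed).
ALLOWED_STATES = {"HSP", "DDD"}


def setstate_command(controller, channel, device, state, logical_drive=None):
    """Build an `arcconf setstate` argument list without executing it.

    Hypothetical helper: it mirrors the syntax shown above and nothing more.
    """
    if state not in ALLOWED_STATES:
        raise ValueError(f"unsupported state: {state}")
    args = ["arcconf", "setstate", str(controller),
            "device", str(channel), str(device), state]
    if logical_drive is not None:
        if state != "HSP":
            raise ValueError("a logical drive is only given together with HSP")
        args += ["logicaldrive", str(logical_drive)]
    return args


# Dry run: print the command instead of running it.
print(shlex.join(setstate_command(1, 0, 5, "HSP", logical_drive=2)))
```

Keeping it as a dry run is deliberate: review the printed command first, then paste it into a shell yourself.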


And since the drive is in an inconsistent state but still online, you could force the bad drive into the failed state by issuing:
arcconf setstate 1 DEVICE 0 2 DDD

However, it seems to me that there is a configuration problem, so I would not recommend any action before you confirm the following.
First of all, both drives are already GLOBAL hot spares, so they do not need to be assigned anywhere. Also, to my knowledge, a hot spare usually cannot be part of a logical drive.
Yet both of your hot spares (0,5 and 0,6) are part of logical drives 1 and 2 respectively.
My guess is that two other drives in your system have failed or been removed and those hot spares kicked in. If so, setting up 0 5 as a hot spare dedicated to LD2 will not work, because the drive is already in use.

According to the config, you have a total of 7 physical drives connected to your controller.
Is that correct?

As a side note, drive 0 5 looks likely to fail as well, since it reports this:
         S.M.A.R.T.                         : Yes
         S.M.A.R.T. warnings                : 1912
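A check like this is easy to automate. Below is a minimal Python sketch that scans physical-device output for non-zero S.M.A.R.T. warning counts. The `S.M.A.R.T. warnings` line format is taken from the excerpt above; the `Device #N` header used to label each drive is an assumption about the surrounding output, and the sample text (including the second drive) is made up for illustration.

```python
import re

# Line format taken from the excerpt above.
SMART_RE = re.compile(r"^\s*S\.M\.A\.R\.T\. warnings\s*:\s*(\d+)\s*$")
# Assumed per-drive header; adjust to whatever your arcconf version prints.
DEVICE_RE = re.compile(r"^\s*Device #(\d+)\s*$")


def smart_warnings(pd_text):
    """Return a list of (device_label, count) for drives with warnings > 0."""
    flagged = []
    current = "unknown device"
    for line in pd_text.splitlines():
        header = DEVICE_RE.match(line)
        if header:
            current = f"Device #{header.group(1)}"
            continue
        warning = SMART_RE.match(line)
        if warning and int(warning.group(1)) > 0:
            flagged.append((current, int(warning.group(1))))
    return flagged


# Made-up sample in the assumed layout.
sample = """\
      Device #5
         S.M.A.R.T.                         : Yes
         S.M.A.R.T. warnings                : 1912
      Device #6
         S.M.A.R.T.                         : Yes
         S.M.A.R.T. warnings                : 0
"""
print(smart_warnings(sample))
```

Any drive that shows up in the returned list deserves a closer look before it is trusted as a spare.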

 
LVL 120
ID: 38830064
I would change the hard drive whilst the server is running.

Please also make sure you have valid backups of your VMs (not VMware snapshots) before you make any RAID changes.
 
LVL 5

Expert Comment

by:Eikroman
ID: 38830099
I would change the hard drive whilst the server is running.

Even without hotswap backplane?
 

Author Comment

by:Ben Conner
ID: 38830811
Ouch.  Device 0 on cable 0,0 isn't even registering.  It is attached to the controller.  Would the controller see it if I had never defined it?

The good news is I have a pile of new drives ready to be put in place.  I just need to make sure this is done properly so I don't step in it up to my neck.

--Ben
 
LVL 120
ID: 38830856
The controller should see the device, unless it has become faulty.
 

Author Comment

by:Ben Conner
ID: 38830874
Footnote: I just issued a rescan and it didn't find device 0, so I will swap that out  tomorrow.  I won't be back in the office until then.

--Ben
 

Author Comment

by:Ben Conner
ID: 38833282
Before I got 20 minutes away from the office, the RAID arrays failed this morning. Spent some time on the phone with Adaptec support, who couldn't tell me why the failed drives didn't trigger the alarm on the card. What's even better, one of the arrays (RAID 1E configuration) still thinks it's intact, so I'm copying VMs off that box onto the mirrored drives, now that those have been rebuilt. It's going to be a long night.

--Ben
 
LVL 5

Expert Comment

by:Eikroman
ID: 38833963
It's unfortunate. Although I was right about the previous failure, we didn't have time to resolve it.

Did all arrays fail?
And what kind of alert did you expect? We have a script that periodically monitors the state of the controller by analyzing arcconf's output for the getconfig and getlogs commands. If an inconsistency is found, an alert is raised in our management system (we use Oracle Enterprise Manager) and whoever is responsible is notified.
We also have a sound alert activated on the controller, but we found it useless, as 99% of the time no one hears it before it mutes itself.
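To give an idea of what such a check involves, here is a minimal Python sketch of the parsing half. It is not our production script. It reads logical-device output in the format posted in the question and reports any logical device whose status is not `Optimal` and any segment that is not `Present`. Treating `Optimal` as the healthy status is an assumption (the thread only shows `Degraded`), and the function name `find_problems` is made up for illustration.

```python
import re

NUMBER_RE = re.compile(r"^\s*Logical device number\s+(\d+)\s*$")
STATUS_RE = re.compile(r"^\s*Status of logical device\s*:\s*(.+?)\s*$")
SEGMENT_RE = re.compile(r"^\s*Segment\s+(\d+)\s*:\s*(\w+)\s*\((\d+),(\d+)\)")

HEALTHY_STATUS = "Optimal"    # assumption: healthy logical devices report this
HEALTHY_SEGMENT = "Present"   # taken from the output in the question


def find_problems(ld_text):
    """Return human-readable findings for unhealthy logical devices/segments."""
    problems = []
    current = None
    for line in ld_text.splitlines():
        number = NUMBER_RE.match(line)
        if number:
            current = int(number.group(1))
            continue
        status = STATUS_RE.match(line)
        if status:
            if status.group(1) != HEALTHY_STATUS:
                problems.append(
                    f"logical device {current}: status is {status.group(1)}")
            continue
        seg = SEGMENT_RE.match(line)
        if seg and seg.group(2) != HEALTHY_SEGMENT:
            problems.append(
                f"logical device {current}: segment {seg.group(1)} "
                f"at ({seg.group(3)},{seg.group(4)}) is {seg.group(2)}")
    return problems


# Excerpt of the output posted in the question.
sample_ld = """\
Logical device number 2
   Logical device name                      : BigVol
   Status of logical device                 : Degraded
   Segment 0                                : Present (0,1)             9VP5PGAE
   Segment 1                                : Inconsistent (0,2)             6VPAVNXX
"""
for finding in find_problems(sample_ld):
    print(finding)
```

In a real setup the text would come from running arcconf's getconfig command on a schedule (for example through `subprocess.run`), and the findings would go to whatever alerting system you use; both of those parts are left out here.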

Let me know if you require any additional help.
 

Author Comment

by:Ben Conner
ID: 38834274
Everything's turning to crap in my hands about now. I had the version of arcconf installed on one of the VMs that can talk to the 5805, and that VM died. And now I can't find the install package I used to install it.

Also, now when I define a new VM on a known good datastore, the drive isn't visible to the BIOS.  This isn't going to be pretty tomorrow.

Any suggestions?

--Ben
 
LVL 5

Expert Comment

by:Eikroman
ID: 38834372
The ARCCONF utility is available with the CIM provider from Adaptec. While sorting out hardware-related problems, you can use a locally installed ARCCONF.

For example, http://www.adaptec.com/en-us/speed/raid/storage_manager/cim_vmware_v7_00_18781_zip.htm

The zip contains arcconf for Linux and Windows.

Regarding the BIOS...
Can you confirm that in the NEW VM, it doesn't see its virtual drive in the BIOS?
Or during the installation of the OS?

If a VM doesn't see its own virtual disk in the BIOS, I would think it is corrupted or the disk wasn't created. Recreate it with defaults and see.

If it doesn't see the disk during the install of the OS, the OS type and the controller type probably don't match, and the OS distribution has no driver for that controller. Change the controller to the one recommended for the specific OS.
 

Author Comment

by:Ben Conner
ID: 38834959
Yes, the BIOS has no drives listed. And when installing an OS, it can't see a drive, but I can see it in the configuration for that VM. And I can see it was created when browsing the datastore.

Thanks for the link; was pulling the wrong executable down.  

Am running a verify-fix on the logical drive giving me fits.  Will see how that turns out.

--Ben
 
LVL 5

Expert Comment

by:Eikroman
ID: 38835031
Yes, the bios has no drives listed.  And when installing an OS, it can't see a drive, but I can see it in the configuration for that VM.  And I can see it was created when browsing the datastore.

I'm puzzled. I wasn't able to reproduce this behavior. Does recreating with typical settings help?
 

Author Comment

by:Ben Conner
ID: 38846656
It didn't help using typical settings.  I created a new datastore from an iSCSI connection and was able to define the machine there.  I'm going to move off the rest of the VMs from that datastore and delete it, then take the drives out of circulation.

Thanks!

--Ben
 

Accepted Solution

by:
Ben Conner earned 0 total points
ID: 38923477
Hi,

I initially went to iSCSI and then decided to build another box and migrate everything to VMware 5.1. That project completed today, so I will break the old box down and do an extensive diagnostic on all the drives that were attached to it.

Had a long conversation with the tech folks at Adaptec and they were adept at using weasel words to explain why their controller didn't detect failing conditions on the drives.  I'll upgrade to the latest firmware, replace the drives, and build it back with 5.1.

Thanks for the help!  

--Ben
 
LVL 5

Expert Comment

by:Eikroman
ID: 38992183
I would not object, but the question seemed to evolve into a separate question which should have been asked in another thread.

Original question was about current configuration and the ability to assign a hot spare:

I see this device isn't protected by a hot spare.  I have 2 spares installed on the controller(see attached configuration).  Can I use the arcconf utility to create a hot spare for this device?

The requested arcconf command was provided. In addition, the configuration log was analysed, additional problems were pointed out, and a course of action was suggested. While the suggestion did not lead to a happy ending, it did highlight important problems, and so it assisted the eventual solution of migrating to a new server.

While I do EE stuff for my own fun, from now on I'll consider a user's history of deleted questions as a criterion for whether it is worth participating.
 

Author Closing Comment

by:Ben Conner
ID: 39006319
The solutions provided didn't work, and after I consulted Adaptec, they admitted the controller firmware I was using couldn't report failing drives as advertised. I've since migrated to a new box with a new LSI RAID controller.