Dell MD1200 (R810) Running Slow after Bad Drive Replaced; Hot Spare Shows 2 Green LEDs

bhunger
The server is an R810 with four MD1200s attached, running an Oracle database. The system had been running fine until last Thursday, when it flagged a bad drive on the first MD1200. We hot-swapped it and the rebuild appeared to go normally. However, even though the drive's LED is a normal green and the system reports it as online, performance is noticeably slower. I also noticed that the LEDs on the hot spare of that MD1200 are BOTH solid green, while the other three MD1200s have only the top LED green (the bottom LED is off). Does this suggest that it's still in some kind of rebuild almost a week later? CPU, RAM and network all seem to check out. Non-Oracle jobs are slow too, so it's probably not Oracle.
Top Expert 2014

Commented:
Are they connected to a PERC or to an MD3xxx?

Either way, you need to look in OMSA (or MDSM if connected to an MD3xxx) to see the status rather than relying on the LEDs.

Author

Commented:
It's a Dell H800.  We're running Solaris 10 on the server.  Can we even install DSM on this platform?
Top Expert 2014

Commented:
Oh yes, this is the one where you have to use MegaCli, as there is no GUI.

Can you post the output of MegaCli -PDInfo -PhysDrv -aAll?

Also check the battery status and attach the log so I can skim through it.
https://www.dell.com/support/article/uk/en/ukdhs1/sln292232/extracting-the-raid-controller-logs-via-megacli?lang=en
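
If the full listing is too long to read by eye, a small script can pull out just the drive states. The following is a minimal sketch, assuming the listing was saved from something like MegaCli -PDList -aALL and that each drive block contains "Enclosure Device ID", "Slot Number" and "Firmware state" lines; those field names are assumptions about the output format, and the function names are hypothetical.

```python
def summarize_drives(pdlist_text):
    """Return (enclosure, slot, firmware_state) for each drive block.

    Assumes each drive block lists 'Enclosure Device ID', then
    'Slot Number', then 'Firmware state' (an assumed layout).
    """
    drives = []
    current = {}
    for line in pdlist_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        key, value = key.strip(), value.strip()
        if key == "Enclosure Device ID":
            current = {"enclosure": value}  # start of a new drive block
        elif key == "Slot Number":
            current["slot"] = value
        elif key == "Firmware state":
            drives.append((current.get("enclosure"), current.get("slot"), value))
    return drives


def needs_attention(drives):
    """Keep drives whose state is neither online nor an idle hot spare."""
    healthy_prefixes = ("Online", "Hotspare")  # assumed state strings
    return [d for d in drives if not d[2].startswith(healthy_prefixes)]
```

Running needs_attention(summarize_drives(text)) on the saved output should leave only drives that are still rebuilding, failed, or unconfigured, which answers the rebuild question more reliably than the LEDs.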

Author

Commented:
Here's the MegaCli output. I couldn't run it with the exact parameters that you supplied, but 100% of the drives on both controllers are showing "write cache disabled." This is despite the fact that we checked the config parameter in the H800 (and in the other controller, an H700), which basically said that it would NOT disable the write cache even if the battery was dead. The system is in a data center with two electrical whips on two separate power feeds. Is there some system-wide override at work here?
megacli_20190619.txt
Top Expert 2014

Commented:
Write cache disabled is normal for the disks. That setting is the disk's own write cache, not the controller's battery-backed cache.
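
To check the controller's cache rather than the disks', the virtual disk listing is the place to look. Here is a rough sketch, assuming output saved from something like MegaCli -LDInfo -Lall -aALL with "Virtual Drive", "Default Cache Policy" and "Current Cache Policy" lines; the field names and the WriteBack/WriteThrough values are assumptions about that output, and cache_fallbacks is a hypothetical helper name.

```python
def cache_fallbacks(ldinfo_text):
    """Return IDs of virtual drives configured for write-back
    that are currently running in write-through."""
    fallbacks = []
    vd_id = None
    default_policy = ""
    for line in ldinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        key, value = key.strip(), value.strip()
        if key == "Virtual Drive":
            # e.g. "Virtual Drive: 0 (Target Id: 0)" -> "0"
            vd_id = value.split()[0] if value else None
            default_policy = ""
        elif key == "Default Cache Policy":
            default_policy = value
        elif key == "Current Cache Policy":
            if default_policy.startswith("WriteBack") and value.startswith("WriteThrough"):
                fallbacks.append(vd_id)
    return fallbacks
```

Any virtual drive this returns is set to write-back but is actually running write-through, which would point at the controller or its battery rather than the disks. The "Disk Cache Policy" line in the same listing is the per-disk setting and is deliberately ignored here.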

Author

Commented:
Okay, thanks. I was hoping that that was the source of the sluggishness.  Still trying to figure it out then.
Top Expert 2014

Commented:
The TTY log will show the disk failures and replacements. It might be worth posting, although it's quite a bit to read through.

Author

Commented:
Back at the data center. The server is still really slow. The R810 LCD shows the message "controller battery failed." This is after we replaced it last night and received the message "...charging battery next 24 hours."

Again, in the H800 BIOS, we checked the option to use write-back cache regardless of battery state.

Also, the hot spare drive (slot 11) is still showing two solid greens rather than top green and bottom unlit (like the other three working MD1200s).
Top Expert 2014

Commented:
The LCD of an R810 is tuned to display errors from the motherboard and the default (onboard) storage controller. It's not driven by additional controllers that have external enclosures connected to them. I think your MegaCli output is from the wrong controller.
Commented:
Finally fixed the issue. It took replacing the H800 controller. Now performance is back to normal and the write cache is apparently working normally. Thanks.
