Solved

Problems with Microsoft Server 2003 R2 after RAID5 trouble

Posted on 2010-09-02
Last Modified: 2012-05-10
In my HP DL380 server I have RAID5 with a hot spare (3+1). Recently one of the working disks failed; I rebooted the server and activated the hot spare. The RAID synchronized after a few hours and everything seemed to be working, but there are some performance problems.
Perfmon shows graphs like  .|.|.|.|.| (0 to 100%) on disk queue when MSSQL is "waking up", for example when someone launches an app for the first time or when using SharePoint. After 5-15 minutes it stops peaking and stays at a normal low level, but those peaks keep appearing several times during the day.
The second problem is that since this crash the HP backup software (Backup Express) has been showing errors and cannot back up files on drive C:.
Today even the DHCP service had some problems.
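
For anyone trying to pin down the same kind of .|.|.| pattern, here is a minimal sketch of how the peaks could be located in a list of disk queue samples (for example, values copied out of a Perfmon log). The function name `find_spikes`, the sample values and the threshold are all made up for illustration and are not from this server.

```python
from typing import List, Tuple


def find_spikes(samples: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs for each run of samples above threshold."""
    spikes = []
    start = None  # index where the current run above the threshold began
    for i, value in enumerate(samples):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            spikes.append((start, i - 1))
            start = None
    if start is not None:
        # The series ended while still above the threshold.
        spikes.append((start, len(samples) - 1))
    return spikes


# Hypothetical queue-length samples showing the alternating low/high pattern.
queue = [0.2, 0.3, 9.5, 0.4, 8.8, 0.1, 0.2, 7.9, 8.1, 0.3]
print(find_spikes(queue, threshold=2.0))
```

Matching the sample timestamps of each run against application start-ups or backup windows would show whether the peaks line up with a particular workload.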

Is it safe to run chkdsk on a running server and then maybe sfc /scannow?
Or maybe someone knows what happened?

Logs show nothing.

Question by:shlafrock
14 Comments
 
LVL 12

Expert Comment

by:mlongoh
ID: 33590877
Why did you have to reboot the server?  If you had a RAID5 array with a hot spare, the spare should have kicked in when the drive failed, without much impact beyond reduced performance while the spare drive synced.
 
LVL 88

Expert Comment

by:rindi
ID: 33591073
chkdsk should be run while the partition you are checking is offline, that means it is done while booting. During this time the server won't be usable, and it can take a lot of time depending on the partition size and number of errors it finds. But a chkdsk should nevertheless be done regularly. SFC /Scannow should be no problem.
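
To illustrate the two ways chkdsk can be invoked, here is a minimal sketch. It relies on standard chkdsk behaviour: without /f it only reports in read-only mode, while /f needs exclusive access to the volume, so on the system drive it is scheduled for the next boot. The helper name `chkdsk_command` is hypothetical, and the code only builds the argument list; it does not run anything.

```python
from typing import List


def chkdsk_command(drive: str, fix: bool = False) -> List[str]:
    """Build the chkdsk argument list for a drive letter such as 'C' or 'C:'."""
    drive = drive.rstrip(":\\") + ":"  # normalise 'C', 'C:' and 'C:\' to 'C:'
    cmd = ["chkdsk", drive]
    if fix:
        # /f repairs errors; it needs the volume locked, hence the boot-time run.
        cmd.append("/f")
    return cmd


print(chkdsk_command("C"))             # read-only report on a live volume
print(chkdsk_command("C:", fix=True))  # repair run, done at boot for the system drive
print(["sfc", "/scannow"])             # system file check mentioned above
```

On a Windows machine these lists could be passed to `subprocess.run` from an administrator session; that step is left out here on purpose.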
 
LVL 21

Expert Comment

by:mastoo
ID: 33591384
Yes, why the reboot?  Maybe the drive failure was a controller error?  I'd check your SQL and Windows event logs to see if they are still complaining about drive problems.

 
LVL 1

Author Comment

by:shlafrock
ID: 33592101
I don't know why the hot spare did not activate itself when the drive failed. The HP Management logs said "Drive needs to be replaced" and the LED was orange, so I rebooted the server. The BIOS said the drive was inoperable or something like that, but the array still worked without the hot spare, so I turned off the server, pulled out the HDD and turned the server back on. The BIOS asked "Do you want to activate hot spare" and I pressed Y.
I had never dealt with hot spare activation before. Of course I thought it would activate automagically and everything would work fine, but it didn't.

Logs show nothing.
I can run chkdsk only on Sunday.
 
LVL 12

Expert Comment

by:mlongoh
ID: 33598588
You say that Backup Express and DHCP are generating errors. What are the errors?  Are they showing in the event log?
 
LVL 1

Author Comment

by:shlafrock
ID: 33602504
The Data Protector Express error is, I think: Invalid object ID
There's nothing in the event log.
I'll post tomorrow after I run chkdsk.
 
LVL 1

Author Comment

by:shlafrock
ID: 33625143
OK, so chkdsk showed nothing.
I replaced the drive, this time while the server was on. The sync started and finished after an hour or more, and the logs show that everything is OK, so I think everything is fine now, except that when I reboot the server I see:

Slot 1 HP Smart Array P400 Controller
1716-Slot 1 Drive Array - Unregenerable Media Errors Detected on Drives
during previous Rebuild or Auto-Reliability Monitoring ( ARM) scan.
Problem will be fixed automatically when the sector(s) are overwritten.
Backup and Restore recommended.

But in the HP Management app I see that everything is fine: no problems, everything works.
The HP DL380 UID LED is constantly blinking, which is annoying, and I cannot disable the blinking, I think because of the error above.


 
LVL 12

Expert Comment

by:mlongoh
ID: 33626170
Just a guess, but it sounds like the drive that was most recently replaced has one or more bad sectors and that they were marked bad, or you may have a faulty controller (given that you've had problems with two different drives).
 
LVL 1

Author Comment

by:shlafrock
ID: 33626375
But the logs show that everything is all right.
 
LVL 12

Accepted Solution

by:
mlongoh earned 500 total points
ID: 33626474
You get an error message every time you reboot the server (presumably when the RAID controller is initializing) and there's a fault error light?  There are a number of possibilities; the top three are some bad sectors, a bad drive, or a bad controller.  I'm hoping it's just some bad sectors and that once the sectors have been overwritten (I'm presuming they've been re-mapped already) the light will go out and the message will stop appearing.

Have you attempted to use the management interface that's available at boot up to see if you can clear the error or investigate further?  Usually (can't state for certain) there's a function/hot key when the controller is initializing that allows you to do some basic RAID management.

Is the server still under warranty?
 
LVL 1

Author Comment

by:shlafrock
ID: 33626567
I think I get it every time I reboot. I'll check this next Sunday.
 
LVL 88

Expert Comment

by:rindi
ID: 33626804
Usually if SATA-II isn't specifically mentioned anywhere, it's likely that only SATA-I is supported (at that time there was only one SATA standard out, so there was no point in specifying the version). Also, having looked at the OS support for this PC (when downloading drivers you can select the OS), it looks like it isn't one of the newest systems, so it is quite possible that it only uses SATA-I.
 
LVL 1

Author Closing Comment

by:shlafrock
ID: 33796966
.
 
LVL 12

Expert Comment

by:mlongoh
ID: 33797566
So what happened?
