eaglerod

Raid5 server

I have a RAID 5 server that keeps going into scandisk at boot and stopping at 34%. I can see all 3 hard drives, including the spare, on the HighPoint 1720 controller. It says that my RAID volume is critical. What do I do to fix it?
Windows Server 2003, Storage Hardware, Server Hardware

Last Comment
andyalder

8/22/2022 - Mon
bedind

You'll need to go into the RAID controller setup (usually accessible while the controller is initialising at boot). That should list any errors with the disks. If the volume is critical, it sounds like one (or more) of the disks in the array has failed, or there's a predictive failure warning.
jimbecher

I have been told to never, ever run scandisk/fix errors from Windows. bedind is correct: use the RAID controller BIOS and go by what it is telling you. In there you should also find something along the lines of scandisk, but use the one in the controller BIOS, not the one from Windows.
ASKER CERTIFIED SOLUTION
David

jimbecher

I must be missing something. I don't see anywhere above where there was a mention of 2 of the three hard drives being bad. I did see the mention of a spare. If it is running, I would first go into the software that came with the controller, which should be running on the computer. It should tell you pretty much what is wrong with the array and how to fix it. It could be something simple, like needing to invoke the spare to replace a defective drive. It could simply tell you that you need to replace one drive and do a rebuild. Just the fact that the computer is still running would imply that the RAID is repairable.

   Either use the utility software that came with the controller to diagnose the problem or enter the controller firmware at boot to diagnose the problem. Either way it should explicitly tell you what the problem is and give you a real good idea how to fix it.
David

Jimbecher:  Here is the logic ...

One drive is critical - that is a given since the 3-drive array is degraded

The system is seemingly locking up at 34%, so that means it is in deep recovery.  Since only 2 disks are active in the RAID, at least one of those 2 remaining disks has so many bad blocks that it is locking up.

There is nothing you can fix.  The data on the down drive is stale, and since there is no redundancy there is nothing you can do.  There is nothing wrong with the controller.  If the logical volume is degraded and the controller can't read a block from one of the surviving disks, it locks up.
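
To make the parity argument concrete, here is a minimal Python sketch of how a RAID 5 stripe is rebuilt with XOR. It is a toy model under stated assumptions, not how the HighPoint firmware actually works: the names `xor_blocks` and `rebuild_missing`, the 4-byte blocks, and the single stripe are all made up for illustration.

```python
from functools import reduce
from typing import Optional, Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe: Sequence[Optional[bytes]]) -> bytes:
    """Rebuild the single missing block (None) in a stripe from the others.

    Raises ValueError unless exactly one block is missing, because
    single parity cannot cover more than one loss per stripe.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise ValueError(f"need exactly 1 missing block, found {len(missing)}")
    survivors = [block for block in stripe if block is not None]
    return xor_blocks(survivors)


# A 3-drive stripe (hypothetical contents): two data blocks plus parity.
d0 = b"\x12\x34\x56\x78"
d1 = b"\xab\xcd\xef\x01"
parity = xor_blocks([d0, d1])

# Degraded array: drive 0 is offline, the other two blocks read fine.
recovered = rebuild_missing([None, d1, parity])

# Degraded array plus an unreadable block on a surviving drive:
# two blocks of the stripe are gone, so there is nothing to XOR against.
try:
    rebuild_missing([None, None, parity])
    unrecoverable = False
except ValueError:
    unrecoverable = True
```

With one drive down, every stripe needs both remaining blocks to be readable. A single bad block on either surviving disk leaves that stripe with nothing to reconstruct from, which is the situation described above.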
andyalder

To clarify the point about running scandisk etc. on a RAID array: this is fine if the array is optimal and you want to fix minor filesystem corruption, but it is very ill-advised on a degraded array. As dlethe mentions, with one disk down the controller may still get the data off a flaky block by retrying a few times, so if the failed disk were replaced the array might rebuild successfully and all would eventually be good. Run scandisk before the array is rebuilt, however, and the controller has to report to the OS that the sector is bad. The OS will tell the controller to retry, eventually give up, write zeros to a replacement sector and flag the original as bad, so data that could have been recovered by the controller gets overwritten by the OS.