Raid5 server

eaglerod
I have a RAID 5 server that keeps going into scandisk and stopping at 34%. I can see all 3 hard drives, including the spare, on the HighPoint 1720 controller. It says that my RAID volume is critical. What do I do to fix it?

Commented:
You'll need to go into the RAID controller setup (usually accessible while the RAID controller is initialising at boot). That should list any errors with the disks. If the volume is critical, it sounds like one (or more) of the disks in the RAID may have failed, or there's a predictive failure.
I have been told to never, ever run scandisk or fix errors from Windows. bedind is correct: use the RAID controller BIOS and go by what it is telling you. In there you should also find something along the lines of scandisk, but you want to use the one in the RAID controller BIOS, not the one from Windows.
President
Top Expert 2010
Commented:
First - TAKE A FULL BACKUP.  NOW.  DO A FULL BACKUP SUITABLE FOR BARE METAL RESTORE.

You not only have a critical system (failed drive), you also have a large number of unrecoverable read errors. You have data corruption, partial data loss, and filesystem corruption.

There is no way to walk somebody through recovery from this kind of multiple-failure scenario, especially with that crappy controller. You will also need to replace at least 2 disk drives: the one that failed, and at least one of the remaining disks, which is in deep recovery and has a large number of bad blocks.

Since you only have 3 disks, and 2 are known bad ... go shopping for 2 new disks while it is backing up, and be prepared for a bare-metal restore.

(To walk you through recovery you would need some commercial software, a binary editor, some hex dumps, and somebody with lots of free time.)

I would dump the HighPoint, get 2 decent disks and use native RAID1 (host-based mirroring). It will be profoundly faster.

So once the backup is finished, yank the old drives and controller, restore to a single disk, convert it to dynamic, then let Windows turn it into a RAID1.

I must be missing something. I don't see anywhere above where there was a mention of 2 of the three hard drives being bad. I did see the mention of a spare. If the machine is running, I would first go into the software that came with the controller and should be running on the computer. It should tell you pretty much what is wrong with the controller and how to fix it. It could be something simple, like needing to invoke the spare to replace a defective drive. It could simply tell you that you need to replace one drive and do a rebuild. Just the fact that the computer is still running would imply that the RAID is repairable.

Either use the utility software that came with the controller or enter the controller firmware at boot to diagnose the problem. Either way it should explicitly tell you what the problem is and give you a really good idea of how to fix it.
David
President
Top Expert 2010

Commented:
Jimbecher:  Here is the logic ...

One drive is critical. That is a given, since the 3-drive array is degraded.

The system is seemingly locking up at 34%, so that means it is in deep recovery. Since only 2 disks are active in the RAID, at least one of the 2 remaining disks has so many bad blocks that it is locking up.

There is nothing you can fix. The data on the down drive is stale, and since there is no redundancy there is nothing you can do. There is nothing wrong with the controller. If the logical volume is degraded and the controller can't read a block on one of the surviving disks, it locks up.
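
To make the "no redundancy" point concrete, here is a minimal sketch of how RAID 5 parity works on a single stripe of a 3-drive array. It is a toy model only: the block contents and the helper names `xor_blocks` and `rebuild_missing` are made up for illustration, and a real controller also rotates parity across drives and works on much larger blocks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (the RAID 5 parity operation)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe):
    """Rebuild the one unreadable block (marked None) in a stripe.

    Raises ValueError when more than one block is unreadable, because
    single parity can only make up for one missing block per stripe.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError("two or more blocks unreadable: stripe is lost")
    if not missing:
        return list(stripe)
    survivors = [block for block in stripe if block is not None]
    repaired = list(stripe)
    repaired[missing[0]] = xor_blocks(survivors)
    return repaired


# Hypothetical stripe: two data blocks plus their parity block.
d0 = b"ACCOUNTS"
d1 = b"PAYROLL!"
parity = xor_blocks([d0, d1])

# One failed drive: the missing block is recomputed from the other two.
degraded = [d0, None, parity]
recovered = rebuild_missing(degraded)

# Failed drive plus a bad block on a surviving drive: nothing to compute from.
double_fault = [None, None, parity]
```

With `degraded`, `rebuild_missing` returns the original `d1`. With `double_fault` it raises `ValueError`, which is the situation described above: once the array is degraded, every bad block on a surviving disk is data that cannot be reconstructed.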
Top Expert 2014

Commented:
To clarify the point about running scandisk etc. on a RAID array: this is fine if the array is optimal and you want to fix minor filesystem corruption, but it is very ill-advised on a degraded array. As dlethe mentions, with one disk down the controller may still get the data off a flaky block by retrying a few times, so if the failed disk were replaced the array might rebuild successfully and all would eventually be good. Run scandisk before the array is rebuilt, however, and the controller has to report to the OS that the sector is bad. The OS will tell the controller to retry, eventually give up, write zeros to a replacement sector and flag the original as bad, so data that could have been recovered by the controller gets overwritten by the OS.
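
The difference between the two paths can be sketched with a toy model. Everything here is hypothetical: the `FlakySector` class, the function names and the retry counts (8 for the rebuild, 3 for the filesystem check) are invented for illustration and are not taken from any real controller or from Windows. The only point is that one path never overwrites the sector and the other one does.

```python
class FlakySector:
    """Hypothetical sector that only reads cleanly on its Nth attempt."""

    def __init__(self, data, succeeds_on_attempt):
        self.data = data
        self.succeeds_on_attempt = succeeds_on_attempt
        self.attempts = 0

    def read(self):
        self.attempts += 1
        if self.attempts >= self.succeeds_on_attempt:
            return self.data
        return None  # simulated read error


def read_with_retries(sector, max_retries):
    """Try the read up to max_retries times; return None if all fail."""
    for _ in range(max_retries):
        data = sector.read()
        if data is not None:
            return data
    return None


def rebuild_read(sector, max_retries=8):
    """Rebuild path: retry the read, never overwrite the sector."""
    return read_with_retries(sector, max_retries)


def scandisk_repair(sector, max_retries=3):
    """Filesystem-check path: give up after a few tries and zero-fill."""
    data = read_with_retries(sector, max_retries)
    if data is None:
        # Model the remap to a replacement sector filled with zeros.
        sector.data = bytes(len(sector.data))
        sector.succeeds_on_attempt = 0
        return sector.data
    return data


# A sector that needs five attempts before it reads cleanly.
rebuilt = rebuild_read(FlakySector(b"INVOICE1", succeeds_on_attempt=5))

# The same kind of sector, but scandisk gets to it first.
victim = FlakySector(b"INVOICE1", succeeds_on_attempt=5)
after_scandisk = scandisk_repair(victim)
later_rebuild = rebuild_read(victim)
```

In this model `rebuilt` holds the original bytes, while `after_scandisk` and `later_rebuild` are both all zeros: once the repair has overwritten the sector, a later rebuild can only copy the zeros.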
