Highpoint RAID 404 Crashes During Verify

Posted on 2006-06-16
Medium Priority
Last Modified: 2012-08-13
Hey guys (and some gals),

We've got a serious problem. Our file server has been crashing while verifying its array. It doesn't seem to be inconsistent, or lose any data. It just crashes almost every time - three times today!

The server used to kick disks out randomly during verification, about once every two or three weeks, but that all but stopped when I upgraded from 2-disk RAID 1 to 4-disk RAID 1/0. After the upgrade, the RAID controller didn't kick a disk, or crash while verifying, for about six months.

We moved to a new office, and the file server got physically relocated three or four times while we were settling in and getting our floor plan hashed out. I think maybe the bumps and dings started this round of crash problems, but all the cables and cards are seated properly and firmly. We use all Highpoint ATA cables and an X-Connect power supply.

Since the move, the computer crashes about 85% of the time when verifying the array, usually about 30m-1hr into the verification process. It hasn't really been kicking the disks out except for one time three weeks ago when it kicked out !three of the four! disks. Now that was a fiasco. Luckily we have a robust backup policy and things were recovered (relatively) smoothly.

It used to give me video card errors, so I replaced the video card with something older, less adventurous, and presumably more stable. Now the errors don't refer to a video card driver anymore, they just say "The system has recovered from a serious error".

I don't believe it's a processor, motherboard, or memory problem because we have never had a crash that I am aware of that wasn't during the verification process. It's not the disk drives - even the ones that get kicked out are always fine. No clicks, no excess heat, no SMART errors. All four disks have been cycled through the system over the last month because of this problem, trying to see if a specific disk was responsible.

Our system is:
P4 2.8
1 gig Crucial
Intel D865PERLK
Highpoint RocketRAID 404 Controller
4x 200 Seagate ATA
WinXP Pro SP2 fully updated
2x 200mm Antec Quiet Fans (given to show that we do have adequate cooling)
Zalman Copper 92mm CPU fan (" ")

AVG Antivirus
AllSync Scheduler
PowerClock Server
BOINC - Seti@Home
HPT Service Manager
Therapist Helper Server
WinAmp (waiting room music)
Highpoint RAID Management Console

I think the problem might be related to the fact that the HPT cards offload the XOR routine to the processor. BOINC runs Seti@Home while the system is otherwise inactive, so I wonder if the offloaded XOR work might collide with the SETI processes, but this seems a little out there.

Can anyone suggest a method to properly troubleshoot what actually causes the crash, how to stop the crashing, or, as a last resort, a known good and stable ATA RAID controller with eight channels and a reasonable price?

I can't seem to find ANY reviews of RAID controllers that have a review period of longer (DAMN COMPUTER! Just crashed again right now while verifying) than a week, and we all know that a week or two is nowhere NEAR long enough to assess the capabilities of a RAID card for long-term reliability. It's like assessing a new car model for reliability by glancing at the interior in a magazine spread.

So I guess this is a multi-pronged request - troubleshoot, fix, or suggest a replacement that is compatible and known good.

Question by:slbriggsphd

Author Comment

ID: 16923879
Just checked all the components - no excess heat on the CPU, northbridge, memory, video card, disk drives, or RAID controller. Nothing is more than slightly warm to the touch.
LVL 44

Accepted Solution

scrathcyboy earned 690 total points
ID: 16925647
Clearly, the controller or the motherboard is going.  You should not get crashes trying to verify the array.

1.  First, look for a BIOS update for the RAID controller, from the manufacturer's website, of course.

2.  If installing the BIOS does not fix the problem, go over the RAID settings once again, as I am sure you did.

3.  If you are determined to keep this controller, move it and the drives to a different motherboard.  That might solve the problem right there; the IRQ line on the MB might be unstable.

4.  If that does not work, then suspect the controller card.  Of course, you know you will have to back up all the data before replacing it.  The best are Promise RAID controllers, or the Highpoint 370, both very reliable.

5.  Problem is, everyone is going now to SATA raid controllers, so if you are looking for a future solution, you are stuck with SATA, which means a whole new drive array, and this is expensive.
LVL 88

Expert Comment

ID: 16926898
I haven't had very good experiences with Highpoint RAID controllers. I find the Promise RAID cards are much more reliable, or even better are the 3ware ones, but Highpoint seems to justify its cheapness with low quality. If a firmware update as suggested by scrathcyboy doesn't help, get more reliable RAID cards.

Author Comment

ID: 16964595
Okay, I updated the RAID drivers in Windows; could not update the controller BIOS (long story); and updated the MB software package and BIOS.

Something interesting happened after I updated the RAID drivers. The verification failed, but instead of immediately rebooting like usual, the computer stayed on-line and gave me an error message that the second channel had failed. This is the channel I had been watching and suspected was a problem. This in hand, I moved the disk on that channel to the first channel as the slave. It has been stable since, and will verify.

However, I can't run both disks of the mirror off the 1st channel, it halves the performance. I think the card is out of warranty, Highpoint won't do warranty work on cards bought from resellers, and NewEgg doesn't sell the card anymore.

What kind of problems am I going to have moving the array onto a new controller? I'll basically have to make disk images and start from there, won't I? Is there any chance a new card, even a Highpoint card, will recognize my existing array? I don't think so... but... well, comments anyone?
LVL 44

Expert Comment

ID: 16966173
The only way another RAID controller will recognize the existing array is if it has the SAME chipset on the controller card, and the same version, in which case you just plug in the array and hope it works.  Usually this only works for mirror RAID 1 anyway.  I think it is safe to assume that if you want to move beyond the existing controller and its problems, you will have to wipe the array and start again.

But this is easier than you think.  Just install a good old IDE drive, copy all the data in the array to the IDE drive, and make sure the disk is bootable.  Remove the RAID from the system, boot from the CD, and make the IDE drive bootable by running fixboot C: from the Windows XP boot CD in recovery console.

Once you know the system can boot from this IDE drive, it does not matter what happens to the RAID: get a new controller, reinitialize it, and copy all the data back. But at least use RAID 1 or RAID 10 so that you have a mirror in the future; RAID 0 and RAID 5 are very prone to failures on removal of a drive.
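
Before wiping the array, it may be worth confirming that the copy on the IDE drive really matches the source. The following is a minimal sketch in Python, not part of the original advice: it hashes every file under a source folder and compares it with the same relative path in the copy. The function names and the demo folder names (`array`, `ide`) are hypothetical, and the demo runs against a temporary directory rather than real drives.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path, block_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in blocks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            h.update(block)
    return h.hexdigest()


def compare_trees(source_root, copy_root):
    """Return relative paths that are missing from the copy or differ in content."""
    source_root, copy_root = Path(source_root), Path(copy_root)
    problems = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = copy_root / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            problems.append(rel.as_posix())
    return problems


# Demo on throwaway data (hypothetical folder and file names).
with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp, "array"), Path(tmp, "ide")
    src.mkdir()
    dst.mkdir()
    (src / "good.db").write_bytes(b"same")
    (dst / "good.db").write_bytes(b"same")
    (src / "bad.db").write_bytes(b"original")
    (dst / "bad.db").write_bytes(b"corrupt!")
    (src / "missing.db").write_bytes(b"x")
    demo_problems = compare_trees(src, dst)

print(demo_problems)
```

On a real system you would point `compare_trees` at the array's drive letter and the IDE drive's drive letter; an empty list means every file on the source was found with identical content in the copy.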

Author Comment

ID: 16971350
Yeah, we're using RAID 10 currently. It's been very robust until this current problem.

A worse problem developed last night after I left - the stripe of the mirrors broke. Until now it's been one of the mirrors that breaks, which can be easily rebuilt with a spare drive. But the stripe broke somehow, reducing the problem to the same as a broken RAID 0 - I don't know of a way to recover this! From what I understand, RAID 0 breaks are unrecoverable in most situations.

I put in two spares, booted, and the controller didn't recognize the spares as useful - at least, there was no option to rebuild. I wouldn't really expect one, having it reduced to a RAID 0 situation anyway. It looks, at least intellectually, that a more robust system would be a mirror of two stripes, as opposed to a stripe of two mirrors...
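
As a rough check on that intuition, here is a small Python sketch that counts which two-disk failures each four-disk layout can survive. It is an idealized model that assumes the controller itself behaves correctly, which is exactly what is in doubt in this thread; the disk numbering and function names are made up for illustration.

```python
from itertools import combinations

DISKS = range(4)
PAIRS = [(0, 1), (2, 3)]  # how the four disks are grouped in both layouts


def raid10_survives(failed):
    """Stripe of mirrors: every mirror pair needs at least one working disk."""
    return all(any(d not in failed for d in pair) for pair in PAIRS)


def raid01_survives(failed):
    """Mirror of stripes: at least one stripe pair must be fully intact."""
    return any(all(d not in failed for d in pair) for pair in PAIRS)


def survivable_pairs(survives):
    """List every two-disk failure combination the layout survives."""
    return [set(p) for p in combinations(DISKS, 2) if survives(set(p))]


r10 = survivable_pairs(raid10_survives)
r01 = survivable_pairs(raid01_survives)
print(f"Stripe of mirrors (1+0) survives {len(r10)} of 6 two-disk failures")
print(f"Mirror of stripes (0+1) survives {len(r01)} of 6 two-disk failures")
```

Under this model the stripe of mirrors survives four of the six possible two-disk failures and the mirror of stripes only two, so switching layouts would not by itself be expected to help. A controller that loses track of the stripe configuration is a different failure from a disk dying.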

Well, good thing we back up every night. Too bad it's just the database, and not the OS and entire system, though.

Yeesh. Looks like I got some work to do. I'll get back here with my resolution for posterity and points.
LVL 88

Assisted Solution

rindi earned 690 total points
ID: 16971414
There is software you can use to recover a broken RAID 0, but a restore from a backup is usually the real way to go. I strongly recommend you change the RAID controller now.

raid reconstructor:


Author Comment

ID: 16972274

The array is a 1/0, which is a stripe of mirrors. When I say we're "reduced to a RAID 0 situation" I just mean that  the stripe between them broke. So we have two mirrors that are no longer striped, each mirroring only half the data.

Does anyone know of a tool to rebuild a RAID 10 array? The Raid Rebuilder from GetDataBack is only for RAID 0 and RAID 5. Highpoint was supposed to email me a tool, but I haven't seen it yet, and it was supposed to be here a few hours ago.
LVL 88

Expert Comment

ID: 16972630
Check the software out, it can rebuild a broken raid0 and therefore also a broken raid10.

Author Comment

ID: 16972886
I pulled a drive from each mirror and am running the Raid Rebuilder on them now to an image on an external HD. Let's see how this goes.
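
For anyone curious what rebuilding from one drive per mirror involves in principle, below is a minimal Python sketch of re-interleaving two RAID 0 members into a single image. It is not how Raid Rebuilder itself works internally. The chunk size, the member order, and the absence of any controller metadata at the start of each disk are all assumptions here; a real recovery tool has to detect those. The demo uses in-memory data rather than real disk images.

```python
import io


def destripe(member_a, member_b, out, chunk_size=64 * 1024):
    """Interleave fixed-size chunks from two RAID 0 members into one image.

    member_a is assumed to hold chunks 0, 2, 4, ... and member_b chunks 1, 3, 5, ...
    """
    while True:
        a = member_a.read(chunk_size)
        b = member_b.read(chunk_size)
        if not a and not b:
            break
        out.write(a)
        out.write(b)


# Demo: stripe 32 bytes across two "disks" with a 4-byte chunk, then rebuild.
CHUNK = 4
original = bytes(range(32))
disk_a = b"".join(original[i:i + CHUNK] for i in range(0, len(original), 2 * CHUNK))
disk_b = b"".join(original[i + CHUNK:i + 2 * CHUNK] for i in range(0, len(original), 2 * CHUNK))

rebuilt = io.BytesIO()
destripe(io.BytesIO(disk_a), io.BytesIO(disk_b), rebuilt, chunk_size=CHUNK)
print(rebuilt.getvalue() == original)
```

If the members are passed in the wrong order or with the wrong chunk size, the output is scrambled rather than flagged as an error, which is why the result should be written to a separate image (as done here to the external HD) and never back onto the source drives.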

Question has a verified solution.
