
Intel 631xESB controller, RAID 1, checking SMART status of drives

Hi All-

I've got a low-end SATA server at a colocation facility that I manage for a non-profit I volunteer for. Physical access to the server is possible but a real pain, so I'd like to avoid it if I can.

It's a Windows 2003 machine with a built-in Intel 631xESB/632xESB RAID controller. Two of the drives are in a RAID 1 configuration for redundancy. Recently I've seen sporadic slowness, almost unresponsiveness. It lasts a minute or two and then things resume. I've tried disabling the public network interface and connecting to the machine only via the private LAN, and I still see the problem. I don't believe it has anything to do with system load.

The drives in the machine are Seagate 7200.11 750 GB models, and I know they have problems. I'm hoping to replace them, but until the budget is approved I want to see if I can figure out this problem. I've had these drives fail in workstations before, and in those cases SMART reporting showed that they were bad.

However, with this RAID 1 setup, I can't see the SMART status with tools like Active@ Hard Disk Monitor (http://www.disk-monitor.com/).

Does anyone know of a way to check if the drives are bad without physical access to the machine?

Second, suppose I find that a drive is bad, shut the system down, and replace that drive with one of the same size or bigger. Will the Intel Matrix Storage Manager automatically recognize the new drive and copy the data from the remaining disk to it, so that the RAID 1 is redundant again?

Thanks.

ASKER CERTIFIED SOLUTION

David


Berkson Wein

ASKER

The drives weren't $50, they were $49!!

These are the drives I've got to work with. It's a non-profit, and unfortunately there's not much of a budget for anything, let alone disks. The system pre-dates my time there.
I'm trying to find a way to work with what I've been given.
The Intel Matrix Storage Manager software does report on data consistency, and that's fine. It would be nice if it would alert me to a problem without my having to check manually, but again, I've got what I've got. The tool doesn't look at SMART.
I'm hoping to find someone who has experience with this specific controller and the software who can advise on this specific issue - and address my question on replacing drives one at a time and rebuilds.
Thanks again.

Berkson Wein

ASKER

Do you think this drive would make a substantial difference? It's refurbished, and I know the risk there, but it is under $60. It's a Barracuda Enterprise ES.2, ST3750330NS.
http://www.buy.com/prod/seagate-barracuda-enterprise-es-2-750gb-sata-300-7200rpm-32mb-hard/q/listingid/56348131/loc/101/212654841.html
I don't know enough about their enterprise line to know if this is one of the models that should be avoided. Thoughts appreciated.

David

Refurbished drive? Does that mean it is a Seagate-supplied refurbished drive, or did some dealer buy used ES.2 drives, dust them off, and spit-shine them with alcohol? The problem with buying used SATA disks is that the total number of bad block replacements is not obtainable programmatically, so you could end up with a lemon of a drive that has had thousands of bad blocks and has room for only one more.

Don't buy used SATA disks unless they have the specific part number from Seagate that indicates a factory-refurbished drive. When Seagate (or any manufacturer) refurbishes a drive, the media passes full tests and the grown defect count is near zero.

Berkson Wein

ASKER

These do have the ST3750330NS-R part number.

David

Then you are OK to buy them. They should work much better for you than what you have.

Berkson Wein

ASKER

I'll see if I can get budget approval.
In terms of replacing them: I'm OK with some server downtime. If I pull out an existing drive and replace it with one of the new ones, will the server rebuild automatically? Remember, this is RAID 1. If so, I'll let that happen, then swap the other one and let that rebuild.

SOLUTION

David


David

First, #31885378 does address running a data consistency check, which will check that the two disks match. However, the author is under the mistaken impression that the lockups and slowdowns are an indication that the drives are bad. The reason for these problems is that the disk drive is not qualified or designed to work behind the controller, and that will absolutely cause the behavior he is seeing. Intel has Windows-based Matrix management software, available on the download page for the motherboard, that can be run remotely, but that software does not have true diagnostics.

The consistency check (called a verify in the Matrix controller) is effectively the only test that can be done, but it is NOT designed to be a diagnostic. His controller is incapable of performing diagnostics directly on disk drives, so what the author desires is not possible. The controller also does not have an API that supports sending a full pass-through suite of commands to individual disks, so it is not possible for even a third-party software product to do what he asks.
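
One thing that can still be done remotely, without any access to the individual disks, is to record when the stalls happen so they can be matched against other activity on the server. The following is only a rough sketch in Python, not anything supplied by Intel or Seagate: it times a small forced-to-disk write on the array's volume and logs any write that takes longer than a threshold. The function names, file paths, interval, and threshold are all placeholders made up for illustration, and the result says nothing about drive health or SMART; it only shows how long the volume took to respond.

```python
import os
import time


def timed_write(path, size=4096):
    """Write `size` bytes to `path`, force them to disk, and return elapsed seconds."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(b"\0" * size)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the volume
    return time.perf_counter() - start


def is_stall(elapsed, threshold=1.0):
    """Return True when a write took at least `threshold` seconds."""
    return elapsed >= threshold


def monitor(probe_path, log_path, interval=5.0, threshold=1.0, samples=None):
    """Probe the volume repeatedly; append a log line for each slow write.

    `samples=None` means run until interrupted. All defaults are arbitrary
    placeholders, not recommended values.
    """
    count = 0
    while samples is None or count < samples:
        elapsed = timed_write(probe_path)
        if is_stall(elapsed, threshold):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            with open(log_path, "a") as log:
                log.write("{} write took {:.2f}s\n".format(stamp, elapsed))
        count += 1
        if samples is None or count < samples:
            time.sleep(interval)


if __name__ == "__main__":
    # Hypothetical paths on the RAID 1 volume; adjust for the real system.
    monitor(r"C:\probe.tmp", r"C:\stall_log.txt")
```

If the logged stalls line up with the periods of unresponsiveness, that at least confirms the storage path is involved, even though it cannot say which drive is responsible.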