Solved

How do I diagnose a failed SCSI disk?

Posted on 2009-05-16
Last Modified: 2013-11-05
I'd appreciate some advice re diagnosing and repairing/replacing a failed SCSI disk in an out-of-warranty generic Intel server running Windows Server 2003.  The machine runs SQL Server 2000, and I use it only for data analysis.  There are three Seagate Cheetahs on one channel of the motherboard SCSI controller, the 34GB boot disk and two 68GB disks.  The SQL data files that I'm actively using are on a 600GB MegaRAID SCSI RAID 10 array, and others (e.g., raw and outdated data) are on a 1TB ReadyNAS Duo (via NFS share).  One of the 68GB Cheetahs contains the SQL log files, and the other contain(ed) temporary working databases.

In preparation for replacing a failing disk (one of those notorious 1TB Seagate Barracudas) in the ReadyNAS Duo, I decided to back up all of the databases on that device.  The next morning, I found that one of the 68GB Cheetahs, fortunately not the one containing the SQL log files, had disappeared from the server.  Looking in Disk Management, I saw that disk listed as missing/offline with status failed.

I tried reactivating the volume, but that did nothing.  Is there anything else that I can do to diagnose the disk before replacing it?

Based on what I've found on Experts Exchange, I gather that one possibility is a bad SCSI cable.  Given that the other two disks on that cable are OK, perhaps it's a loose connection.  Before replacing the disk, I'll try re-seating all of the connectors on that cable.  If that doesn't bring the disk back, I'll replace it, using the same ID for the new disk.

Does that sound reasonable?  What have I missed?
Question by:drjimcook
LVL 1

Accepted Solution

by:
wfaleiro earned 500 total points
ID: 24405042
Most probably the disk has gone bad. It is very rare for a server cable to go faulty or work loose unless you were working on the machine recently. The most obvious test is to borrow a working cable from another system (not from this array on the server) and try the current disk with that. If it still appears offline, you have your answer.

If you have an alternate disk, that is an even better way to test. Just connect it in place of the disk listed as offline and check whether it appears online or the RAID starts rebuilding.

--Walter

Author Comment

by:drjimcook
ID: 24408842
Thank you, Walter.  What puzzles me is that I saw no disk-failure-prediction warnings, and didn't see any bad sectors.  The disk was seemingly fine, and then it was gone.  OTOH, that machine did have major problems last summer, after a lightning spike that overwhelmed the UPS, and went back to the seller (sans disks) for a new motherboard.  Perhaps the circuit board on that drive was also damaged, but took another year to fail.  If that were the case, replacing the disk's board might bring it back, right?  Perhaps I'll play with it when I have some time.

While messing with this machine, I'm going to add a 176GB RAID 1 array for the database log files, and replace the boot disk with a new 73GB disk (using Symantec Ghost).  At least temporarily, I'll use the still-working 73GB disk to mirror the new boot disk.  I only have one remaining RAID channel.

Does that make sense?

Thanks again.  BTW, I'd like to award points after getting into the machine, which won't be until next weekend.
LVL 1

Expert Comment

by:wfaleiro
ID: 24409029
Well, as long as you have given some thought to your space needs in the months and years ahead, it's fine. I believe the RAID controller does not support RAID 6.

--Walter

Author Comment

by:drjimcook
ID: 24409309
That's a very good point, Walter.  Unfortunately, this is not a good time for me to buy a new machine.  Also, the motherboard, memory and one of the CPUs are just a year old, and the machine is otherwise adequate.  I'd like to make it work for another year or so.

I typically receive as much as 200GB raw data per project.  After initial analysis, I can usually move the raw data to the network or offline, and keep just aggregated data locally.  However, it does appear that I'll be getting more raw data soon than I have space for.  If that happens, I'm planning to buy a 3ware 9690SA-8E with four 1TB Western Digital RE3 SATA II disks as RAID 10 in an external enclosure.  That'll cost less than $2K, and I can use the disks in the next machine.

Reasonable?
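
As a rough check on the capacity arithmetic in the plan above, here is a minimal Python sketch. It assumes decimal units (1TB = 1000GB), ignores filesystem and formatting overhead, and treats every project as the 200GB worst case mentioned above. The function names are hypothetical, chosen only for illustration.

```python
def raid10_usable_gb(disk_count: int, disk_size_gb: int) -> int:
    """Usable capacity of a RAID 10 array: half the disks hold mirror copies."""
    if disk_count < 4 or disk_count % 2 != 0:
        raise ValueError("RAID 10 needs an even number of disks, at least four")
    return (disk_count // 2) * disk_size_gb


def projects_that_fit(usable_gb: int, per_project_gb: int) -> int:
    """Number of whole projects of raw data that fit in the usable space."""
    return usable_gb // per_project_gb


if __name__ == "__main__":
    # Four 1TB disks as RAID 10, as in the proposed external enclosure.
    usable = raid10_usable_gb(4, 1000)
    # Worst case of 200GB of raw data per project (assumption: all projects hit the maximum).
    count = projects_that_fit(usable, 200)
    print(f"Usable space: {usable} GB, room for about {count} projects of raw data")
```

In practice the formatted capacity will be somewhat lower than this figure, so treat the project count as an upper bound.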

Author Comment

by:drjimcook
ID: 24532621
Well, I finally made time to work on the machine.  And amazingly, the disk came back after I reseated the power and data connections.  I still replaced it, however.