  • Status: Solved

Adaptec 2100s SCSI RAID 1 won’t let me replace “failed” drive.

Last week I noticed an alarm in the server room and traced it to our SMTP server. I knew it was the SCSI controller, so I shut down and ran the SMOR utility (Ctrl-A during boot). In SMOR, one of the drives (ID 1) was showing “failed” and the other (ID 0) was showing “optimal”. “OK, no big deal, I’ll just swap out the drive,” I thought.

I came back at 2 AM to do the swap. The replacement is a Seagate ST336607LW, and it’s replacing an ST336607LC. The only difference between the drives is that the LC is 80-pin and the LW is 68-pin. They aren’t in a hot-swap rack or anything; the drives are internal, with 68 -> 80 pin converters on the cable.

When I swapped the drive and ran SMOR again, it showed 2 RAIDs. This is odd, since there was only 1 RAID on the system. It was 2:30 AM and I don’t recall exactly what was on the screen, but it reported that on the 1st RAID one of the drives was “missing information” and the other was “optimal”. The 2nd RAID showed the mirror image, with one drive as “failed” and the other as “missing information”. I attempted to run a repair on the array that was showing the “optimal” drive. No dice: the screen blinked and nothing happened, not even an error.

I ended up reinstalling the old drive and attempted a repair with it. This time SMOR showed only 1 RAID and I was able to run the repair. The really annoying alarm is silenced and the system is chugging along nicely, but I can only assume the array can’t be trusted, and I need to revisit this issue. Has anyone ever seen this before?
Asked:
abraham_roots
1 Solution
 
Duncan Meyers Commented:
I suspect that your new disc had been used in an array previously and retained the array configuration details (which are stored on disc for Adaptec controllers). When you powered the system back up after replacing the bad disc, the controller saw two on-line discs and read the config information on them. You're lucky that the Repair operation didn't corrupt both discs...

The reason the disc dropped out in the first place was probably a parity error on read or something similar, and there's a reasonable chance that it won't happen again. So, as long as the array is rebuilt and the state of the logical disc is Optimal, don't worry too much.

In the meantime, I'd suggest that you review the method for replacing the disc when the server is powered down:
http://adaptec-tic.adaptec.com/cgi-bin/adaptec_tic.cfg/php/enduser/std_adp.php?p_faqid=2200&p_created=987171528&p_sid=iFPkRb7i&p_lva=&p_sp=cF9zcmNoPTEmcF9zb3J0X2J5PSZwX2dyaWRzb3J0PSZwX3Jvd19jbnQ9OCZwX3Byb2RzPTEzMiwwJnBfY2F0cz0wJnBfcHY9MS4xMzI7Mi51MCZwX2N2PSZwX3NlYXJjaF90eXBlPWFuc3dlcnMuc2VhcmNoX25sJnBfcGFnZT0xJnBfc2VhcmNoX3RleHQ9cmVwbGFjZSBkaXNr&p_li=&p_topview=1
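
To illustrate why a previously used disc could make one mirror show up as two RAIDs, here is a minimal sketch. It is a toy model only, not the Adaptec firmware's actual logic: the `Disc` fields, the array labels "A" and "B", and the `arrays_seen` helper are all hypothetical names invented for this example. The assumption is simply that each disc carries its own array identifier and that the controller groups discs by it.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Disc:
    scsi_id: int
    array_id: Optional[str]  # hypothetical on-disc array label; None means no RAID metadata
    state: str               # state recorded in that disc's own metadata


def arrays_seen(discs: List[Disc], expected_members: int = 2) -> Dict[str, List[str]]:
    """Group discs by the array label each one carries (toy model)."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for disc in discs:
        if disc.array_id is not None:
            groups[disc.array_id].append(f"ID{disc.scsi_id}: {disc.state}")
    # Every array expects a full set of members, so pad with placeholders.
    for members in groups.values():
        members.extend(["missing"] * (expected_members - len(members)))
    return dict(groups)


# Surviving mirror member plus a replacement that still carries old metadata.
stale = arrays_seen([Disc(0, "A", "optimal"), Disc(1, "B", "failed")])

# Same surviving member plus a replacement with no RAID metadata at all.
clean = arrays_seen([Disc(0, "A", "optimal"), Disc(1, None, "")])
```

In this model, `stale` contains two arrays, each with one real member and one placeholder, which resembles the two-RAID screen described in the question. `clean` contains only the original array with an empty slot, which is the state a rebuild would expect.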
 
abraham_roots (Author) Commented:
I thought the drive might have been used in another array when I saw the error. I put the drive in a Windows box the other day and it came up uninitialized, so I initialized the disk and created a new partition, but I DID NOT format it. I get the same issue when I go back into SMOR. Was that not enough to blow away the RAID information? As for the failed attempt at a repair, I knew at the time it was a dicey move, and I’m surprised it didn’t trash the array. But I had a tarball of everything and figured I could rebuild the system if I had to (it would just be a pain and take hours, and what’s the point of having a RAID if I had to resort to that =). I'm thinking I should just order a new drive and give it another go. I'm just curious what I'm doing wrong at this point.
 
Duncan Meyers Commented:
According to the documentation, you should just need to select the replaced disc, then select Rebuild from the RAID menu (ALT-R). If that fails, try Initialize, if that option is available from the drop-down menu, but be careful: other Adaptec RAID controllers have a known bug that results in the entire array being overwritten 50% of the time if Initialize is used. If that fails, I'd suggest that formatting the replacement drive would be the way to go. If *that* doesn't work, you could try a low-level format on the replacement drive...
 
abraham_roots (Author) Commented:
Ok, thanks meyersd, good information. Pointing out that bug is reason enough to award the points =)