wyatt3

Compaq 1600r RAID5 - will not boot. Controller, drive or ??

When I try to boot the system, I get error message 1787 - Slot 3 Drive Array Operating in Interim Recovery Mode.  The following SCSI drive should be replaced:  SCSI Port 2:  SCSI ID0.

Then it says:
Scanning for SCSI drives.... No SCSI drives detected.
Press F1 to continue.  F10 for system partition utilities.

When I press F1, it takes me to the SmartStart 5.5 CD, which I have in the CD drive.  
If I choose Array Diagnostics, it says:  Logical Drive 1 status = interim recovery (volume functional, but not fault tolerant).  SCSI port 2, ID0 failed - REPLACE

If I go into the Array Configuration Utility, I get the message:  The Compaq Smart Array 3200 Controller in Slot 3 has a bad or missing drive attached.

There are 3 drives in the server, with 6 possible slots.  I have tried moving the drives to different slots with the same result.

Is it a bad controller card, a bad drive or RAID improperly configured?  

Thank you
David

According to the dump, you won't be happy: you lost one drive, and you have a non-fault-tolerant array.

I.e. RAID 0. Unless the controller went a bit brain dead, you have lost all your data and need to contract a data recovery specialist to try to revive it all.
SOLUTION
noxcho

This solution is only available to members.
Member_2_231077

I see no dump (was it pasted from ADU? I don't see a link), but according to the message the array should be running fine, albeit without redundancy. F1 should boot with the array disabled; what happens if you press F2? It should boot if you have the disks in the right slots. Then again, the exact text of the F1 / F2 message is important.

You can ignore "Scanning for SCSI drives.... No SCSI drives detected" as that's the onboard SCSI which isn't in use unless you've got a tape drive on it.

"interim recovery (volume) functional, but not fault tolerant" means exactly what it says: the volume and data are good, but one disk is missing from a RAID 5 array.
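
To illustrate why a RAID 5 volume stays readable with one disk missing, here is a minimal Python sketch of XOR parity. It is not how the Smart Array 3200 firmware is actually implemented; the three-disk layout, the block contents, and the function names (`xor_blocks`, `make_stripe`, `read_degraded`) are all made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def make_stripe(data_blocks, parity_index):
    """Lay out one stripe: the data blocks plus one parity block
    placed at position parity_index (hypothetical layout)."""
    stripe = list(data_blocks)
    stripe.insert(parity_index, xor_blocks(data_blocks))
    return stripe


def read_degraded(stripe, failed_disk):
    """Recompute the block that lived on the failed disk by XOR-ing
    the blocks that survive on the other disks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != failed_disk]
    return xor_blocks(survivors)


# Three hypothetical disks, two stripes, parity rotating between disks.
stripes = [
    make_stripe([b"BOOT", b"DATA"], parity_index=2),
    make_stripe([b"REGI", b"STRY"], parity_index=1),
]

failed = 0  # pretend the disk in position 0 has failed
recovered = [read_degraded(s, failed) for s in stripes]
print(recovered)  # prints [b'BOOT', b'REGI']
```

Any single missing block in a stripe can be recomputed this way, which is why the array keeps running but has no redundancy left: with a second disk gone there is nothing left to XOR against.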

So if you can put the disks back where they started, and you haven't flushed the cache to the wrong disks by shuffling them, you can just boot up and replace the failed disk as soon as you get another one.

sifuedition

Just to add to andyalder's comments:

Some controllers are able to follow drive movements when you shuffle them as you have. You may be fine there. However, this is not a good idea unless you know what you are doing and the capability of your specific controller. I also read it to mean that this array should still be functional. What happens if you attempt to boot to the OS?

As long as booting the OS works, I would highly recommend an increased frequency of backups (start with one immediately and test the backup) until this is resolved. If you can boot and get good backups, this is likely just a drive failure, which is easy to resolve on a RAID 5.

If this is the situation you find yourself in, I would recommend getting two drives rather than one. You will need one to replace the failed drive, and a second that you should be able to configure as a hot spare for the array in one of the currently unused slots. The hot spare is an idle drive that allows the RAID controller to automatically start a rebuild in the case of a future drive failure.

This is all assuming, of course, that this data is important enough to justify the cost.
wyatt3

ASKER

dlethe and noxcho - Compaq Array Configuration Utility (SmartStart 5.5) clearly shows this as RAID5.  What you seem to be suggesting is that a failed RAID5 array morphs into a non-fault-tolerant array, which only makes sense to me if there is a controller failure... or something I am missing.

andyalder and sifuedition - the system does not boot into the OS (Windows Server 2003).

More information - I tried moving the failed drive (it lights orange, as opposed to green on the other drives) to another slot.  In the Array Configuration Utility it still shows 'Port 2, ID0' as failed, which is the same 'Port 2, ID0' that it showed in a different slot.  Shouldn't this info change when it moves to another slot?

I do not have a back-up for this system :(

Possible solutions - Would one solution be to try a working drive (which I currently do not have) and see if by some magic it rebuilds itself?  Same idea for the 3200 controller card?

I have described this situation to one or two data recovery companies over the phone and they said they possibly could recover data from the drive but they could not clone or image that information to a new drive.  Is trying to do this type of thing possible?  I don't mind spending some money (couple hundred dollars), but at some point it would be economically unwise.

Another question - is all information (o/s, registry for example) striped across all drives?  Or is it just 'data'?

If you need more info, please let me know.  Thank you.
wyatt3

ASKER

Thank you to all who responded.  I am going to explore the possibilities with RAID Reconstructor.  Special thank you to sifuedition for your last comment, which I think does the best job of explaining the situation.  Also, thank you to noxcho for mentioning RAID Reconstructor and to andyalder for helping me visualize how to implement it.
wyatt3

ASKER

I would grade A and all yes, but my problem remains open.