Compaq 1600r RAID5 - will not boot. Controller, drive or ??

When I try to boot the system, I get error message 1787 - Slot 3 Drive Array Operating in Interim Recovery Mode.  The following SCSI drive should be replaced:  SCSI Port 2:  SCSI ID0.

Then it says:
Scanning for SCSI drives.... No SCSI drives detected.
Press F1 to continue.  F10 for system partition utilities.

When I press F1, it takes me to the SmartStart 5.5 CD, which I have in the CD drive.  
If I choose Array Diagnostics, it says:  Logical Drive 1 status = interim recovery (volume functional, but not fault tolerant).  SCSI port 2, ID0 failed - REPLACE

If I go into the Array Configuration Utility, I get the message:  The Compaq Smart Array 3200 Controller in Slot 3 has a bad or missing drive attached.

There are 3 drives in the server, with 6 possible slots.  I have tried moving the drives to different slots with the same result.

Is it a bad controller card, a bad drive or RAID improperly configured?  

Thank you
wyatt3Asked:

DavidPresidentCommented:
According to the dump, you won't be happy: you lost one drive, and you have a non-fault-tolerant array, i.e. RAID 0.

Unless the controller went a bit brain dead, you have lost all your data and need to contract a data recovery specialist to try to revive it all.
noxchoProduct ManagerCommented:
It is a bad drive - the RAID tool says so clearly. And RAID5 must have a minimum of 3 drives. If you have three drives and one of them is bad, then you do not have a healthy RAID.
If you want the data from this array, use RAID Reconstructor from www.runtime.org
It will recreate the RAID from images of the drives.
andyalderCommented:
I see no dump (was it pasted from ADU and I'm just not seeing the link?), but according to the message it should be running fine, albeit without redundancy. F1 should boot with the array disabled; what happens if you press F2? It should boot if you have the disks in the right slots. Then again, the exact text of the F1 / F2 message is important.

You can ignore "Scanning for SCSI drives.... No SCSI drives detected", as that's the onboard SCSI, which isn't in use unless you've got a tape drive on it.

"interim recovery (volume) functional, but not fault tolerant" means exactly what it says: the volume and data are good, but one disk is missing from a RAID 5 array.

So if you can put the disks back where they started, and you haven't flushed the cache to the wrong disks by shuffling them, you can just boot up and carry on, then replace the failed disk as soon as you get another one.
sifueditionCommented:
Just to add to Andyalder's comments:

Some controllers are able to follow drive movements when you shuffle them as you have. You may be fine there. However, this is not a good idea unless you know what you are doing and the capability of your specific controller. I also read it to mean that this array should still be functional. What happens if you attempt to boot to the OS?

As long as booting the OS works, I would highly recommend an increased frequency of backups (start with one immediately and test the backup) until this is resolved. If you can boot and get good backups, this is likely just a drive failure which is easy to resolve on a raid 5.

If this is the situation you find yourself in, I would recommend getting two drives rather than one. You will need one to replace the failed drive and I would recommend a second drive which you should be able to configure as a hot spare for the array in one of the currently unused slots. The hot spare will be an idle drive that will allow the raid controller to automatically start a rebuild in the case of a future drive failure.

This is all assuming, of course, that this data is important enough to justify the cost.
wyatt3Author Commented:
dlethe and noxcho - Compaq Array Configuration Utility (SmartStart 5.5) clearly shows this as RAID5.  What you seem to be suggesting is that a failed RAID5 array morphs into a non fault tolerant array, which only makes sense to me if there is a controller failure....or something I am missing.

andyalder and sifuedition - the system does not boot into the o/s (server 2003).

More information - I tried moving the failed drive (it lights orange, as opposed to green on the other drives) to another slot.  In the Array Configuration Utility it still shows 'Port 2, ID0' as failed, which is the same 'Port 2, ID0' that it showed in a different slot.  Shouldn't this info change when it moves to another slot?

I do not have a back-up for this system :(

Possible solutions - Would one solution be to try a working drive (which I currently do not have) and see if by some magic it rebuilds itself?  Same idea for the 3200 controller card?

I have described this situation to one or two data recovery companies over the phone and they said they possibly could recover data from the drive but they could not clone or image that information to a new drive.  Is trying to do this type of thing possible?  I don't mind spending some money (couple hundred dollars), but at some point it would be economically unwise.

Another question - is all information (o/s, registry for example) striped across all drives?  Or is it just 'data'?

If you need more info, please let me know.  Thank you.
andyalderCommented:
I'd stick a non-RAID SCSI controller in a PC and, as suggested by noxcho, run RAID Reconstructor on the remaining disks. This involves imaging them both and letting the software un-stripe the data to another disk. You can't be talking about all that much data on a ProLiant 1600.

When they say they can't clone it to a new drive, they mean they're not going to copy the data back to a replacement set of disks for the 1600. It really isn't worth keeping any more; it's last century's model, and it lasted well.
sifueditionCommented:
As to the OS and data, yes, everything is striped. If it was not, then the disk containing that info would be a single point of failure and defeat the purpose of the raid.

As to moving the drive, that is most likely due to the controller expecting a working drive in that slot, not reporting the moved drive as the wrong location. It sounds like that drive is not being detected at all. A new drive could rebuild into this array just fine...but the data is still at risk. If a new drive would fix this issue, then you would still be able to boot now.

Disk 1  |  Disk 2  |  Disk 3
D1-1    |  D1-2    |  P1-1
D2-1    |  P2-1    |  D2-2
P3-1    |  D3-1    |  D3-2

Not sure if the formatting will work for the description, but I'll try this. If the "array" I typed shows up correctly, this is a simple description of RAID 5. D is for data, followed by a number for the stripe and then a number for the member of the stripe. P is for parity. In this, you can see the parity moves to a different disk in each stripe. This is the "distributed parity" in the RAID 5 description "striped with distributed parity". There are reasons for doing that, but they're unimportant to your situation.
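To make the rotation concrete, here is a minimal Python sketch that regenerates the small table above. The function name `raid5_layout` and the rotation rule (parity starts on the last disk and moves one disk to the left with each stripe) are assumptions chosen only to match that table; a real controller may rotate parity differently and will use much larger stripes.

```python
def raid5_layout(num_disks, num_stripes):
    """Return one row per stripe, labelling each disk's block as data (D) or parity (P)."""
    rows = []
    for stripe in range(num_stripes):
        # Assumed rotation: parity on the last disk for stripe 1, then one disk left each stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        member = 1
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"P{stripe + 1}-1")
            else:
                row.append(f"D{stripe + 1}-{member}")
                member += 1
        rows.append(row)
    return rows


for row in raid5_layout(3, 3):
    print("  ".join(row))
```

With three disks and three stripes this prints the same D/P pattern as the hand-typed table.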

The point is, if any one piece in any stripe fails, it can be rebuilt with a simple XOR. Because of that, if an entire disk is missing, you still have all the data needed to function and your system would still boot. Suppose Disk 1 is the one missing now. What may have happened is that you also have bad blocks on, for instance, Disk 3. If D2-2 was a bad block that had not been repaired before Disk 1 failed, that stripe was already non-redundant, and its data was lost when Disk 1 failed.
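As a rough illustration of that XOR rebuild, and of why a second bad block in the same stripe is fatal, here is a small Python sketch. The helper names `xor_blocks` and `rebuild_missing` and the four-byte block contents are made up for the example; this is not how the Smart Array firmware is implemented, only the arithmetic behind it.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            result[i] ^= value
    return bytes(result)


def rebuild_missing(stripe):
    """Return the stripe with its single unreadable block (None) rebuilt.

    Raises ValueError if more than one block is unreadable, because
    single XOR parity can recover only one lost block per stripe.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError("more than one block lost in this stripe")
    if not missing:
        return list(stripe)
    survivors = [block for block in stripe if block is not None]
    repaired = list(stripe)
    repaired[missing[0]] = xor_blocks(*survivors)
    return repaired


# Stripe 2 from the table: D2-1 on Disk 1, P2-1 on Disk 2, D2-2 on Disk 3.
d2_1 = b"\x12\x34\x56\x78"  # made-up data
d2_2 = b"\xab\xcd\xef\x01"  # made-up data
p2_1 = xor_blocks(d2_1, d2_2)  # parity is the XOR of the data blocks

# Disk 1 missing, Disk 3 healthy: D2-1 is recovered from P2-1 and D2-2.
print(rebuild_missing([None, p2_1, d2_2])[0] == d2_1)

# Disk 1 missing AND D2-2 unreadable: only parity is left, nothing to rebuild from.
try:
    rebuild_missing([None, p2_1, None])
except ValueError as err:
    print("unrecoverable:", err)
```

The first call gets D2-1 back from the two surviving blocks; the second raises, which is the "non-redundant stripe" case described above.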

One lost stripe would not likely keep a system from booting. Odds are, you have more than that wrong in the array at this time. That also means any stripes missing data, for whatever reason, cannot be rebuilt. Replacing the disk is unlikely to resolve the issue. I have seen a few isolated occasions of something like that working when, theoretically, it shouldn't. Whether or not you want to spend the money on a disk to try is up to you.

As andyalder stated, they can likely recover some of your data; it's just not going to be a "plug and play" solution where you install the drives they send you and you are back up. Likely, you would have to set up a healthy RAID array yourself and install Windows, then restore the data to the new setup. Also, the data they recover may only be partial.

Unfortunately, it sounds like you don't have a large budget to work with so you will have to decide just how important the data is. I suspect at this time, it will be a lot of work to get un-guaranteed results. The raid reconstructor mentioned before sounds like a nice way to try, but I have no experience with it myself.

wyatt3Author Commented:
Thank you to all who responded.  I am going to explore the possibilities with raid reconstructor.  Special thank you to sifuedition for your last comment, which I think does the best job of explaining the situation.  Also, thank you to noxcho for mentioning raid reconstructor and to andyalder for helping me visualize how to implement it.
wyatt3Author Commented:
I would grade A and all yes, but my problem remains open.