HP Smart Array 532 Controller stuck in interim recovery mode

We are running an older HP ProLiant ML370 server, and for some time now one drive in our 4-drive RAID 5 has been showing as bad.  When this first started, we could reboot the server, and it would recover the RAID and then be fine for a few months.  It finally got to the point where no matter how many times we rebooted, it would not come back up, and now the drive is always amber.  We got a replacement drive from our HP supply reseller, but it made no difference.  We thought we had gotten a bad drive, but the replacement's replacement did the same thing.

If we go into the RAID configuration utility in the BIOS, it does not see that there is a drive present in that slot and has the RAID in interim recovery mode.  The HP Array Configuration Utility also lists it in interim recovery mode and shows "???" instead of the drive's size in GB.  The drive's status is failed.

We are getting to the point of wondering if we have either a bad port or a bad RAID controller.  The server is old enough that we don't really want to put any money into it, but we can't phase it out just yet.  Please let us know if we are looking at a physical problem, or if we are missing that magical "rebuild array" button buried deep somewhere.
wwakefield Commented:
The rebuild is definitely automatic.

Sounds like the cage may be going bad.  Do you have any other drive cages you can put in to test?

Since it is older, and I assume not a core piece of equipment, have you considered rebuilding the server or upgrading the HP management tools?  Perhaps a newer SmartStart version will give more information.
andyalder Commented:
Ditto the above; it is unlikely to be the controller, since a controller fault would affect more than one drive on parallel SCSI. If you haven't got a spare server to try it out in, do you have a spare disk slot in the current one?
genequip (Author) Commented:
The server is important enough we need to keep it working, but not important enough to go through the effort of rebuilding it.  We need to keep it limping along for a little while yet till we can phase it out with new equipment.

On your suggestion we will try out the SmartStart tools; we had not actually tried those yet.  (Perhaps we should have.  :) )

We have an identical ProLiant ML370 server currently not in production, so we have some spare parts to work with.

In response to andyalder, we could have a spare disk slot if needed.  Would there be a way to rebuild the array using a different slot?
wwakefield Commented:
Swap the cage out.

Note: I know it is physically easy, but I am unsure what you have to do logically.
andyalder Commented:
I was thinking more of just putting the disk in the spare slot to see if it was properly recognised. Annoyingly, you can't use it in the spare slot as part of the array, because I don't think it will let you make it a hot spare while the array is degraded.

Swapping the cage as suggested above would be better. The procedure is in the maintenance and service guide, which you can get from HP's support and download area. I can't give you a link because I don't know which generation ML370 you have.
genequip (Author) Commented:
We were unsure where to go on our problem, and these guys gave us some good ideas to try going forward on our issue.  
