RAID 5 array "Failed": it was there, now it is not

We had a power outage recently, and this server has a PowerVault 220S attached to it with two arrays on it. After the first reboot once the power came back, I had one of the arrays. After several more reboots the other array (the one that is now in a "failed" state) came back, following a screen that said "rebuilding.....". Today it is gone again.

So it was gone, then came back, and now is gone again. The Adaptec RAID controller says that it has a missing member. How should I go about handling this? The data on this array isn't crucial, but I would really like to get it back, even if just long enough to move it off, as it holds some backed-up data.

Any suggestions would be greatly appreciated.

BTW: Running Server 2003, a PowerVault 220S along with a Dell PowerEdge 2850, and 15 300 GB drives in total between the enclosure and the server itself.


David (President) Commented:
Runtime Software has a product called RAID Reconstructor. It is free to let it try to figure out the configuration, and if it succeeds, you pay to recover the data to a scratch disk. You will need a non-RAID adapter so that it has access to all blocks. The software runs under Windows.

Price is around $100.
macwalker1 (Author) Commented:
Thanks for the response, but I'm thinking this isn't going to be recoverable. It's interesting that after a couple of reboots the display showed a "rebuilding" message, and once that finished, the array was there. Then when the power went off again, completely, it was gone. If I go into the Adaptec controller by pressing CTRL+A on startup, it shows the array as DATA2, which it is: 6 drives of 273 GB each. But it says the array has failed and is missing a member. So it does show up, but that's the only place it shows up.

I ran the eval of the software you suggested, but I'm not sure I am using it correctly. I made the following selections: RAID Type: RAID-5, Number of drives: 6. Then I have the option of selecting drives, where I have two choices. See the attached pics and see if that helps. The "Analyze" button is inactive.

So I'm not sure what to do.  The drives are 300 GB drives so based on the "size" in blue there at the bottom of the capture, it appears that it is "seeing" that there is an array there, albeit not as large as it should be given there are 6 300 GB drives there.  Am I understanding that correctly?
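As a rough sanity check on the sizes mentioned here, the usable capacity of a RAID 5 set is normally the combined size of all members minus one member's worth of space, which is used for parity. The sketch below just does that arithmetic with the figures from the controller screen (6 members of 273 GB). The function name `raid5_usable_gb` is made up for illustration, and it assumes all members are the same size and ignores any space the controller reserves for its own metadata.

```python
def raid5_usable_gb(member_count, member_size_gb):
    """Usable capacity of a RAID 5 set: one member's worth of space holds parity."""
    if member_count < 3:
        raise ValueError("RAID 5 needs at least three members")
    return (member_count - 1) * member_size_gb


# Figures from the controller screen described above: 6 members of 273 GB each.
print(raid5_usable_gb(6, 273))  # 1365
```

If the size a tool reports is far from this figure, that may mean it is looking at a single disk or a different array rather than the six-member set.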

I suggest you get a UPS; these power outages are just making it worse. The Adaptec (I believe) does not have a built-in battery to record state as power is lost, so it really has no way to know exactly where it left off. The product should be able to rebuild it, but if power is lost, then the Runtime software has to be restarted. If you had a license, you could save the metadata. But don't buy a license yet; you should really just get a UPS. I would NOT power up the array any more without stable power. Each cycle, as you have seen, can and will make it worse, up to the point where you lose everything.



macwalker1 (Author) Commented:
I should have clarified that all of my servers are connected to UPS units; however, they can't stay up long enough for me to get back to the office when something like this happens. They may last 20 minutes, tops. They are not set up for an unattended shutdown, so by the time I realized they were down, it was too late.

The power has been stable since Sunday. The power on the "A" phase coming into the building was weak.  So we're good.  

When booting it says "Battery module present on the adapter" or something to that effect. Like I said, neither of the arrays was there at first; then I rebooted and one came back (the one I have now, 838 GB), and after another reboot or two the other came back. However, when it did, its total size had decreased, as one would expect given the "lost member". It was originally somewhere around 1.8 TB, I think; when it came back it was about 1.6 TB or thereabouts. But it's not there now.
The Runtime product won't allow me to analyze for some reason; if you'll notice, the button isn't activated in those screenshots.

Uninstall and try again?

Thanks in advance.

The RAID firmware will take the array offline to prevent further damage after several reboots, especially during rebuilds. The only way to get it back is with something like Runtime's software or, better yet, a professional recovery firm. The Runtime software is a good consumer product, but there is no substitute for in-house software built over the years, plus experienced humans who can eyeball the raw data on the physical disks and figure out the best way to recover.

Plus, you MUST use a JBOD SCSI adapter. If this controller is configured using the non-RAID firmware, then you are OK. But if you are using the controller with RAID enabled and told it to present JBOD disks, then it probably won't work right anyway. RAID controllers usually steal blocks at the beginning of each disk for metadata, and the recovery software needs to see that stuff. There is no way for me to tell from the screenshots whether that is the case.
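For background on why a RAID 5 set with one missing member can often still be read: each stripe stores a parity block that is the XOR of the data blocks in that stripe, so any single missing block can be recomputed from the survivors. The toy sketch below shows only that principle. It is not how RAID Reconstructor or the Adaptec firmware is implemented, and it ignores everything that makes real recovery hard (parity rotation, stripe size, disk order, and the controller metadata mentioned above). The helper `xor_blocks` and the sample data are invented for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across 6 members: 5 data blocks plus 1 parity block.
data = [bytes([i] * 4) for i in range(1, 6)]
parity = xor_blocks(data)

# Pretend the member holding data[2] is missing, then rebuild its block
# from the four surviving data blocks plus the parity block.
survivors = data[:2] + data[3:]
rebuilt = xor_blocks(survivors + [parity])
assert rebuilt == data[2]
```

This works for only one missing block per stripe, and only if the surviving blocks are consistent with each other, which is why an interrupted rebuild or repeated power loss can leave the set in a worse state.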
macwalker1 (Author) Commented:
Sorry it took so long to respond on this. Given that this set of drives and the data on them were not "critical", I'm not going to invest a lot in this.

I've since learned that, while installing "Reserved" signs in the parking lot for the owners of the building, the crew drove one of the sign posts into a 7200-volt feed coming into the building! This was the source of the power issue. Amazing someone wasn't killed!

At any rate, thanks for all the info. I'll close this topic and open another about best practice for setting up these servers and UPS units to do a proper unattended shutdown.
macwalker1 (Author) Commented:
This is neither resolved nor needed any longer. Thanks for all the help on this one!