Solved

RAID 5 Container #1 Dead, any hope for Data recovery?

Posted on 2004-10-21
Last Modified: 2013-12-01
Dear Experts Exchange -
Our school has a Dell PowerEdge 2500 which has major problems after a weekend power outage. Students and Faculty need data from this server, can you help?

Dell PowerEdge 2500
5 disk drives, RAID 5
2 containers: #0 4GB with OS, Windows 2000 Server and #1 131GB named H:\ for file storage
Perc 3 - adaptec

Monday am
I arrived and the server was in an off state.
Turned on and received "no boot device available", with 4 of 5 disks status lights slowly going from green to amber.
Called Dell, container information was OK, found both containers.
After scrubbing both containers was able to install a parallel OS and access data on H drive.
Tried to set up a backup to tape drive.

Tuesday am
I arrived and server was in an on state with errors.
Three files were giving a corrupt or unreadable error message.
Since the backup didn't work, I started to copy data to other various locations on the network.
Received a message stating H:\ is not accessible the file or directory is corrupted and unreadable.
I thought I could let it continue to try to copy data off...
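
For anyone copying data off a volume in this state, it helps if the copy skips and logs each unreadable file instead of stopping at the first error. Below is a minimal Python sketch of that idea. It is not what was used on this server, and the function name `salvage_copy` and any paths passed to it are hypothetical.

```python
import os
import shutil


def salvage_copy(src_root, dst_root):
    """Copy every readable file under src_root into dst_root.

    Returns a list of (path, error message) pairs for anything that
    could not be listed or copied, so the failures can be reviewed later.
    """
    failed = []

    def on_walk_error(exc):
        # Called by os.walk when a directory itself cannot be listed.
        failed.append((exc.filename, str(exc)))

    for dirpath, _dirnames, filenames in os.walk(src_root, onerror=on_walk_error):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.join(dst_root, rel)
        for name in filenames:
            src = os.path.join(dirpath, name)
            try:
                os.makedirs(target_dir, exist_ok=True)
                # copy2 also tries to preserve timestamps.
                shutil.copy2(src, os.path.join(target_dir, name))
            except OSError as exc:
                # Record the failure and keep going with the next file.
                failed.append((src, str(exc)))
    return failed
```

Calling something like `salvage_copy("H:\\", r"\\otherserver\share\rescue")` (both paths are placeholders) would return the list of files that could not be read. Note that every read puts more load on a failing array, so this is only a sketch of the skip-and-log approach, not a recommendation to keep the copy running.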

Wednesday am
I arrived and the server could not be seen on the domain and was unresponsive, black screen.
Hard power off, back on and message " Array controller monitor failed ".
Several reboots, 50% of time will find array controller, 50% not.
When it does find the array controller, the drives that it sees vary: sometimes drives 0,2,3 show up, other times drives 0,1,4...
Status is container #0 critical, container #1 critical
Called Dell, reseated drives, accepted configuration changes for the array controller, Dell ordering/shipping parts to rebuild the scsi chain.
Decided to update Bios and flash ESM. Container #0 OK, container #1 critical.

Thursday am
Boot up, array controller monitor OK, container #0 OK, container #1 critical. Able to access OS, try to access H:\ and receive message "H:\ is not accessible the file or directory is corrupted and unreadable". Now, in the container configuration, container #1 status is DEAD.
Still awaiting a technician to replace parts ordered by Dell.

If hardware replacement doesn't help, is there any hope of recovering data?
I've called a few data recovery companies, any suggestions?

thank you very much.



0
Question by:kimzmn
    13 Comments
     

    Author Comment

    by:kimzmn
    Thursday mid-day update
    Dell requested I run DSET and send the file to them.
    On boot, container #0 ok and container #1 scrubbing
    Running Elite hard drive diagnostics, all 5 drives passed
    Now drive H:\ does not even show up in Windows Explorer
    Ctrl +A on bootup and in Manage Containers, container #0 status OK and all 5 drives show. container #1 status Dead and only 3 of 5 drives show as members.

    Yikes...help please
    0
     
    LVL 4

    Expert Comment

    by:tmenasco
    Do you have another server of the same model in the data center?

    If so, mark the order of these drives and put them in the other machine in the exact same orientation and see what happens. This will tell you whether it is the drives or the server.

    I utilize this as a troubleshooting routine regularly to help diagnose, but have not tried it on a Dell, just HP and Compaq. HP and Compaq store the array configuration data on the drives themselves and it can be copied to a floppy for backup purposes.

    Is the array controller a PCI card or onboard? If it is PCI, try another slot.

    Good Luck...
    0
     

    Author Comment

    by:kimzmn
    I do have another Dell PowerEdge 2500 across the street and I believe I am understanding your recommendation.
    Yes, the array information is supposed to be on the header of each drive as well as on the key card.
    I'll look into extracting the array configuration data to a floppy.
    There is no other available PCI slot that I can see.

    I will admit, I am really concerned about taking the drives over to another box and fear then having two servers unavailable.

    Anyone else have further ideas?   Should I allow Check Disk to run?

    Thank you

     - Kim
    0
     
    LVL 4

    Expert Comment

    by:tmenasco
    Trying the disks in another server will not damage the other server, but could help in troubleshooting to determine if it is the disks or the server that is at fault.

    You might also check and verify that the SCSI bus is correctly terminated or make sure a terminator has not been removed or loosened by accident.

    If the array controller is a PCI card, at least remove it, clean the contacts and replace it. You could maybe swap it with another PCI card if all of the slots are full. I would much prefer to have a NIC not working if the slot is the issue.

    Good Luck...
    0
     

    Author Comment

    by:kimzmn
    A Dell service technician arrived yesterday and replaced the motherboard, scsi cable, backplane, array key and memory.
    Both containers were found, scrubbed, and Ctrl+A container mgmt show both containers status OK and all 5 drives available.
    The containers show #0 as 4GB and #1 as 131GB.
    Booted into Windows, did not let Check Disk run, and trying to access H:\ gives the following message "H:\ is not accessible. The file or directory is corrupted and unreadable".   On viewing the properties of H:\ it reads 0 space available and 0 space used.

    Oh no....what's my next step? Sounds like I need to label the disks and bring the disks over to the other PowerEdge 2500....which of course is a production server and I'll have to wait until the weekend.
    0
     
    LVL 4

    Expert Comment

    by:tmenasco
    I would try to get my hands on a copy of Server Magic by PowerQuest.

    Since all of the server side hardware is new, it could be an issue with the zero sector of the partition. I have used Server Magic and Partition Magic before to repair this kind of thing.

    I just went to the PowerQuest.com site to find out that Symantec bought them last year and Server Magic for Windows is no longer listed. You might ask around and see if one of your IT buddies has a copy. It works wonders and I am not sure why it is no longer offered.

    Are there any diagnostic utilities in the array controller that could possibly identify and correct the error?

    Good Luck...
    0
     

    Author Comment

    by:kimzmn
    No luck in finding a copy of Server Magic for Windows.
    I ended up sending the drives to DriveSavers.
    Thank you for your help
    0
     
    LVL 4

    Accepted Solution

    by:
    What is that going to cost you?

    Is Dell footing the bill since it was their array controller that caused the problems?
    0
     

    Author Comment

    by:kimzmn
    Very Very Expensive.
    Dell will not foot the bill.
    0
     

    Author Comment

    by:kimzmn
    You won't believe this!!!

    DriveSavers is still working to recover data. I signed up for 2-3 business day service and it is now going on day 6!

    Plus, the server has failed again!

    I bought 5 brand new hard drives to begin rebuilding the server so it will be ready for the data.
    Dell walked me through step by step:
    initializing the drives
    creating a container
    waiting for them to scrub
    installing openmanage server
    installing windows 2000 OS
    and as I was waiting on hold for 80 minutes to find out what my next steps are...
    I reboot the server, and it does not come back up.
    "A disk read error occurred, press Ctrl+Alt+Del to restart"
    2 of the 5 drives are giving an amber status light
    I rebooted and did a Ctrl+A
    The container status is DEAD

    ARGHHHHHH....

    Dell will not ship me a new server.
    On top of the 5 parts they replaced a week ago, and the new drives, now they insist on sending
    9 more parts.
    Another backplane, another raid-key and this time power supply parts.

    Why can't they just send me a new box completely?
    0
     

    Author Comment

    by:kimzmn
    Dell replaced several other parts the next day; the power distribution, motherboard, backplane, cable assembly, raid key, scsi cable...
    I believe it was the replacement of the power distribution and cables which were key in fixing the server. I believe this because all the other parts were replaced before and the server failed a second time.
    With a rebuilt server I am now able to copy data back onto it.

    DriveSavers did take longer than the 2-3 days I signed up for, however, I can't complain too loud because they sent back what we believe to be greater than 90% valid usable data.
    0
     

    Author Comment

    by:kimzmn
    Thank you tmenasco for your dialog on this issue. I accepted your answer as a thank you for your participation even though I chose not to use your solutions.
    0
     
    LVL 4

    Expert Comment

    by:tmenasco
    Thanks. Sorry I couldn't solve your problem. But it looks like since it was hardware, Dell was the only one who could.

    You don't happen to have an old raised floor? If so, go to www.nwfusion.com and look up "zinc whiskers". I think the DocFinder number is 4461.

    Good luck...

    Tom...
    0
