Dell 2850 not booting to OS after restart

Hello,

We have an old 2850 running hardware RAID 5. After a simple reboot today we're getting an error, which I've attached. It looks like one of the drives is out. Any ideas?

What should I do?
IMAG0859.jpg
IMAG0860uu.jpg
ravenrx7 asked:
 
Brian Harrington (IT Manager) commented:
Do you have a current backup?
 
joinaunion commented:
Try pressing F2 at startup, then go to the Integrated Devices screen options.
Is RAID enabled?
http://support.dell.com/support/edocs/systems/pe2850/en/ug/t1390c30.htm#wp1043338
 
ravenrx7 (Author) commented:
Yes, I have a backup (Acronis), but I'm not going that route. I think it's a problem with the RAID and one of our hard drives. joinaunion, let me check, one sec.
 
Brian Harrington (IT Manager) commented:
You may just need to rebuild the array. Unfortunately, the rebuild may screw up.
 
ravenrx7 (Author) commented:
Yes, RAID is enabled, joinaunion.
 
Brian Harrington (IT Manager) commented:
Are any of the disks showing a blinking orange light?
 
helpfinder (IT Consultant) commented:
What happens if you press any key (to continue)? If only one disk is broken, the other two (I'm assuming you have three, which is the minimum for RAID 5) should be able to start and handle the situation. You should also check the bad disk, then return it under warranty or buy a new one, and rebuild the RAID.
 
ravenrx7 (Author) commented:
On the hard drives' front panel? No, sir.
 
ravenrx7 (Author) commented:
If I hit any key it tries to boot from the NIC, where I have PXE set up. I went into the BIOS and the boot order has the HD first.
 
David (President) commented:
You lost one HDD from your RAID config. Here is your risk:
- If you have any unrecoverable read errors, then you will get partial data loss.
- A stressful rebuild will be required. The hardware is old enough that you have a real risk of 100% data loss if one of the remaining disks is already stressed.

So, reasonable practice, assuming you want the least risk and you don't have a decent backup:
1. Turn it off.
2. Get a disk large enough for a complete backup. To save time, get an internal eSATA drive and controller or USB3 (not USB2).
3. Get a replacement HDD.
4. Kick off a full backup; just tell it to continue booting. It will beep and complain. It MAY even just die.
5. After the backup completes and the system is booted to whatever O/S is supposed to be running, replace the failed disk, and it SHOULD automatically rebuild over the next day or so.

Better practice would be to image all the disks to scratch drives using a non-RAID controller, after running diagnostics to assess their health. This is non-trivial, and you'll have to spend money.

Best practice is to leave it off and take all the disks to a pro, assuming you want to spend $4000+ and the data is well worth the expense.

 
ravenrx7 (Author) commented:
I have 5 drives.
 
ravenrx7 (Author) commented:
I do have a full backup; we run Acronis Server backup. So do I proceed with a RAID repair?
 
Heritage02Rider commented:
Assuming you did just a reboot and did not remove or replace any drive, did you try the standard fix?

Shut down the server
Pull the power cords (all of them)
Eject drive 0
Reinsert drive 0
Eject drive 1
Reinsert drive 1
Continue with all the drives
Reconnect power
Boot up
 
ravenrx7 (Author) commented:
No, spending that kind of cash is not an option; rather stupid idea. Come on. I have a RAID 5 system and you're saying throw in the towel and take it somewhere? Come on.
 
joinaunion commented:
These are your options. If you have the ROMB key installed, select RAID; otherwise select SCSI.

Embedded RAID Controller
      

Selects between RAID Enabled, SCSI Enabled, or Off. The configurable options vary, depending on whether the optional ROMB key and memory are installed.

    With the ROMB key and memory module installed — Select either RAID Enabled or Off.
    Without the ROMB key and memory module installed — Select either SCSI Enabled or Off.

Channel A and Channel B operate independently. If the Channel A displays RAID Enabled, Channel B can be set to RAID Enabled, SCSI Enabled, or Off.
 
ravenrx7 (Author) commented:
Let me try that. I have not reconnected the drives; I haven't made any changes in probably 3 months.
 
Heritage02Rider commented:
Another option, if you get no love from reseating the drives, is to remove one drive and reboot. If you get a different error, like no RAID set, reinsert the drive and move on to the next. When you get a degraded state without an error, it should boot. Most likely the drive you removed is bad and/or inconsistent. Once booted in the degraded state, insert the drive and it should rebuild automatically. I have done this before with a drive that was at the beginning state of failure. Once I determined which drive was failing, I replaced it.
 
ravenrx7 (Author) commented:
Reseating the drives didn't work.

joinaunion, I have channel A set to RAID and channel B set to SCSI.
 
ravenrx7 (Author) commented:
OK Heritage, let me try that.
 
David (President) commented:
ravenrx, you misunderstand. I am talking about the case where you do NOT have a backup. Those drives are years past their designed lifespan of 3 years. They may not survive a rebuild.

So if you kick off a rebuild and have a hard crash, nothing in the world will be able to get it all back, and you will pay $4000+ for just part of the data.

Or you can spend $4000 and probably get 99.999% of your data.

Or pay $500 - $1000 and take image backups of the raw disks first, just in case.

If you do have a current backup, no worries. Tell it to continue and let it boot up in degraded mode. If it does NOT boot in degraded mode, you have a metadata failure or mismatch. A dead battery can cause a mismatch. Either way, the default will be to use the data on the disks, not the NVRAM, which is likely unprotected due to the dead battery.

Bottom line: tell it to go ahead and continue to boot. No need to do anything in the RAID menu. See if it boots. (But realize you have maybe a 25% risk of losing all you have on the hard drives due to the age of the equipment alone.)

If you have never taken and TESTED a full backup and recovery, then I would hire somebody to go onsite. Too many variables here, and you certainly have a multiple-failure scenario already.
 
ravenrx7 (Author) commented:
Trying the drive swapping and rebooting now.
 
ravenrx7 (Author) commented:
dlthe, yeah, it won't boot. I do get the mismatch error; I think it shows in one of the images I uploaded. Battery like the CMOS battery?
 
Brian Harrington (IT Manager) commented:
Battery like the one on the RAID controller.
 
David (President) commented:
Absolutely it is the battery. I have some old CERCs. The battery is typically good for 5-7 years. But you have to get the system online and fully operational with all disks before you change out the battery. I think it is that standard 2012 you can buy at any drug store. Can't remember.
 
ravenrx7 (Author) commented:
Can I replace that battery?
 
ravenrx7 (Author) commented:
Yeah, this server has been up and running since Jan 2005, LOL!
 
ravenrx7 (Author) commented:
Dang, I went through all the drives, swapping and rebooting, and none of the attempts booted to the OS.
 
Brian Harrington (IT Manager) commented:
You may be able to replace the battery. Call Dell support. They will take the call and sell you the battery. But you may still have to recover from backup.
 
David (President) commented:
That is why you never saw the problem until now. But really, take it from me, those disks are at high risk of dying. A parity rebuild is the most stress those drives have had since the system was installed, and you really do have a high risk of a second HDD failure during a rebuild.

I have first-hand experience with drives dying on this very system and controller. Rather annoying because, while the data was protected, I had to rebuild the O/S. <sigh>

It just isn't worth throwing more money at those drives or even the system. You're probably getting 30 whole MB/sec write speed too on that RAID5 with that controller.
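
As a rough illustration of why a rebuild on aged disks is risky: if each surviving disk has some chance of failing while the rebuild runs, the chance that at least one of them fails grows with the number of disks. The Python sketch below assumes the disks fail independently, and the per-disk percentages are made-up inputs for illustration, not measurements of these drives. The function name is hypothetical.

```python
def chance_of_second_failure(surviving_disks, per_disk_failure_chance):
    """Chance that at least one surviving disk fails before the rebuild ends.

    Assumes each disk fails independently with the same probability.
    """
    return 1 - (1 - per_disk_failure_chance) ** surviving_disks


# A 5-drive RAID 5 with one drive out leaves 4 survivors.
# The per-disk figures below are illustrative assumptions only.
for p in (0.01, 0.05, 0.10):
    risk = chance_of_second_failure(4, p)
    print(f"per-disk chance {p:.0%} -> array chance {risk:.1%}")
```

The point of the sketch is only the shape of the formula: a modest per-disk chance turns into a noticeably larger chance for the array as a whole, and in RAID 5 any second failure during the rebuild loses the array.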
 
ravenrx7 (Author) commented:
So I don't need to do any of those RAID options?
 
ravenrx7 (Author) commented:
Yeah, the sad thing is we have a new server coming in on the 24th.
 
David (President) commented:
The battery is a secondary issue and is not preventing the system from booting. You have at least 3 failures all working together to give you grief:
 - Metadata mismatch
 - No NVRAM, so no event log
 - Probable disk failure
 - Probable unreadable blocks on surviving disks
 - Near-certain XOR parity mismatches
 - Data corruption (highly probable: the system was already without NVRAM due to the dead battery, so you did lose at least a little bit of data)
 - I doubt you have ever done a consistency check or a scan/repair of bad blocks. Well, no way you have. So this means you have XOR parity issues and certainly unreadable blocks; it is almost a statistical impossibility not to have them.

You need to at least image each disk to scratch drives with a non-RAID controller, and try to reconstruct from those images using some commercial software from runtime.org. But that software will NOT deal well with read errors or a partial parity rebuild, and there is no way of knowing whether one happened because of the dead battery.

If the data isn't backed up, you need to pay somebody to at least try to do this for you, unless it all makes perfect sense and you have the hardware to do it. Maybe $1000 to reconstruct it all to a single non-RAID disk drive you can boot, with minimal damage (after they run hardware diagnostics to assess the risk).

No way could I talk you through this, as you don't have the software anyway.
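
For readers wondering what "image each disk to scratch drives" involves, here is a minimal Python sketch under stated assumptions. The function name `image_disk`, the block size, and the paths in the usage note are all hypothetical. It only copies raw bytes and zero-fills blocks that cannot be read; it does not reassemble RAID 5 stripes, and a dedicated imaging tool such as GNU ddrescue handles retries and bad areas far more carefully.

```python
import os


def image_disk(source, target, block_size=64 * 1024):
    """Copy source to target block by block, zero-filling unreadable blocks.

    Returns a list of (offset, length) ranges that could not be read.
    """
    bad_ranges = []
    with open(source, "rb", buffering=0) as src, open(target, "wb") as dst:
        # Seeking to the end reports the size of a file or a Linux block device.
        size = src.seek(0, os.SEEK_END)
        offset = 0
        while offset < size:
            length = min(block_size, size - offset)
            try:
                src.seek(offset)
                chunk = src.read(length)
            except OSError:
                # Unreadable block: record it and pad so later offsets stay aligned.
                bad_ranges.append((offset, length))
                chunk = b"\x00" * length
            if not chunk:
                break  # unexpected end of the source
            dst.write(chunk)
            offset += len(chunk)
    return bad_ranges
```

Assuming a disk from the array shows up as `/dev/sdb` on a Linux machine with a plain SCSI adapter (a hypothetical device name), `image_disk("/dev/sdb", "/mnt/scratch/disk0.img")` would write the image and return the byte ranges that failed to read. Those ranges are exactly the areas where reconstruction software would have to guess.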
 
Brian Harrington (IT Manager) commented:
It knew it was about to be replaced. That's how they roll. Try calling Dell tech support. You still have phone access with them.
 
David (President) commented:
support.dell.com; there is a phone number on the screen.
 
ravenrx7 (Author) commented:
Yeah, I have an old Dell 1800 server and I'm in the process of restoring my image to it just so I can get the users up and running. If that works, when the new server comes in I'll just restore to it. In the meantime I'll have this 2850 with 5 drives sitting unused. Can't I just reconstruct it, or do I just say screw it?
 
joinaunion commented:
You may want to give this a read:
http://support.dell.com/support/edocs/systems/pe2850/multlang/ROMB/F6586A00.pdf

What happens if you set both channels to SCSI?
 
noxcho (Global Support Coordinator) commented:
Rebuild the array and restore from backup. You will lose more time and nerves with these attempts to fix the RAID than you would lose restoring from backup.
 
andyalder commented:
0 physical drives detected, so it could be missing backplane power, a disconnected data cable, or one disk pulling the whole bus down.

I would power up with just one disk installed and see if it spins up and gets detected. It won't rebuild, of course, with 1 out of 5 disks present, but it will give you more info. Then, if it is one naughty disk, you can power on with 4 of the 5 present and rebuild from a normal degraded state. If not, at least you have a backup available.
 
ravenrx7 (Author) commented:
I don't know if this matters, but this morning I came in and booted up the server (the crashed server), and the first two bays have blinking amber lights.
 
ravenrx7 (Author) commented:
When I went to rebuild, it shows 5 failed drives. So I need to replace that battery, right?
 
Heritage02Rider commented:
Sounds like a double disk failure. Very rare indeed. A restore is probably your only option now. You will need to get two new disks.

If it booted this morning, perhaps the drives were cool at first and are now overheating. It really doesn't matter; they will fail again.
 
ravenrx7 (Author) commented:
Oh, it didn't boot; I just noticed the blinking LEDs, which weren't on yesterday. So it's possible that I replace those two disks and it still won't boot, right?
 
andyalder commented:
Does it still say "0 physical drives found on the host adapter" during POST?
 
Heritage02Rider commented:
Correct. RAID 5 can tolerate a single disk failure. One drive's worth of capacity is used for parity. Actually, the parity is spread across all the drives, but a two-drive failure means it won't be able to reconstruct the data.
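
The single-failure limit follows from how XOR parity works, which a small Python sketch can show. This is an illustration only: the four byte strings stand in for the data blocks of one stripe on a 5-drive array, and the helper name `xor_blocks` is hypothetical. A real controller also rotates which drive holds the parity from stripe to stripe, which is left out here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four made-up data blocks, one per data drive in a single stripe.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)  # held by the fifth drive for this stripe

# One drive lost: XOR the survivors with the parity to recover the missing block.
rebuilt = xor_blocks([data[0], data[1], data[3], parity])
print(rebuilt == data[2])  # True

# Two drives lost: the same XOR only yields the two missing blocks mixed together.
mixed = xor_blocks([data[0], data[1], parity])
print(mixed == xor_blocks([data[2], data[3]]))  # True
```

With one block missing, the parity equation has a single unknown and can be solved. With two missing, all that remains is the XOR of the two unknown blocks, which is not enough to separate them.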
 
noxcho (Global Support Coordinator) commented:
1. Replace the battery.
2. Replace the drives.
3. If that does not help, rebuild.
 
ravenrx7 (Author) commented:
OK, I understand now. Just a question: someone was talking about RAID or SCSI, and I took a picture of this. Is there a way to move this over to SCSI? Would that work?
IMAG0863.jpg
 
andyalder commented:
All drives are still seen as failed/offline, I see. Do you have a spare disk to test with?
 
ravenrx7 (Author) commented:
I don't... ugh. I'll order a few disks, replace the battery (which is on the RAID card, right?), rebuild the array and then restore the backup. Man, what a pain!
 
andyalder commented:
Since none of the disks are seen, the data on them may be intact; what you are seeing is the same as if the SCSI cable had come off. I'd still power it on with just one disk in and see if the controller sees it (obviously, don't accept any prompt to change settings with just one disk installed; it's just a test).
 
joinaunion commented:
If the ROMB battery has indeed failed, then no, you can't use SCSI. It's dependent on the battery as well.
You said there were flashing lights. What color were they?
Here is a list of what the lights mean for troubleshooting:
http://support.dell.com/support/edocs/systems/pe2850/en/it/t1393c20.htm#wp1039173

Instructions for running the system diagnostics are here:
http://support.dell.com/support/edocs/systems/pe2850/en/it/t1393c40.htm#wp1033246
 
joinaunion commented:
How are things going?
 
joinaunion commented:
You still with us?
Question has a verified solution.
