Solved

Dell 2850 not booting to OS after restart

Posted on 2012-09-11
Last Modified: 2016-11-23
Hello,

We have an old 2850 running hardware RAID 5. After a simple reboot today we're getting an error, which I've attached. It looks like one of the drives is out? Any ideas?

What should I do?
IMAG0859.jpg
IMAG0860uu.jpg
Question by:ravenrx7
52 Comments
 
LVL 9

Expert Comment

by:bharrington83
ID: 38388177
Do you have a current backup?
0
 
LVL 16

Expert Comment

by:joinaunion
ID: 38388182
Try pressing F2 at startup, then go to the Integrated Devices screen options.
Is RAID enabled?
http://support.dell.com/support/edocs/systems/pe2850/en/ug/t1390c30.htm#wp1043338
0
 

Author Comment

by:ravenrx7
ID: 38388186
Yes, I have a backup (Acronis), but I'm not going that route. I think it's a problem with the RAID and one of our hard drives. joinaunion, let me check, one sec.
0
 
LVL 9

Expert Comment

by:bharrington83
ID: 38388190
You may just need to rebuild the array. Unfortunately, the rebuild may go wrong.
0
 

Author Comment

by:ravenrx7
ID: 38388197
Yes, RAID is enabled, joinaunion.
0
 
LVL 9

Expert Comment

by:bharrington83
ID: 38388204
Are any of the disks showing a blinking orange light?
0
 
LVL 19

Expert Comment

by:helpfinder
ID: 38388208
What if you press any key (to continue)? If only one disk is broken, the other 2 (I am just assuming you have three, which is the minimum for RAID 5) should be able to start and handle this situation. You should also identify the bad disk, return it under warranty or buy a new one, and then rebuild the RAID.
0
 

Author Comment

by:ravenrx7
ID: 38388210
On the hard drives' front panel? No sir.
0
 

Author Comment

by:ravenrx7
ID: 38388216
If I hit any key, it tries to boot from the NIC, where I have PXE set up. I went into the BIOS and the boot order has the HD first.
0
 
LVL 47

Accepted Solution

by:
David earned 2000 total points
ID: 38388218
You lost 1 HDD from your RAID config.  Here are your risks:
- If you have any unrecoverable read errors, then you will get partial data loss.
- A stressful rebuild will be required.  The hardware is old enough that you have a real risk of 100% data loss in the event one of the remaining disks is under stress.

So reasonable practice, assuming you want the least risk and you don't have a decent backup:
1. Turn it off.
2. Get a disk large enough for a complete backup.  To save time, get an internal eSATA drive and controller or USB3 (not USB2).
3. Get a replacement HDD.
4. Kick off a full backup; just tell it to continue booting. It will beep and complain.  It MAY even just die.
5. After the backup completes, and the system is booted to whatever O/S is supposed to be running, replace the failed disk, and it SHOULD automatically rebuild over the next day or so.

Better practice would be to image all the disks using a non-RAID controller to scratch drives, after running diagnostics to assess the health.  This is non-trivial, and you'll have to spend money.  

Best practice is to leave it off, and take all the disks to a pro, assuming you want to spend $4000+ and the data is well worth the expense.
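
To put a rough number on the unrecoverable-read-error risk mentioned above: a RAID 5 rebuild has to read every surviving disk end to end. The Python sketch below is a back-of-the-envelope estimate only and is not from the thread. The function name, the 146 GB disk size, and the 1e-14 per-bit error rate are all assumptions for illustration; the real rate should come from the drive's datasheet. It also ignores age-related mechanical failure, which is likely the bigger worry on drives this old.

```python
import math

def p_rebuild_hits_ure(surviving_disks, disk_bytes, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error (URE) while
    reading every surviving disk in full, as a rebuild must do.

    ure_per_bit is an ASSUMED spec-sheet style figure, not a measured one.
    """
    bits_read = surviving_disks * disk_bytes * 8
    # 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

# Hypothetical example: 4 surviving disks of 146 GB each (sizes assumed)
print(f"{p_rebuild_hits_ure(4, 146e9):.1%}")
```

The estimate grows with both disk count and disk size, which is why a full backup before the rebuild is the cheaper insurance.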
0
 

Author Comment

by:ravenrx7
ID: 38388222
I have 5 drives.
0
 

Author Comment

by:ravenrx7
ID: 38388233
I do have a full backup; we run Acronis Server backup. So do I proceed with a RAID repair?
0
 
LVL 3

Expert Comment

by:Heritage02Rider
ID: 38388235
Assuming you just did a reboot and did not remove and replace any drive, did you try the standard fix?

Shut down the server
pull the power cords (all of them)
eject drive 0
replace drive 0
eject drive 1
replace drive 1
continue with all the drives
reconnect power
boot up
0
 

Author Comment

by:ravenrx7
ID: 38388238
No, spending that cash on that is not an option; rather stupid idea. Come on. I have a RAID 5 system and you're saying throw in the towel and take it somewhere? Come on.
0
 
LVL 16

Expert Comment

by:joinaunion
ID: 38388239
These are your options. If you have the ROMB key installed, select RAID; otherwise select SCSI.

Embedded RAID Controller
      

Selects between RAID Enabled, SCSI Enabled, or Off. The configurable options vary, depending on whether the optional ROMB key and memory are installed.

    With the ROMB key and memory module installed — Select either RAID Enabled or Off.
    Without the ROMB key and memory module installed — Select either SCSI Enabled or Off.

Channel A and Channel B operate independently. If the Channel A displays RAID Enabled, Channel B can be set to RAID Enabled, SCSI Enabled, or Off.
0
 

Author Comment

by:ravenrx7
ID: 38388245
Let me try that. I have not reconnected the drives; I haven't made any changes in probably 3 months.
0
 
LVL 3

Expert Comment

by:Heritage02Rider
ID: 38388264
Another option, if you get no love from reseating the drives, is to remove one drive and reboot. If you get a different error, like no RAID set, reinsert the drive and move on to the next. When you get a degraded state without an error, it should boot. Most likely that drive is bad and/or inconsistent. Once booted in the degraded state, insert the drive and it should rebuild automatically. I have done this before with a drive that was in the early stages of failure. Once I determined which drive was failing, I replaced it.
0
 

Author Comment

by:ravenrx7
ID: 38388270
The reseats didn't work.

joinaunion, so I have channel A set to RAID and channel B set to SCSI.
0
 

Author Comment

by:ravenrx7
ID: 38388278
OK Heritage, let me try that.
0
 
LVL 47

Expert Comment

by:David
ID: 38388295
ravenrx - you misunderstand.   I am saying that IF you do not have a backup, then those drives are years out of their designed lifespan of 3 years.  They may not survive a rebuild.  

So if you kick off a rebuild, and have a hard crash, nothing in the world will be able to get it all back, and you will pay $4000+ for just part of the data.

Or you can spend $4000 and probably get 99.999% of your data.

Or pay $500 - $1000 and take image backups of the raw disks first, just in case.

If you do have a current backup, no worries. Tell it to continue and let it boot up in degraded mode. If it does NOT boot in degraded mode, you have a metadata failure or mismatch.  A dead battery can cause a mismatch.  Either way, the default will be to use the data on the disks, not the NVRAM, which is likely unprotected due to the dead battery.

Bottom line, tell it to go ahead and continue to boot. No need to do anything in the RAID menu. See if it boots .. (But realize you have maybe 25% risk of losing all you have on the hard drives due to the age of the equipment alone).

If you have never taken and TESTED a full backup and recovery, then I would hire somebody to go onsite.  Too many variables here and you certainly have a multiple failure scenario already.
0
 

Author Comment

by:ravenrx7
ID: 38388308
Trying the drive swapping and rebooting now.
0
 

Author Comment

by:ravenrx7
ID: 38388317
dlthe, yeah, it won't boot. I do get the mismatch error; I think it shows in one of the images I uploaded. Battery like the CMOS one?
0
 
LVL 9

Expert Comment

by:bharrington83
ID: 38388334
Battery like the one on the RAID controller.
0
 
LVL 47

Expert Comment

by:David
ID: 38388339
Absolutely it is the battery.  I have some old CERCs.  Battery good for 5-7 years typically.   But you have to get the system online and fully operational with all disks before you change out battery.    I think it is that standard 2012 you can buy at any drug store.  Can't remember.
0
 

Author Comment

by:ravenrx7
ID: 38388340
Can I replace that battery?
0
 

Author Comment

by:ravenrx7
ID: 38388343
Yeah, this server has been up and running since Jan 2005, LOL!
0
 

Author Comment

by:ravenrx7
ID: 38388356
Dang, I went through all the drives, swapped and rebooted, and none of them went ahead and booted to the OS.
0
 
LVL 9

Expert Comment

by:bharrington83
ID: 38388364
You may be able to replace the battery. Call Dell support.  They will take the call and sell you the battery.  But you still may have to recover from backup.
0
 
LVL 47

Expert Comment

by:David
ID: 38388368
That is why you never saw the problem until now.  But really, take it from me, those disks are at high risk of dying.  A parity rebuild is the most stress those drives have had since the system was installed, and you really do have a high risk of a 2nd HDD failure during a rebuild.

I have first-hand experience with drives dying on this very system and controller.  Rather annoying because while the data was protected, had to rebuild the O/S.  <sigh>

It just isn't worth throwing more money at those drives or even system.  You're probably getting 30 whole MB/sec write speed too on that RAID5 with that controller.
0
 

Author Comment

by:ravenrx7
ID: 38388402
So I don't need to do any of those RAID options?
0
 

Author Comment

by:ravenrx7
ID: 38388411
Yeah, sad thing is we have a new server coming in on the 24th.
0
 
LVL 47

Expert Comment

by:David
ID: 38388424
The battery is a secondary issue and is not preventing the system from booting.   You have at least 3 failures all working together to give you grief
 - metadata mismatch
 - No NVRAM, so no event log
 - probable disk failure
 - probable unreadable blocks on surviving disks
 - near certain XOR parity mismatches
 - Data corruption (highly probable, as the system was already w/o NVRAM due to the dead battery, so you did lose at least a little bit of data)
 - I doubt you have ever done a consistency check or scan/repair bad blocks - Well, no way you have.. So this means you have XOR parity issues and certainly unreadable blocks. it is almost a statistical impossibility not to have them.
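
For readers wondering what an XOR parity mismatch looks like in practice: in RAID 5, the data blocks and the parity block of each stripe XOR to all zeros, so a consistency check amounts to testing that property stripe by stripe. The Python sketch below illustrates the idea on in-memory byte strings. It is a simplified assumption-laden toy, not the controller's actual algorithm: the function names and the fixed stripe size are hypothetical, and a real check would also have to deal with the controller's on-disk metadata and layout.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def inconsistent_stripes(disks, stripe_size):
    """Return the indices of stripes whose blocks do not XOR to zero.

    `disks` is a list of bytes objects, one per member disk (data and
    parity alike, since parity position does not matter for this test).
    """
    length = min(len(d) for d in disks)
    bad = []
    for index, offset in enumerate(range(0, length, stripe_size)):
        blocks = [d[offset:offset + stripe_size] for d in disks]
        if any(xor_blocks(blocks)):  # any nonzero byte means a mismatch
            bad.append(index)
    return bad
```

A mismatch only says that a stripe is inconsistent; it cannot say which disk holds the wrong block, which is part of why reconstruction after a dirty shutdown is hard.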

You need to at least image each disk to scratch drives with a non-RAID controller, and try to reconstruct with that using some commercial software from runtime.org, but that software will NOT deal well with read errors, or a partial parity rebuild - no way of knowing due to the dead battery.

If the data isn't backed up, you need to pay somebody to at least try to do this for you, unless it all makes perfect sense and you have the hardware to do it.  Maybe $1000 to reconstruct it all to a single non-raid disk drive you can boot, with minimal damage.   (After they run hardware diagnostics to assess the risk).

No way could I talk you through this, as you don't have the software anyway.
0
 
LVL 9

Expert Comment

by:bharrington83
ID: 38388430
It knew it was about to be replaced.  That's how they roll.  Try calling Dell tech support.  You still have phone access with them.
0
 
LVL 47

Expert Comment

by:David
ID: 38388433
support.dell.com, there is a phone number on the screen.
0
 

Author Comment

by:ravenrx7
ID: 38388471
Yeah, I have an old Dell 1800 server, and I'm in the process of restoring my image to it just so I can get the users up and running. If that works, when the new server comes in I'll just restore to it. In the meantime I'll have this 2850 with 5 drives sitting unused. Can't I just reconstruct it, or do I just say screw it?
0
 
LVL 16

Expert Comment

by:joinaunion
ID: 38388871
You may want to give this a read,
http://support.dell.com/support/edocs/systems/pe2850/multlang/ROMB/F6586A00.pdf

What happens if you set both channels to SCSI?
0
 
LVL 47

Expert Comment

by:noxcho
ID: 38389943
Rebuild the array and restore from backup. You lose more time and nerves with these attempts to fix the RAID than you would lose restoring from backup.
0
 
LVL 56

Expert Comment

by:andyalder
ID: 38390322
0 physical drives detected, so it could be backplane power missing, data cable disconnected or one disk pulling the whole bus down.

I would power up with just one disk installed and see if it spins up and gets detected. It won't rebuild of course with 1 out of 5 disks present but it'll give you more info, then if it is one naughty disk you can power on with 4/5 of them present and rebuild from a normal degraded state. If not at least you have a backup available.
0
 

Author Comment

by:ravenrx7
ID: 38390637
I don't know if this matters, but this morning I came in and booted up the server (the crashed server), and the first two bays have amber lights blinking.
0
 

Author Comment

by:ravenrx7
ID: 38390711
When I went to rebuild, it showed 5 failed drives. So I need to replace that battery, right?
0
 
LVL 3

Expert Comment

by:Heritage02Rider
ID: 38390727
Sounds like a double disk failure. Very rare indeed. A restore is probably your only option now. You will need to get two new disks.

If it booted this morning, perhaps the drives were cool and are overheating. Really doesn't matter, they will fail again.
0
 

Author Comment

by:ravenrx7
ID: 38390740
Oh, it didn't boot; I just noticed the blinking LEDs, which weren't on yesterday. So it's possible that I replace those two disks and it still won't boot, right?
0
 
LVL 56

Expert Comment

by:andyalder
ID: 38390781
Does it still say "0 physical drives found on the host adapter" during POST?
0
 
LVL 3

Expert Comment

by:Heritage02Rider
ID: 38390784
Correct. RAID 5 can tolerate a single disk failure. One drive's worth of capacity is used for parity. Actually, the parity is spread across the drives, but a two-drive failure means it won't be able to reconstruct the data.
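
As an illustration of why one lost drive is survivable and two are not: the parity block is the XOR of the data blocks in a stripe, so XOR-ing the surviving blocks with the parity gives back the one missing block. The Python sketch below shows this on a single toy stripe. The block contents and names are made up for the example, and real controllers rotate parity across disks and work on much larger blocks.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a 5-disk RAID 5: four data blocks plus one parity block
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

# Simulate losing the disk that held data[2]
survivors = data[:2] + data[3:] + [parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[2])  # prints True
```

With two blocks missing from the same stripe there is only one parity equation for two unknowns, so the original contents cannot be recovered from the remaining disks alone.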
0
 
LVL 47

Expert Comment

by:noxcho
ID: 38390790
1. Replace the battery
2. Replace the drives
3. If that does not help, rebuild.
0
 

Author Comment

by:ravenrx7
ID: 38390856
OK, I understand now. Just a question: someone was talking about RAID or SCSI, and I took a picture of this. Is there a way to move this over to SCSI? Would that work?
IMAG0863.jpg
0
 
LVL 56

Expert Comment

by:andyalder
ID: 38390931
Still all drives seen as failed/offline I see, do you have a spare disk to test with?
0
 

Author Comment

by:ravenrx7
ID: 38390945
I don't. Ugh. I'll order a few disks, replace the battery (which is on the RAID card, right?), rebuild the array, and then restore the backup. Man, what a pain!
0
 
LVL 56

Expert Comment

by:andyalder
ID: 38390966
Since none of the disks are seen, the data on them may be intact; what you are seeing is the same as if the SCSI cable had come off. I'd still power it on with just one disk in and see if it sees it (obviously don't accept any prompt to change settings with just one disk installed; it's just a test).
0
 
LVL 16

Expert Comment

by:joinaunion
ID: 38391956
If the ROMB battery has indeed failed, then no, you can't use SCSI. It's dependent on the battery as well.
You said there were flashing lights. What color were they?
Here is a list of what the lights mean for troubleshooting.
http://support.dell.com/support/edocs/systems/pe2850/en/it/t1393c20.htm#wp1039173

Run system diagnostics instructions here.
http://support.dell.com/support/edocs/systems/pe2850/en/it/t1393c40.htm#wp1033246
0
 
LVL 16

Expert Comment

by:joinaunion
ID: 38403767
How are things going?
0
 
LVL 16

Expert Comment

by:joinaunion
ID: 38425230
You still with us?
0

Question has a verified solution.