Craig Beamson

asked on

Dell PowerEdge 830 boot problem - how to troubleshoot

My wife's small business has a Dell PowerEdge 830 that has stopped booting normally. It is well out of warranty with Dell and needs troubleshooting quickly and inexpensively!

The first two attached screenshots are snapshots of the BIOS screens, taken while the server was accessed through its remote access card (DRAC).

The BIOS proceeds to the point of offering the options "Strike the F1 key to continue, F2 to run the setup utility".  (Pressing F1 just changes the message to "Strike the F1 key to retry...")

The server has 3 SATA drives and, at this stage, I do not believe there is a RAID controller in use.  The server was running Windows Server 2003.

The setup utility (F2) shows nothing obviously wrong.  The 3 SATA drives are shown (see screenshot 3) and the boot order starts with "drive C".

I am contemplating borrowing a PC with a SATA controller and moving the server drives into it as a short-term fix, but I am not wholly confident this will work.

What steps can I take to identify the problem (or how can I safely get a short-term fix without risking data loss)?
2009-04-20-serverprob001.jpg
2009-04-20-serverprob002.jpg
2009-04-20-serverprob003.jpg
ASKER CERTIFIED SOLUTION
ComputerTechie
SOLUTION
Dusty Thurman
Sorry for my misunderstanding earlier; I was working from the description and not the images. Obviously, Last Known Good and Safe Mode are not an option, but the Recovery Console is. If the Recovery Console does not come up, then look at whether or not you get the C: prompt. If not, then the system cannot find the partition. That can be due to a communication failure with the drive, such as the cable issue mentioned by ComputerTechie, or it could be a corrupted or failed hard drive.
Craig Beamson

ASKER

A few extra findings since my first post:

I've removed drive 0 and put it into another PC running Windows XP.
(I noted the server was very full of dust and about 4 years of fluff.)
On boot, it sees the new drive, says the volume is dirty and starts fixing file IDs.

Once into the operating system, I can view the contents of the drive (it contains the Windows system files, drives C and D on the server).  However, on returning the drive to the server, the problem remains.

On the server BIOS, the drive order is SATA1, SATA0, SATA2.
Does this mean that the boot records will be on SATA1?
If so, is there any value in me trying to piggyback SATA1 onto my PC and see if it is readable?
I don't know a great deal about boot records and whether I'd be risking/gaining anything by doing this.
If you do get the Recovery Console to load, I would run the command fixboot.
CT
The boot order in the BIOS determines the order in which the boot hand-off is attempted. The BIOS will try SATA1 first, then SATA0, then SATA2. Therefore, if there is a boot.ini on each of them, the one on SATA1 is the one that will be used unless you specify the boot device with F12.

The other side of this coin is the boot.ini itself. It will have a record that points to which drive has the OS boot files. If you can boot to SATA0 on a different PC, then SATA0 probably has the boot files and the boot.ini should be pointing there. You can actually view that information in the boot.ini if you can browse the disk.

If the boot.ini and boot files are on SATA0, I would recommend putting that drive first in the boot order in the system BIOS. Also, the fixmbr that I mentioned earlier is less likely to be the resolution. The fixboot that I mentioned and ComputerTechie seconded is the more likely utility to help.
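To make the boot.ini point concrete, here is a minimal Python sketch that pulls the default entry and the ARC paths out of a boot.ini once you can browse the disk from another PC. The sample text is a typical Windows Server 2003 boot.ini written for illustration, not the contents of the asker's file, and `parse_boot_ini` is a hypothetical helper name. In an ARC path of the `multi(...)` form, `rdisk(n)` is the zero-based disk number as the BIOS enumerates the drives and `partition(n)` is the one-based partition number.

```python
import re

# ARC path as used in boot.ini on x86 systems:
#   multi(W)disk(X)rdisk(Y)partition(Z)\path
ARC_RE = re.compile(
    r"(?P<kind>multi|scsi)\((?P<adapter>\d+)\)disk\((?P<disk>\d+)\)"
    r"rdisk\((?P<rdisk>\d+)\)partition\((?P<partition>\d+)\)",
    re.IGNORECASE,
)


def parse_boot_ini(text):
    """Return (default_arc_path, entries) parsed from boot.ini text."""
    default = None
    entries = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].lower()
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if section == "boot loader" and key.lower() == "default":
            default = value
        elif section == "operating systems":
            match = ARC_RE.match(key)
            if match:
                entries.append({
                    "arc": key,
                    "rdisk": int(match.group("rdisk")),          # zero-based disk
                    "partition": int(match.group("partition")),  # one-based partition
                    "label": value,
                })
    return default, entries


# Illustrative sample only (an assumption, not the asker's actual file).
SAMPLE = r"""[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Standard" /fastdetect
"""
```

Calling `parse_boot_ini(SAMPLE)` gives the default ARC path plus one entry with `rdisk` 0 and `partition` 1. If the real file points at a different `rdisk` than the drive that is first in the BIOS boot order, that mismatch is worth looking at.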
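Before reaching for fixmbr or fixboot, it can help to check whether a drive's first sector looks like a sane master boot record at all. The following is a rough Python 3 sketch, not a repair tool: it only reads 512 bytes and reports the boot signature and which partition entries are marked active. The function names are hypothetical, and how you obtain the raw sector (a disk image file or a raw device path) depends on the machine you attach the drive to.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # four 16-byte entries start here
BOOT_SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR


def inspect_mbr(sector):
    """Summarise a 512-byte MBR sector: signature validity and partitions."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need at least 512 bytes")
    info = {
        "valid_signature": sector[510:512] == BOOT_SIGNATURE,
        "partitions": [],
    }
    for i in range(4):
        off = PARTITION_TABLE_OFFSET + 16 * i
        entry = sector[off:off + 16]
        status = entry[0]   # 0x80 means active (bootable)
        ptype = entry[4]    # partition type, 0 means unused slot
        lba_start, num_sectors = struct.unpack_from("<II", entry, 8)
        if ptype == 0:
            continue
        info["partitions"].append({
            "index": i + 1,
            "active": status == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "sectors": num_sectors,
        })
    return info


def read_first_sector(path):
    """Read the first 512 bytes from an image file or raw device path."""
    with open(path, "rb") as f:
        return f.read(SECTOR_SIZE)
```

A drive that reports a valid signature and exactly one active partition is at least a plausible boot candidate. A missing signature or no active partition on the drive that is first in the boot order would fit the "cannot find the partition" symptom.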
More info:

Perhaps I wasn't being observant earlier, but I've just noticed warning LEDs on the front of the server.
Whilst on mains power but not turned on, the main fascia LEDs alternate green and amber (which the Dell user manual says means that the system has detected an error).

After letting the BIOS run through to the sticking point, the set of four ABCD LEDs reads green, green, green, amber.

I've detached the CD-ROM IDE cable and power cable.
I've removed an unused SCSI card.
I've removed the DRAC remote management card (damaging the plastic cable connector slightly in doing so).
I've swapped all SATA cables.

After an attempted boot, the only difference now is that the four ABCD LEDs are now all green.
The main amber LED is still flashing.
I think I have somehow fixed it.

Having removed the DRAC card, SCSI card and CD-ROM drive, I still get the error message as previously described, but the orange LED on the right of the ABCD warning LEDs is now green (though the main amber light is still flashing on and off).

By chance, I pressed the Escape key at this point and the system went straight into Windows Server 2003.  Once booted, I found I had virtually no space on drive C (I'm investigating this now).
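For tracking down what has filled drive C, a small script can report free space and rank the top-level folders by size. This is a sketch only, assuming a Python 3 interpreter is available on a machine that can see the volume (the server itself, or a PC with the drive attached). The helper names are hypothetical, and folders the account cannot read are silently skipped, so totals may be understated.

```python
import os
import shutil


def free_space_report(path):
    """Return total, used and free bytes for the volume containing path."""
    usage = shutil.disk_usage(path)
    return {
        "total": usage.total,
        "used": usage.used,
        "free": usage.free,
        "percent_free": 100.0 * usage.free / usage.total,
    }


def dir_size(path):
    """Sum the sizes of all files below path, ignoring unreadable items."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path, onerror=lambda err: None):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # file vanished or access denied
    return total


def largest_dirs(root, top=10):
    """Return up to `top` (bytes, path) pairs for root's subfolders, largest first."""
    sizes = []
    for name in os.listdir(root):
        full = os.path.join(root, name)
        if os.path.isdir(full) and not os.path.islink(full):
            sizes.append((dir_size(full), full))
    sizes.sort(reverse=True)
    return sizes[:top]
```

For example, `free_space_report("C:\\")` followed by `largest_dirs("C:\\")` gives a first indication of where the space has gone; the same call can then be repeated on whichever folder turns out to be largest.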

So, I'll return to this post tomorrow and make sure I award points fairly but for now, I think I'm on top of it.
In the end, the simple solution was "unplug everything unnecessary" and ComputerTechie's first comments were nearest the mark.

Thank you both for your help!