Ronino

Windows Server 2003 won't boot up


I have a Dell Power Edge SC 4130 server running Windows Server. Last Wednesday we had a power event of some sort (probably a spike on the grid) which killed the UPS. I swapped the UPS and started the server, worked without a hitch. On Friday afternoon I logged in via RDP and shut down the machine without installing updates, planning to move server to another location within the building.

I moved it today, restarted the machine, and now I'm getting

Windows could not start because the following file is missing or corrupt
System32\Drivers\Ntfs.sys

or
Windows could not start because the following file is missing or corrupt
<Windows root>\System32\Drivers\Ntoskrnl.sys

What is the safest, least intrusive way of fixing the error? I'm looking for minimum downtime, but I also want to avoid having to reinstall the OS, recreate policies, restore files, etc. This server is the single DC on a small office network.

Thanks
ASKER CERTIFIED SOLUTION
Gerwin Jansen
First of all, I'd make sure all HDs and adapters are properly attached to the system. Maybe something got dislodged when you moved it (possibly just taking out the HDs and inserting them again will fix it).
It probably did not shut down correctly. Boot into recovery or BartPE and run chkdsk c: /f
I agree with Rindi. It sounds like a cable may have come dislodged.
Ronino

ASKER


I opened the box and checked / unplugged / plugged back in cables, no luck.

I tried to boot from BartPE, but it does not recognize the SCSI or RAID controller, so I can't do anything on the hard drive. I also tried Knoppix, UBCD and the Win 2003 install disc; none of them recognized the RAID controller.

I can get the controller drivers from the Dell support site, but the server doesn't have a floppy drive, so I'm not sure how I can boot and load the RAID drivers.
SOLUTION
"But the server" should of course read "Boot the server" above...
>> Dell Power Edge SC 4130
Sure that is the correct type? I'm having trouble finding it at the Dell site...
Ronino

ASKER

I'm running 2 HDDs in RAID 1.

When booting the system I'm getting

integrated RAID exception detected:
 volume 00:000 state is RESYNCING

I slipstreamed a Win2003 install disc with the RAID drivers and it recognized the HDD, but when I hit R to go into Repair mode it said "Examining 476836 MB Disk 0 at ID 0 on bus 0 on lsi_sas..." and it's been stuck on that for about 30 minutes.
Ronino

ASKER

Dell Power Edge SC 1430, my mistake, thanks for noticing
I would recommend letting it resync before continuing. It may take a while with that disk size.
You can add the RAID drivers to BartPE by copying the 32-bit drivers to the \Drivers\SCSIAdapter folder.
It does not matter if the OS is 64-bit; this is a driver for the boot CD.
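As a rough illustration of the driver-staging step above, here is a minimal Python sketch that copies the driver files into their own subfolder under drivers\SCSIAdapter before the CD is rebuilt. The function name, the "raid" subfolder name, and any paths passed in are hypothetical; adjust them to wherever PE Builder and the unpacked controller driver actually live on your machine.

```python
import shutil
from pathlib import Path

# File types that typically make up a Windows storage driver package.
DRIVER_SUFFIXES = {".inf", ".sys", ".cat"}


def stage_scsi_driver(driver_src, pebuilder_root, name="raid"):
    """Copy driver files into <pebuilder_root>/drivers/SCSIAdapter/<name>.

    driver_src and pebuilder_root are assumed locations, not fixed paths.
    Returns the sorted list of file names that were copied.
    """
    src = Path(driver_src)
    dest = Path(pebuilder_root) / "drivers" / "SCSIAdapter" / name
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    for item in src.iterdir():
        # Skip readme files, installers, and anything else that is not
        # part of the driver itself.
        if item.is_file() and item.suffix.lower() in DRIVER_SUFFIXES:
            shutil.copy2(item, dest / item.name)
            copied.append(item.name)
    return sorted(copied)
```

Only the 32-bit driver files should be passed in as the source folder, since the boot CD itself is a 32-bit environment.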
Ronino

ASKER

I booted the PartedMagic live CD. It showed my RAID-1 500 GB disks as a single "500Gb Virtual Disk", but it couldn't read anything off it or tell me the health or partition status.

I know the RAID disks are still alive, though - Win Server 2003 boots halfway then jumps to a cold reboot. I have tried to start it in Safe Mode, and it gets stuck after loading a number of drivers.
You need to run CHKDSK from a recovery CD.

Rebuild the BartPE CD with the RAID drivers.
SOLUTION
Ronino

ASKER

Rindi, the SAS Utility and BIOS prompts don't tell me what the resync status is, and how much longer it will take - any leads where I might be able to find that info?
Maybe the OpenManage DVD can give more info. But as has already been mentioned, with 500 GB disks it can take quite some time.
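Since the controller does not report progress, a back-of-the-envelope estimate may help with planning. The sketch below simply divides the volume size by a sustained copy rate; the rates in the loop are assumptions for illustration only, not figures published for this controller, and a resync that shares the disks with other I/O or hits bad blocks can be much slower.

```python
def estimate_resync_hours(volume_mb, rate_mb_per_s):
    """Return the hours needed to copy volume_mb at a steady rate_mb_per_s."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return volume_mb / rate_mb_per_s / 3600


# 476836 MB is the disk size reported by the setup disc earlier in the thread.
for rate in (10, 30, 60):  # assumed background resync rates in MB/s
    hours = estimate_resync_hours(476836, rate)
    print(f"{rate:>3} MB/s -> {hours:.1f} h")
```

At these assumed rates the estimate ranges from a little over 2 hours to roughly 13 hours, so leaving the array alone overnight is a reasonable plan.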
Just out of curiosity, what kind of raid controller are you using, and do you have a spare?

It sounds like the controller might not be responding correctly, and it might be easier to replace it than try to use it in this instance.

If you have an exact replacement, it might be a fast way to test the controller; if not, then maybe leave that for later.
Ronino

ASKER


The server uses a SAS 5/iR controller, and I do not have a spare.

After troubleshooting with Dell tech support it turns out
- HDD-0 has block errors, replacement is on the way
- HDD-1 is physically NOT damaged, however
- if I unplug one of the HDDs the RAID controller says "no boot device found";
- the RAID controller does not offer the option to disable one of the hard drives from the controller BIOS.

Basically even though I have a RAID-1 array the techs tell me that once one of the HDDs has failed, the other one is useless! Incredibly frustrating, it defeats the whole purpose of RAID.

I also tried to plug the good HDD into a SATA-to-USB adapter that I use to recover data from dead workstations. The drive is a WD5001ABYS, which the techs described as an "enterprise grade" HDD. I don't know what the difference is between enterprise and consumer drives (other than error tolerance and reliability expectations), but the drive did not work. It did not even spin up when plugged into the adapter; it seems to need more power than most HDDs. So I don't have any way of recovering data from that drive either.

My biggest concern now is that Dell techs tell me that even though HDD-1 is not damaged, when I receive and plug in HDD-2 to replace the failed drive, I may STILL have to reinstall the operating system and rebuild the server from scratch. Anybody else had a similar experience / found a workaround? We have a RAID server after all...
SOLUTION
An alternative workaround: remove the working drive, clone it on a different machine, then introduce the clone to the controller as a standalone drive.

Time consuming, but maybe worth it, depending on the relative downtime for the server.
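To make the cloning idea concrete, here is a minimal Python sketch of a chunked copy that also returns a SHA-256 checksum, so the clone can be compared against the source afterwards. It is only an illustration under simplifying assumptions: the paths are hypothetical, it stops at the first read error, and for a real drive a dedicated imaging tool such as GNU ddrescue or Clonezilla is the safer choice.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of the file or device at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def clone_image(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path in chunks.

    Returns (bytes_copied, sha256_hex_of_the_data_read_from_source).
    """
    digest = hashlib.sha256()
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # end of source reached
                break
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()
```

After the copy, comparing the returned checksum with sha256_of(dst_path) confirms that what was written matches what was read.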
>> Basically even though I have a RAID-1 array the techs tell me that once one of the HDDs has failed, the other one is useless! Incredibly frustrating, it defeats the whole purpose of RAID.

Wow! I'd be looking for another RAID controller a.s.a.p. Even a 'simple' Netgear ReadyNAS will run fine with 1 defective disk.
To restore a missing or corrupt ntfs.sys file, you need the Windows Server 2003 CD. Follow the steps below.

1. Insert the Windows Server 2003 CD into the computer and restart it.
2. As the computer starts, press a key when prompted to boot from the CD.
3. On the Windows Server 2003 setup screen, press the 'R' key to run the Windows Recovery Console.
4. If prompted, enter the number of the Windows installation you're repairing.
5. At the command prompt, type the command below.

copy x:\i386\ntfs.sys c:\windows\system32\drivers

* In the above example, replace x: with the letter of your CD-ROM drive. Many computers have the CD-ROM drive configured as the D: drive.

6. If ntfs.sys is still on the computer, you'll be asked whether you wish to overwrite the file or rename the old file before copying. If prompted, press the Y key for Yes to overwrite the file.
7. Once the file has been copied successfully, remove the CD and reboot your computer.
Sigh, you need to make an emergency boot disc with the RAID drivers in it (like BartPE) and run chkdsk. The missing or corrupt file errors will most likely go away. I have seen this at least 50 times on workstations and servers.
SOLUTION
Ronino

ASKER

PS. After replacing the HDD and resyncing the array, the HDDs passed CHKDSK with flying colors; NTFS.SYS and NTOSKRNL did not need to be replaced.
Ronino

ASKER

Thanks all for your help