Windows Server 2003 won't boot up

I have a Dell Power Edge SC 4130 server running Windows Server. Last Wednesday we had a power event of some sort (probably a spike on the grid) which killed the UPS. I swapped the UPS and started the server, which worked without a hitch. On Friday afternoon I logged in via RDP and shut down the machine without installing updates, planning to move the server to another location within the building.

I moved it today, restarted the machine, and now I'm getting

Windows could not start because the following file is missing or corrupt

Windows could not start because the following file is missing or corrupt

What is the safest / least intrusive way of fixing the error? I'm looking for minimum downtime, but I also want to avoid having to reinstall the OS and policies, restore files, etc. This server is the single DC on a small office network.


Gerwin Jansen, EE MVE, Topic Advisor Commented:
Hello Ronino,

The easiest fix is to copy the 2 files from the Windows Server 2003 installation CD.

Boot from the CD and, at the 'Welcome' screen, press 'R' for the Recovery Console.

From your C drive do this:

cd \windows\system32\drivers
ren ntfs.sys ntfs.old
cd \windows\system32
ren ntoskrnl.exe ntoskrnl.old

copy <cd letter>:\i386\ntfs.sys C:\windows\system32\drivers
copy <cd letter>:\i386\ntoskrnl.exe C:\windows\system32

If the files are compressed on the CD, they will be named ntfs.sy_ and ntoskrnl.ex_. In that case, copy the compressed files instead and expand them after copying, like this:

expand C:\windows\system32\drivers\ntfs.sy_ C:\windows\system32\drivers\ntfs.sys
expand C:\windows\system32\ntoskrnl.ex_ C:\windows\system32\ntoskrnl.exe


Now you can reboot your system and see if it starts again.

First of all I'd make sure all HDs and adapters are properly attached to the system. Something may have been dislodged when you moved it (simply taking out the HDs and reinserting them might fix it).
It probably did not shut down correctly. Boot into the Recovery Console or BartPE and run chkdsk c: /f (in the Recovery Console, chkdsk takes /p or /r instead of /f).
I agree with Rindi. It sounds like a cable may have come dislodged.
Ronino (Author) Commented:

I opened the box and checked / unplugged / plugged back in cables, no luck.

I tried to boot from BartPE, but it does not recognize the SCSI or RAID controller, so I can't do anything on the hard drive. I also tried Knoppix, UBCD and the Win 2003 install disc - none of them recognized the RAID controller.

I can get the controller drivers from the Dell support site, but the server doesn't have a floppy drive, so I'm not sure how I can boot and load the RAID drivers.
Boot the server with the OpenManage CD or DVD it was originally delivered with; if you don't have it anymore, you should be able to download it from the Dell site. That CD/DVD should include diagnostics and a tool that shows the RAID's state (you should also be able to get some idea of the array's state via the function key combination at bootup).

A tool that might be able to see the RAID controller is PartedMagic. It lets you boot the system with different options, so if the first can't see the controller or the disks, try the others.
Gerwin Jansen, EE MVE, Topic Advisor Commented:
>> Dell Power Edge SC 4130
Are you sure that is the correct model? I'm having trouble finding it on the Dell site...
Ronino (Author) Commented:
I'm running 2 HDDs in RAID 1.

When booting the system I'm getting

integrated RAID exception detected:
 volume 00:000 state is RESYNCING

I slipstreamed a Win2003 install disk with the RAID drivers and it recognized the HDD, but when I hit R to go into Repair mode it said "Examining 476836 MB Disk 0 at ID 0 on bus 0 on lsi_sas..." and it's been stuck on that for about 30 mins.
Ronino (Author) Commented:
Dell Power Edge SC 1430, my mistake, thanks for noticing.
I would recommend letting it resync before continuing. It may take a while with that disk size.
You can add the RAID disk drivers to BartPE by copying the 32-bit drivers to the \Drivers\Scsiadapter folder.
It does not matter if the OS is 64-bit; this is a driver for the boot CD.
Ronino (Author) Commented:
I booted the PartedMagic live CD. It showed my RAID-1 500 GB disks as a single "500Gb Virtual Disk", but it couldn't read anything off it or tell me the health or partition status.

I know the RAID disks are still alive, though - Win Server 2003 boots halfway then jumps to a cold reboot. I have tried to start it in Safe Mode, and it gets stuck after loading a number of drivers.
You need to run CHKDSK from a recovery CD.

Rebuild the BartPE CD with the RAID drivers.
As mentioned above, wait for the resync to finish. After that chances are things will be OK. If not, and if also the chkdsk doesn't help (maybe you can boot into command prompt to run that if you don't want to add drivers to a CD), then you can try the PartedMagic CD again. As it sees the individual disks and not the array it looks as if your RAID controller is a fakeraid or software RAID controller. Maybe if you use the RAID boot options of the CD it might be able to access the data.
Ronino (Author) Commented:
Rindi, the SAS Utility and BIOS prompts don't tell me what the resync status is, and how much longer it will take - any leads where I might be able to find that info?
Maybe the OpenManage DVD can give more info. But as has already been mentioned, with 500 GB disks it can take quite some time.
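For a rough idea of how long "quite some time" might be, here is a back-of-the-envelope sketch in Python. The disk size is the figure from the Recovery Console message quoted earlier in the thread; the rebuild rates are assumed values for illustration only, not published figures for this controller, and `resync_hours` is just a hypothetical helper name.

```python
# Rough resync-time estimate for a RAID-1 mirror rebuild.
# DISK_MB is taken from the "Examining 476836 MB Disk 0 ..." message in the thread.
# The rates below are ASSUMED sustained rebuild speeds, not controller specifications.

DISK_MB = 476836


def resync_hours(disk_mb: float, rate_mb_per_s: float) -> float:
    """Hours needed to copy disk_mb megabytes at a sustained rate_mb_per_s."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return disk_mb / rate_mb_per_s / 3600


if __name__ == "__main__":
    for rate in (10, 20, 50):  # hypothetical rebuild rates in MB/s
        print(f"{rate:>3} MB/s -> about {resync_hours(DISK_MB, rate):.1f} hours")
```

The point is only that a full mirror copy of a disk this size is measured in hours at any plausible rate, so a resync that appears stuck for 30 minutes may simply still be running.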
Just out of curiosity, what kind of RAID controller are you using, and do you have a spare?

It sounds like the controller might not be responding correctly, and it might be easier to replace it than to keep trying to use it in this instance.

If you have an exact replacement, it might be a fast way to test the controller; if not, then maybe leave that for later.
Ronino (Author) Commented:

The server uses a SAS 5/iR controller, and I do not have a spare.

After troubleshooting with Dell tech support it turns out
- HDD-0 has block errors, replacement is on the way
- HDD-1 is physically NOT damaged, however
- if I unplug one of the HDDs the RAID controller says "no boot device found";
- the RAID controller does not offer the option to disable one of the hard drives from the controller BIOS.

Basically even though I have a RAID-1 array the techs tell me that once one of the HDDs has failed, the other one is useless! Incredibly frustrating, it defeats the whole purpose of RAID.

I also tried to plug in the good HDD in a SATA-to-USB controller that I use to recover data from dead workstations. The drive is a WD5001ABYS, which the techs described as an "enterprise grade" HDD. I don't know what the difference is between enterprise and consumer (other than error tolerance and reliability expectations), however the drive did not work. It did not even spin when plugged into the adapter, seems it needs more power than most HDDs. So I don't have any way of recovering data from that drive either.

My biggest concern now is that Dell techs tell me that even though HDD-1 is not damaged, when I receive and plug in HDD-2 to replace the failed drive, I may STILL have to reinstall the operating system and rebuild the server from scratch. Anybody else had a similar experience / found a workaround? We have a RAID server after all...
You could try to set the boot mark in the controller BIOS to the functional drive you have. DO NOT break the mirror in Windows; use the controller BIOS.

If you unplug the bad drive, reboot, enter the RAID BIOS, and set the boot device, you should be back up and running. Otherwise, you might want to get a better RAID controller, as that one sounds like a lemon. This is exactly the reason to have a mirror, and if the controller can't handle the task as well as a $50 soft RAID card, dump it.

The ABYS series is a 5-year-warranty, high-MTBF drive. The power requirements should not be very different from most 7200 rpm drives. (Solid drives, I like them.)

Sounds like you're going to Dell in a handbasket.
An alternate workaround, remove the working drive, and clone it on a different machine, then introduce the clone as a standalone drive to the controller.

Time consuming, but maybe worth it, depending on the relative downtime for the server.
Gerwin Jansen, EE MVE, Topic Advisor Commented:
>> Basically even though I have a RAID-1 array the techs tell me that once one of the HDDs has failed, the other one is useless! Incredibly frustrating, it defeats the whole purpose of RAID.

Wow! I'd be looking for another RAID controller a.s.a.p. Even a 'simple' Netgear ReadyNAS will run fine with 1 defective disk.
To restore a missing or corrupt ntfs.sys file you must have the Windows 2k3 CD and follow the steps below.

1. Insert the Windows 2k3 CD into the computer and restart the computer.
2. As the computer is starting, press a key when prompted to boot from the CD.
3. In the Windows 2k3 setup screen press the 'R' key to run the Windows Recovery Console.
4. If prompted, enter the number of the Windows installation you're repairing.
5. At the command prompt, type the command below.

copy x:\i386\ntfs.sys c:\windows\system32\drivers

* In the above example, replace x: with the letter of your CD-ROM drive. Many computers have the CD-ROM drive configured as the D: drive.
6. If ntfs.sys is still on the computer, you'll be asked whether you wish to overwrite the file or rename the old file before copying. If prompted, press the Y key for Yes to overwrite the file.
7. Once the file has been successfully copied, remove the CD and reboot your computer.
Sigh, you need to make an emergency boot CD with the RAID drivers in it (like BartPE) and run chkdsk. The missing or corrupt file errors will most likely go away. I have seen this at least 50 times on workstations and servers.
Ronino (Author) Commented:
The things that became apparent from my adventure:

1) Some Dell technicians have limited knowledge of the systems they support. They've been pushing me to "format and reinstall OS" from Day 1. It wasn't necessary.

2) Some RAID controllers should never make it to market. Dell SAS 5/iR, in this instance.
     - it did not warn me when one of the drives was failing
     - it did not work AT ALL with 1 good drive and one bad drive, not even enough to allow me to troubleshoot
     - it was difficult to reconfigure once the bad drive was replaced
     - it could not resync the array to the replacement drive AND allow the system to start at the same time. But it didn't warn me of that either. Until I figured out that I needed to leave it in RAID BIOS mode for 6 hours to resync, I really thought I'd have to reinstall the OS

I was finally able to replace the HDD, enter the RAID BIOS, revive the array, resync and then reboot the system. I'm still getting some RAID-related warnings in the Windows Event log, but all is good in the world: no data loss, no damage to the OS, it's as if nothing ever happened.

Thanks all for your help
Ronino (Author) Commented:
PS. After replacing the HDD and resyncing the array, the HDDs passed CHKDSK with flying colors; NTFS.SYS and NTOSKRN did not need to be replaced.
Ronino (Author) Commented:
Thanks all for your help