hongedit

VMware SAN Crash, now DC BSOD's on boot

ESXi 4.1

My SAN crashed (a separate issue) and VMware could not start my DC VM.

A call to VMware support got the VM booting again; apparently another one of the attached disks was preventing it from starting.

So it now starts, but Windows will not boot. It flashes up a BSOD too quickly to note the error message.

The only other thing I can do is a Startup Repair, but that only gives me options for CMD, Memory Check or Image Recovery (no recovery images were created).

VMware have washed their hands of it, as the VM technically does boot/start.

Any ideas? I can't just reinstall it; it is the DC with AD etc. all on it.
hongedit
ASKER

Ok I got the BSOD message:

STOP: c00002e2 Directory Services could not start because of the following error. A device attached to the system is not functioning.

Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
It could be a corrupted virtual disk?
hancooka - Maybe. But what can I do about it?

Junkins1 - All of the OS and AD are located on the same drive.
If the virtual disk contents have been corrupted, you would use the same procedures as used on a typical physical machine.

The Active Directory database could also be corrupted.
Can you attach the misbehaving vmdk to another "known good" VM and run a check disk against it? On physical hardware, the BSOD is an indicator of hard drive failure, so you may be suffering from the virtual equivalent.

Is this the only DC on your network? Can you get it to come up in "directory services recovery mode?"

I had a similar issue with a host in the past, and I discovered that the actual issue was in the configuration for the VM's drives, and upon booting, the Windows kernel assigned different drive letters than before, effectively hiding some of the volumes that were needed for full boot. Either resetting the drive connections OR revising the boot.ini to reflect the changes gave me the fix.

And if you have another DC on the network, I'd consider punting the possibly corrupted VMDK and going through the process of forcibly removing the DC (seizing FSMO roles as necessary) and setting up another one from scratch.
The DC can be booted into DSRM. A chkdsk was also run and passed fine.

One thing now you mention it is that when booting into Windows Startup Repair, Windows does report the C drive as E or X or something else.

No other DC in the network.

I have removed all disks except for the OS vmdk, what else can I do to "reset" it?

I can also add the VMDK to another server and all looks ok (can browse etc)
While in DSRM, you can pull up Disk Manager and verify/reset all the drive letters. That could be your savior.
Booting into DSRM now
BEFORE YOU ATTEMPT ANYTHING MAKE SURE YOU HAVE A BACKUP OF THIS VMDK!

(not a snapshot! CLONE will do!)
Poop, it all says C: which is right.

What else can I do in DSRM to try and find/rectify the issue?
Good call.

How do I clone? No access to vCenter at the moment...
With no access to vCenter, you'll have to do it the manual way: copy and paste the VM's files using the Datastore Browser.
Not quite the same issue, but you might learn some things by using the steps in http://support.microsoft.com/kb/258062
Ok copying now. Will take a while by the looks of it.

Anything else I can do?

Being able to get into DSRM and see files etc. gives me hope that this can be recovered with minimal data loss...
Ok, if I run ntdsutil files I get the error:

Could not initialize the Jet engine: Jet Error -501.

Failed to open DIT for AD/LDS instance NTDS. Error - 214748113
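For anyone reading along: the number in "Jet Error -501" is an ESE (Extensible Storage Engine) error code. As far as I know, -501 is JET_errLogFileCorrupt in the Windows SDK header esent.h, which would point at the transaction logs rather than at ntds.dit itself. Below is a minimal lookup sketch with a handful of codes from that header; the table is deliberately tiny, and the function name and the wording of the descriptions are my own, not Microsoft's.

```python
# A few ESE ("Jet") error codes as named in the Windows SDK header esent.h.
# The short descriptions are paraphrased, not quoted from Microsoft.
JET_ERRORS = {
    -501: ("JET_errLogFileCorrupt", "a transaction log file is corrupt"),
    -528: ("JET_errMissingLogFile", "a required log file is missing"),
    -550: ("JET_errDatabaseDirtyShutdown", "database was not shut down cleanly"),
    -1018: ("JET_errReadVerifyFailure", "checksum error on a database page"),
    -1206: ("JET_errDatabaseCorrupted", "database is corrupted"),
}


def describe_jet_error(code):
    """Return a one-line description for a Jet error code (hypothetical helper)."""
    name, meaning = JET_ERRORS.get(code, ("unknown", "not in this short table"))
    return f"Jet error {code}: {name} ({meaning})"


if __name__ == "__main__":
    print(describe_jet_error(-501))
```

If the code really is a log-file problem, the database file may still be intact, so it is worth checking the log files before assuming AD itself is lost.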
SOLUTION
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
This solution is only available to members.
You could always attach the corrupted VMDK to another VM, browse the drive and use some of our links to replace the corrupted files from one of your known good working VMs.
Han: Yes only 1 DC.

Junkins: I don't think any of the other VMs will have the required AD files...?
ASKER CERTIFIED SOLUTION
This solution is only available to members.
Forget the words "in the above example" in my post... above.
FIXED

Called into MS Support.

Booted into DSRM, and deleted all but 2 files in the NTDS folder (ntds.dit and edb.chk)

Reboot, voila.

He's just checking AD now but looks fine.
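For the record, the cleanup step above can be sketched as a small script. This is only an illustration under several assumptions: it moves the files into a backup folder instead of deleting them (a more cautious variation, not what MS Support actually did), Python has to be available wherever the volume is mounted (for example with the VMDK attached to another VM), and the function name and the example paths are made up.

```python
from pathlib import Path
import shutil

# The two files the poster reports keeping; everything else is moved aside.
KEEP = {"ntds.dit", "edb.chk"}


def quarantine_ntds_files(ntds_dir, backup_dir, keep=KEEP):
    """Move every regular file in ntds_dir, except those named in keep,
    into backup_dir. Returns the sorted names of the files that were moved."""
    src = Path(ntds_dir)
    dst = Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)
    keep_lower = {name.lower() for name in keep}  # Windows names are case-insensitive
    moved = []
    # list() takes a snapshot of the directory before anything is moved.
    for entry in list(src.iterdir()):
        if entry.is_file() and entry.name.lower() not in keep_lower:
            shutil.move(str(entry), str(dst / entry.name))
            moved.append(entry.name)
    return sorted(moved)


if __name__ == "__main__":
    # Hypothetical paths: adjust to wherever the NTDS folder actually lives.
    print(quarantine_ntds_files(r"E:\Windows\NTDS", r"E:\ntds-quarantine"))
```

Moving rather than deleting means the log files can be put back if the reboot does not go as hoped. Take the VMDK copy first, as advised earlier in the thread.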
Awesome!!
Congratulations. Now, plan to get a #2 DC running on a different storage platform (if on VMware) or bare metal. Your life will be much happier in that state.
Called MS Support