hongedit
asked on
VMware SAN Crash, now DC BSOD's on boot
ESXi 4.1
My SAN crashed (another issue) and VMware could not start my DC VM.
A call to VMware support got the VM able to boot; apparently another one of the attached disks was preventing it from starting.
So, it now starts, but Windows will not boot. It flashes up a BSOD too quickly to note the error message.
The only other thing I can do is a Startup Repair, but that only gives me options for CMD, Memory Check or Image Recovery (no recovery images created).
VMware have washed their hands of it as the VM technically does boot/start.
Any ideas? I can't just reinstall it; it is the DC with AD etc. all on it.
It could be a corrupted virtual disk?
Take a look at this and see if it helps.
http://www.symantec.com/business/support/index?page=content&id=TECH58289
ASKER
hancooka - Maybe. But what can I do about it?
Junkins1 - All of the OS and AD are located on the same drive.
If the virtual disk contents have been corrupted, you would use the same procedures as used on a typical physical machine.
The Active Directory database could also be corrupted.
Can you attach the misbehaving VMDK to another "known good" VM and run a check disk (chkdsk) against it? On physical hardware, a BSOD at boot is often an indicator of hard drive failure, so you may be suffering from the virtual equivalent.
Is this the only DC on your network? Can you get it to come up in Directory Services Restore Mode (DSRM)?
I had a similar issue with a host in the past, and I discovered that the actual issue was in the configuration for the VM's drives: upon booting, the Windows kernel assigned different drive letters than before, effectively hiding some of the volumes that were needed for a full boot. Either resetting the drive connections OR revising the boot.ini to reflect the changes gave me the fix.
And if you have another DC on the network, I'd consider punting the possibly corrupted VMDK and going through the process of forcibly removing the DC--seizing FSMO roles as necessary--and setting up another one from scratch.
ASKER
The DC can be booted into DSRM. I also ran a chkdsk, which passed fine.
One thing, now you mention it, is that when booting into Windows Startup Repair, Windows does report the C drive as E or X or something else.
No other DC in the network.
I have removed all disks except for the OS vmdk, what else can I do to "reset" it?
ASKER
I can also add the VMDK to another server and all looks ok (can browse etc)
While in DSRM, you can pull up Disk Management and verify/reset all the drive letters. That could be your savior.
ASKER
Booting into DSRM now
BEFORE YOU ATTEMPT ANYTHING MAKE SURE YOU HAVE A BACKUP OF THIS VMDK!
(not a snapshot! CLONE will do!)
ASKER
Poop, it all says C: which is right.
What else can I do in DSRM to try and find/rectify the issue?
ASKER
Good call.
How do I clone? No access to vCenter at the moment...
With no access to vCenter, you'll have to do it the manual way: copy and paste using the Datastore Browser.
Not quite the same issue, but you might learn some things by using the steps in http://support.microsoft.com/kb/258062
ASKER
OK, copying now. Will take a while by the looks of it.
Anything else I can do?
Being able to get into DSRM and see files etc. gives me hope that this can be recovered with minimal data loss...
ASKER
Ok, if I run ntdsutil files I get the error:
Could not initialize the Jet engine: Jet Error -501.
Failed to open DIT for AD/LDS instance NTDS. Error - 214748113
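For anyone landing here with the same message: as far as I know, Jet Error -501 is the ESE code JET_errLogFileCorrupt, which points at a corrupt transaction log rather than at ntds.dit itself. A minimal Python sketch for decoding a few such codes is below. The table is a small hand-picked subset, not a complete list, and the function name describe_jet_error is made up for illustration.

```python
# Partial lookup of a few well-known ESE (Jet) error codes.
# Assumption: this is a hand-picked subset, not an official or complete table.
ESE_ERRORS = {
    -501: ("JET_errLogFileCorrupt", "Log file is corrupt"),
    -528: ("JET_errMissingLogFile", "Current log file missing"),
    -550: ("JET_errDatabaseDirtyShutdown", "Database was not shut down cleanly"),
    -1018: ("JET_errReadVerifyFailure", "Checksum error on a database page"),
    -1811: ("JET_errFileNotFound", "File not found"),
}


def describe_jet_error(code: int) -> str:
    """Return a one-line description for a Jet error code (hypothetical helper)."""
    name, text = ESE_ERRORS.get(code, ("unknown", "not in this partial table"))
    return f"Jet Error {code}: {name} ({text})"


print(describe_jet_error(-501))
```

If that reading is right, the log files in the NTDS folder are the first suspects, not the database file.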
You could always attach the corrupted VMDK to another VM, browse the drive and use some of our links to replace the corrupted files from one of your known good working VMs.
ASKER
Han: Yes only 1 DC.
Junkins: I don't think any of the other VMs will have the required AD files...?
Forget the words "in the above example" in my post... above.
ASKER
FIXED
Called into MS Support.
Booted into DSRM, and deleted all but 2 files in the NTDS folder (ntds.dit and edb.chk).
Reboot, voila.
He's just checking AD now but looks fine.
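The cleanup step above can be sketched in Python. This is only an illustration under assumptions: it moves the extra files to a backup folder instead of deleting them (a more cautious variant of what MS Support did), the function name and any paths you pass in are hypothetical, and it defaults to a dry run. Only try anything like this from DSRM and after cloning the VMDK.

```python
import shutil
from pathlib import Path

# The two files the asker reports keeping in the NTDS folder.
KEEP = {"ntds.dit", "edb.chk"}


def quarantine_ntds_files(ntds_dir, backup_dir, dry_run=True):
    """Move every file except KEEP out of ntds_dir; return the names affected.

    Hypothetical helper: with dry_run=True nothing is touched and the
    function only reports which files would be moved.
    """
    ntds_dir, backup_dir = Path(ntds_dir), Path(backup_dir)
    affected = []
    for entry in sorted(ntds_dir.iterdir()):
        if entry.is_file() and entry.name.lower() not in KEEP:
            affected.append(entry.name)
            if not dry_run:
                backup_dir.mkdir(parents=True, exist_ok=True)
                shutil.move(str(entry), str(backup_dir / entry.name))
    return affected
```

For example, `quarantine_ntds_files(r"C:\Windows\NTDS", r"C:\ntds-backup")` would just list what would be moved. C:\Windows\NTDS is the default location, but yours may differ.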
Awesome!!
Congratulations. Now, plan to get a #2 DC running on a different storage platform (if on VMware) or bare metal. Your life will be much happier in that state.
Now create a 2nd DC!
ASKER
Called MS Support
ASKER
STOP: c00002e2 Directory Services could not start because of the following error. A device attached to the system is not functioning.