VMware SAN crash, now DC BSODs on boot

ESXi 4.1

My SAN crashed (another issue) and VMware could not start my DC VM.

After a call to VMware support the VM is now able to boot; apparently another one of the attached disks was preventing it from starting.

So, it now starts, but Windows will not boot. It flashes up a BSOD too quickly to note the error message.

The only other thing I can do is a Startup Repair, but that only gives me options for CMD, Memory Check or Image Recovery (no recovery images created).

VMware have washed their hands of it as the VM technically does boot/start.

Any ideas? I can't just reinstall it, it is the DC with AD etc. all on it.
LVL 1
hongeditAsked:
hongeditAuthor Commented:
Ok I got the BSOD message:

STOP: c00002e2 Directory Services could not start because of the following error. A device attached to the system is not functioning.

0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
It could be a corrupted virtual disk?
0
JJunkins1Commented:
0

hongeditAuthor Commented:
hancooka - Maybe. But what can I do about it?

Junkins1 - All of the OS and AD are located on the same drive.
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
If the virtual disk contents have been corrupted, you would use the same procedures as on a typical physical machine.

The Active Directory database could also be corrupted.
0
millardjkCommented:
Can you attach the misbehaving vmdk to another "known good" VM and run a check disk against it? On physical hardware, the BSOD is an indicator of hard drive failure, so you may be suffering from the virtual equivalent.

Is this the only DC on your network? Can you get it to come up in "directory services recovery mode?"

I had a similar issue with a host in the past, and I discovered that the actual issue was in the configuration for the VM's drives, and upon booting, the Windows kernel assigned different drive letters than before, effectively hiding some of the volumes that were needed for full boot. Either resetting the drive connections OR revising the boot.ini to reflect the changes gave me the fix.

And if you have another DC on the network, I'd consider punting the possibly corrupted VMDK and going through the process of forcibly removing the DC (seizing FSMO roles as necessary) and setting up another one from scratch.
0
hongeditAuthor Commented:
The DC can be booted into DSRM. I also ran a chkdsk, which passed fine.

One thing, now you mention it: when booting into Windows Startup Repair, Windows reports the C drive as E or X or something else.

No other DC in the network.

I have removed all disks except for the OS vmdk. What else can I do to "reset" it?

0
hongeditAuthor Commented:
I can also add the VMDK to another server and all looks ok (can browse etc)
0
millardjkCommented:
While in DSRM, you can pull up Disk Manager and verify/reset all the drive letters. That could be your savior.
0
hongeditAuthor Commented:
Booting into DSRM now
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
BEFORE YOU ATTEMPT ANYTHING MAKE SURE YOU HAVE A BACKUP OF THIS VMDK!

(not a snapshot! CLONE will do!)
0
hongeditAuthor Commented:
Poop, it all says C: which is right.

What else can I do in DSRM to try and find/rectify the issue?
0
hongeditAuthor Commented:
Good call.

How do I clone? No access to vCenter at the moment...
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
With no access to vCenter, you'll have to do it the manual way: copy and paste using the Datastore Browser.
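
Once the manual copy finishes, it is worth confirming that the copy is byte-identical before touching the original. A minimal Python sketch, assuming both files can be read from a machine that has Python (for example after downloading them or mounting the datastore); the function names `sha256_of` and `copies_match` are made up for illustration:

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a multi-gigabyte VMDK is never loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # read() returns b"" at end of file, which stops the loop
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, copy):
    """Return True if both files have identical SHA-256 digests."""
    return sha256_of(original) == sha256_of(copy)
```

Calling `copies_match(original_path, backup_path)` with the two file paths returns `True` only when the backup matches; the VM should be powered off while copying and hashing so the disk does not change underneath you.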
0
millardjkCommented:
Not quite the same issue, but you might learn some things by using the steps in http://support.microsoft.com/kb/258062
0
hongeditAuthor Commented:
Ok copying now. Will take a while by the looks of it.

Anything else I can do?

Being able to get into DSRM and see files etc gives me hope that this can be recovered with minimal data loss...
0
hongeditAuthor Commented:
Ok, if I run ntdsutil files I get the error:

Could not initialize the Jet engine: Jet Error -501.

Failed to open DIT for AD/LDS instance NTDS. Error - 214748113
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Looks like your AD database is corrupt.

Do you only have 1 DC?
0
JJunkins1Commented:
You could always attach the corrupted VMDK to another VM, browse the drive and use some of our links to replace the corrupted files from one of your known good working VMs.
0
hongeditAuthor Commented:
Han: Yes only 1 DC.

Junkins: I don't think any of the other VMs will have the required AD files...?
0
JJunkins1Commented:
Did you try this?

1. Start DSRM - (F8). - Yes you did this...

2. Search for log files (edb*.log) whose size is not zero bytes. Log files might be found under the SYSVOL folder (C:\WINDOWS\SYSVOL\domain in above example) or the root folder of the log drive.

3. Move the log files into the original log folder (In the above example, move log files from C:\WINDOWS\SYSVOL\domain or C:\ to C:\WINDOWS\NTDS\).

4. Reboot the system again.
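
The log search in step 2 can be scripted rather than done by hand. A minimal Python sketch, assuming Python is available on the DC in DSRM or on another VM with the VMDK attached; the function name `find_nonempty_logs` is made up for illustration, and it only lists files, it does not move anything:

```python
from pathlib import Path


def find_nonempty_logs(root):
    """Return edb*.log files under root whose size is not zero bytes."""
    matches = []
    for path in Path(root).rglob("*"):
        # Compare names case-insensitively, as Windows would (EDB00001.LOG etc.)
        name = path.name.lower()
        if path.is_file() and name.startswith("edb") and name.endswith(".log"):
            if path.stat().st_size > 0:
                matches.append(path)
    return sorted(matches)
```

For example, `find_nonempty_logs(r"C:\WINDOWS\SYSVOL\domain")` would list the candidates to move in step 3; check the list by eye before moving anything.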
0

JJunkins1Commented:
Forget the words "in the above example" in my post... above.
0
hongeditAuthor Commented:
FIXED

Called into MS Support.

Booted into DSRM and deleted all but 2 files in the NTDS folder (ntds.dit and edb.chk).

Reboot, voila.

He's just checking AD now but looks fine.
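
For anyone repeating this, moving the files aside is safer than deleting them outright. A rough Python sketch of that step, not what MS Support actually ran: it assumes the database file is ntds.dit and the checkpoint file is edb.chk, and the function name `quarantine_ntds_files` and any folder paths you pass in are hypothetical. Only do this in DSRM and with a copy of the VMDK in hand.

```python
import shutil
from pathlib import Path

# Assumption: the two files left in place, per the fix described above.
KEEP = {"ntds.dit", "edb.chk"}


def quarantine_ntds_files(ntds_dir, backup_dir, keep=KEEP):
    """Move every file in ntds_dir except those in keep into backup_dir.

    Returns the names of the files that were moved.
    """
    ntds_dir, backup_dir = Path(ntds_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Windows file names are case-insensitive, so compare in lower case
    keep_lower = {name.lower() for name in keep}
    moved = []
    for path in sorted(ntds_dir.iterdir()):
        if path.is_file() and path.name.lower() not in keep_lower:
            shutil.move(str(path), str(backup_dir / path.name))
            moved.append(path.name)
    return moved
```

A call such as `quarantine_ntds_files(r"C:\Windows\NTDS", r"C:\NTDS_backup")` (both paths are assumptions; check where NTDS actually lives on your DC) leaves only the two kept files behind and returns the list of what was moved, so the step can be undone if the reboot does not help.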
0
JJunkins1Commented:
Awesome!!
0
millardjkCommented:
Congratulations. Now, plan to get a #2 DC running on a different storage platform (if on VMware) or bare metal. Your life will be much happier in that state.
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Now create a 2nd DC!
0
hongeditAuthor Commented:
Called MS Support
0