Server 2012/Exchange 2013 Server Crashes with MSExchangeHMWo

I am trying to stand up an Exchange 2013 setup in preparation for a cross-forest move from a .local domain with Exchange 2007. I am experiencing some issues that I have not seen before and have not yet been able to figure out. After I build the servers, create the DAG, etc., the Exchange servers crash (with a minidump) at around the same time of day each time.

According to the bugcheck analysis, the culprit seems to be Exchange Health Monitoring. After finding that out, I learned there is a known issue at certain Exchange patch levels in which the MSExchangeHMWo process crashes. We are running CU9 and still experienced the issue. I applied the global monitoring override solution found at http://itsalwaysmyproblem.com/2013/08/27/exchange-2013-and-bugcheck-0x000000ef/ to see if that would fix the issue. It still crashed on me.

I have also rebuilt the servers from scratch in an attempt to fix the problem but that did not work either. This is non-production at the moment so I can modify whatever needs to be modified.

Here is a little overview of the environment...

AD:
-Forest: Windows 2012
-Domain: Windows 2012
-3 Sites (Exchange is only in one site currently)
-1GB Metro connection between all sites

Exchange Servers (both are the same):
-VMware 5.5 Patch 5
-All drives are VMDKs backed by EMC VNX with 10K drives or better
-Windows Server 2012 (Ver. 6.2(Build 9200)) with all current updates
-Exchange Server 2013 Standard Ver.15.0 (Build 1104.5)
-DAG set up between both servers
-Only standard system and admin mailboxes are set up currently

Bugcheck Results:
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

CRITICAL_PROCESS_DIED (ef)
        A critical system process died
Arguments:
Arg1: fffffa800ce5a080, Process object
Arg2: 0000000000000000
Arg3: 0000000000000000
Arg4: 0000000000000000

Debugging Details:
------------------

TRIAGER: Could not open triage file : e:\dump_analysis\program\triage\modclass.ini, error 2

PROCESS_OBJECT: fffffa800ce5a080

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT_SERVER

BUGCHECK_STR:  0xEF

PROCESS_NAME:  MSExchangeHMWo

CURRENT_IRQL:  0

LAST_CONTROL_TRANSFER:  from fffff800b1ffcb91 to fffff800b1ace440

STACK_TEXT:  
fffff880`0aa929a8 fffff800`b1ffcb91 : 00000000`000000ef fffffa80`0ce5a080 00000000`00000000 00000000`00000000 : nt!KeBugCheckEx
fffff880`0aa929b0 fffff800`b1f93eb6 : fffffa80`0ce5a080 00000000`144d2c41 00000000`00000000 fffff800`b1c4b280 : nt!PspCatchCriticalBreak+0xad
fffff880`0aa929f0 fffff800`b1f0a831 : fffffa80`0ce5a080 00000000`144d2c41 fffffa80`0ce5a080 00000000`00000000 : nt! ?? ::NNGAKEGL::`string'+0x48196
fffff880`0aa92a50 fffff800`b1f105de : ffffffff`ffffffff fffffa80`11fef080 fffffa80`0ce5a080 00000000`00000001 : nt!PspTerminateProcess+0x6d
fffff880`0aa92a90 fffff800`b1acd453 : fffffa80`0ce5a080 fffffa80`22fafb00 fffff880`0aa92b80 00000000`ffffffff : nt!NtTerminateProcess+0x9e
fffff880`0aa92b00 000007fd`5ade2e2a : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x13
00000000`2397da88 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x7fd`5ade2e2a


STACK_COMMAND:  kb

FOLLOWUP_IP:
nt!PspCatchCriticalBreak+ad
fffff800`b1ffcb91 cc              int     3

SYMBOL_STACK_INDEX:  1

SYMBOL_NAME:  nt!PspCatchCriticalBreak+ad

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: nt

IMAGE_NAME:  ntkrnlmp.exe

DEBUG_FLR_IMAGE_TIMESTAMP:  5507a86c

FAILURE_BUCKET_ID:  X64_0xEF_nt!PspCatchCriticalBreak+ad

BUCKET_ID:  X64_0xEF_nt!PspCatchCriticalBreak+ad

Followup: MachineOwner
---------
Asked by: David Borg (Network Administrator)

George Sas (IT Engineer) commented:
Do you have all roles on each server?
Have you tried leaving the DAG unconfigured to see if that helps?
How much RAM do you have on each server?
Are the VMware drives thick or thin provisioned?
Are all your servers virtual?

I have seen some issues with Server 2012 when I gave it 6 GB of RAM or some other amount that is not a multiple of 4 GB. I am not sure whether it was only in that scenario or in general.
I have also seen problems with thin-provisioned disks.

What happens if you run the servers standalone rather than in a DAG? Do they still crash, or only after you create the DAG?

David Borg (Network Administrator, author) commented:
- All roles on both servers
- Not yet. That was my next step. (see info below first)
- 16GB memory on each server
- Thick Lazy provisioning on disks with 64K block size
- All servers related to this are virtualized


I did see another Experts Exchange post (http://www.experts-exchange.com/Software/Server_Software/Email_Servers/Exchange/Q_28470202.html) that prompted me to look into the same issue, and I noticed several pages of results when running the following in PowerShell against both servers:

(Get-WinEvent -LogName Microsoft-Exchange-ManagedAvailability/* | % {[XML]$_.toXml()}).event.userData.eventXml| ?{$_.ActionID -like "*ForceReboot*"} | ft RequesterName

This led me to a Microsoft KB article on the issue (https://support.microsoft.com/en-us/kb/2969070). I will have to wait until tomorrow around 10:50 a.m. Central time to see whether the problem persists, as that is the time each day that one of the servers reboots.

If that does not work, I will try to remove it from the DAG and see what happens.

I will post the results back here.
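
The event-log query in the comment above prints only the requester name. A slightly expanded sketch that also shows when each reboot was requested might look like the following. The log name and the ActionID and RequesterName fields are taken from the original one-liner; the TimeCreated column, the sorting, and the table layout are additions for illustration, and the output depends entirely on what the server has logged.

```powershell
# Sketch: list Managed Availability recovery actions that requested a reboot.
# Run in an elevated PowerShell session on the Exchange server.
# Log name and field names (ActionID, RequesterName) come from the one-liner
# quoted in the thread; everything else here is illustrative.
Get-WinEvent -LogName 'Microsoft-Exchange-ManagedAvailability/*' -ErrorAction SilentlyContinue |
    ForEach-Object {
        # Each event carries its payload as XML under Event/UserData/EventXml.
        $data = ([xml]$_.ToXml()).Event.UserData.EventXml
        if ($data.ActionID -like '*ForceReboot*') {
            [pscustomobject]@{
                TimeCreated   = $_.TimeCreated
                ActionID      = $data.ActionID
                RequesterName = $data.RequesterName
            }
        }
    } |
    Sort-Object TimeCreated |
    Format-Table -AutoSize
```

If the timestamps line up with the daily crash time, that points to a Managed Availability responder forcing the reboot rather than a random fault.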

George Sas (IT Engineer) commented:
Interesting article regarding the DNS registration. I have never had this issue because I always set this in a GPO when I build domains, so I am sure all machines are configured the same.

You can try disabling IPv6 and see what happens, even though "Microsoft recommends that you leave IPv6 enabled, even if you do not have an IPv6-enabled network, either native or tunneled."
I have seen some strange problems with network card configuration, teaming, VMware, and IPv6.

David Borg (Network Administrator, author) commented:
I have never had this issue before, even when setting a server not to auto-register in DNS. I had IPv6 unchecked (not fully disabled), but I was then getting another issue in which Exchange complained about not being able to find any DNS servers even though it could see them when queried.

There is definitely a strange combination of things going on. You gotta love those one-in-a-million bugs when you are the one who hits them.

David Borg (Network Administrator, author) commented:
Setting the NIC to register in DNS was the fix. There has not been a single issue since I set that back.

David Borg (Network Administrator, author) commented:
I am selecting this as the solution since everything else failed to fix the problem.
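
The thread does not say how the DNS registration setting was changed (most likely through the adapter's Advanced TCP/IP settings). As a rough sketch, the same setting can be checked and re-enabled from PowerShell with the DnsClient cmdlets that ship with Windows Server 2012. The interface alias 'Ethernet' is a placeholder and not from the original post; substitute the alias reported by the first command.

```powershell
# Sketch: check and re-enable "Register this connection's addresses in DNS".
# Run in an elevated PowerShell session on each Exchange server.

# 1. Show the current registration setting for every interface.
Get-DnsClient |
    Format-Table InterfaceAlias, RegisterThisConnectionsAddress -AutoSize

# 2. Re-enable registration on the server's NIC.
#    'Ethernet' is a hypothetical alias; use the one listed in step 1.
Set-DnsClient -InterfaceAlias 'Ethernet' -RegisterThisConnectionsAddress $true

# 3. Trigger an immediate registration instead of waiting for the next refresh.
Register-DnsClient
```

Running step 1 again afterwards should show RegisterThisConnectionsAddress as True for that interface.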