jman0 war

Exchange 2013 - DAG issue: one of 2 members always fails over

When I activate the DB copy on one of the 2 members of a DAG, it has failed over by the next day.
The HighAvailability\Operational log in Event Viewer shows:

Moving all active databases failed for server 's922-exch-mb1.xxx.com (MoveComment: Managed availability system failover initiated by Responder=OutlookMapiHttpDeepTestFailover Component=Outlook., Error: Some (1) active databases could not be successfully moved.).

Get-MailboxDatabaseCopyStatus shows Status as Healthy

Since noticing this problem, I've installed security updates and restarted the server.
Actually, I've restarted the problem server a couple of times now, but the result is the same.

I'm probably out of my depth on this.
Hoping someone here can help.


When I run Get-ServerHealth -Identity "s922-exch-mb1.xxx.com" -HealthSet "Outlook.Protocol" | ft Server,State,Name,AlertValue -AutoSize

Server                 State  Name                                               AlertValue
------                 -----  ----                                               ----------
s922-exch-mb1.xxx.com         OutlookRpcDeepTestMonitor                          Healthy
s922-exch-mb1.xxx.com         OutlookMapiHttpSelfTestMonitor                     Unhealthy
s922-exch-mb1.xxx.com         OutlookRpcSelfTestMonitor                          Healthy
s922-exch-mb1.xxx.com         OutlookMapiHttpDeepTestMonitor                     Healthy
s922-exch-mb1.xxx.com         PrivateWorkingSetWarning....cclientaccess.service  Healthy
s922-exch-mb1.xxx.com         PrivateWorkingSetError....rpcclientaccess.service  Healthy
s922-exch-mb1.xxx.com         ProcessProcessorTimeWarning....ientaccess.service  Healthy
s922-exch-mb1.xxx.com         ProcessProcessorTimeError....clientaccess.service  Healthy
s922-exch-mb1.xxx.com         ExchangeCrashEventError....pcclientaccess.service  Healthy
s922-exch-mb1.xxx.com         LongRunningWatsonWarning....cclientaccess.service  Healthy
s922-exch-mb1.xxx.com         LongRunningWerMgrWarning....cclientaccess.service  Healthy
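
A narrower view can make the failing monitor easier to spot. The following is a minimal sketch, assuming the Exchange Management Shell on Exchange 2013; it lists only the monitors that are not reporting Healthy, across all health sets rather than just Outlook.Protocol. The server name is the one from the output above.

```powershell
# List every monitor on the server whose alert value is not Healthy.
# HealthSetName is included so the owning health set is visible.
Get-ServerHealth -Identity "s922-exch-mb1.xxx.com" |
    Where-Object { $_.AlertValue -ne "Healthy" } |
    Sort-Object HealthSetName, Name |
    Format-Table Server, HealthSetName, Name, AlertValue -AutoSize
```

Running the same command against the healthy DAG member gives a baseline to compare against.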
Jeff Glover

I assume both servers are running all roles. Run Get-MapiVirtualDirectory | fl on each one and compare the output to see if something pops out. As long as you set up MAPI over HTTP correctly and set the URIs, it should work. To check the MAPI service, go to https://(your URL)/MAPI/Healthcheck.htm. You should get a 200 OK page.
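
The comparison and the health check can be scripted roughly as follows. This is a sketch only: it assumes a build that includes MAPI over HTTP (the Get-MapiVirtualDirectory cmdlet is not present on earlier builds), and the server list is a placeholder for whichever servers hold the Client Access role in your environment.

```powershell
# Placeholder host names; substitute the servers that hold the Client Access role.
$servers = "s922-exch-mb1", "s922-exch-mb2"

foreach ($server in $servers) {
    # Show the MAPI virtual directory settings so they can be compared side by side.
    Get-MapiVirtualDirectory -Server $server |
        Format-List Server, InternalUrl, ExternalUrl, IISAuthenticationMethods

    # Probe the health check page; a working endpoint answers with HTTP 200.
    try {
        $response = Invoke-WebRequest -Uri "https://$server/mapi/healthcheck.htm" -UseBasicParsing
        "{0}: HTTP {1}" -f $server, $response.StatusCode
    }
    catch {
        # 404s, certificate errors, and connection failures all land here.
        "{0}: request failed: {1}" -f $server, $_.Exception.Message
    }
}
```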

You might also try the Managed Availability Troubleshooter. You can get it here
https://gallery.technet.microsoft.com/MATS-bc0d200d
jman0 war

ASKER

Thanks for replying.

On the problem server I also couldn't pull up the EAC.
I ended up having to type https://s922-exch-cas/ecp/?ExchClientVer=15

I also found another article that seemed to match:
https://social.technet.microsoft.com/Forums/exchange/en-US/7b1ebadd-cb10-48fc-b3f3-0c7a449183c4/exchange-2013-cu3-databases-only-activate-on-one-mailbox-server?forum=exchangesvrgeneral

I have now disabled the responder mentioned in that thread.
I'm going to activate the DB on the problem server tonight and see what happens.
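
For reference, the activation and follow-up checks might look something like this. It is a sketch under assumptions: "DB01" is a hypothetical database name, and the override listing only shows server-level overrides.

```powershell
# "DB01" is a hypothetical database name; substitute your own.
Move-ActiveMailboxDatabase -Identity "DB01" -ActivateOnServer "s922-exch-mb1"

# Confirm which copy is mounted and check the copy and replay queues.
Get-MailboxDatabaseCopyStatus -Identity "DB01" |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength, ContentIndexState -AutoSize

# List the monitoring overrides currently applied to this server.
# If the responder was disabled with a global override instead,
# Get-GlobalMonitoringOverride is the counterpart to check.
Get-ServerMonitoringOverride -Server "s922-exch-mb1"
```

Checking the copy status again the next morning shows whether the database stayed on MB1 or was moved back.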


I tried https://s922-exch-mb1/mapi/healthcheck.htm but got a 404 Not Found error.
jman0 war

ASKER

The 404 error shows:
Physical Path   C:\inetpub\wwwroot\mapi\healthcheck.htm

But there is no mapi folder at that location.
Jeff Glover

Look in IIS Manager for that server and make sure the MAPI virtual directory exists in the Web Front End website.
jman0 war

ASKER

It works on the good server, s922-exch-mb2.
I get the "200 OK".
jman0 war

ASKER

In IIS Manager under Sites I see:

Default Web Site
Exchange Back End

I don't know where I'm supposed to see Web Front End.
Jeff Glover

To see the directory, the path should be %ProgramFiles%\Microsoft\Exchange Server\V15\FrontEnd\HttpProxy\mapi. In IIS Manager, check the MAPI VDir settings for the path. What CU are you running?
jman0 war

ASKER

On the good server, there is a FrontEnd\HttpProxy\Mapi directory.
On the problem server, there is no FrontEnd directory.

Version 15 Build 775.38
Jeff Glover

Same build on both? You are only running CU3, and MAPI over HTTP should not even be there until CU4 or later. It sounds like you have one server at CU3 and another at a later CU build.
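
One quick way to compare builds across the organization is sketched below, assuming it is run from the Exchange Management Shell on any Exchange 2013 server.

```powershell
# List every Exchange server with its roles and build so mismatched CUs stand out.
Get-ExchangeServer |
    Sort-Object Name |
    Format-Table Name, ServerRole, Edition, AdminDisplayVersion -AutoSize
```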
jman0 war

ASKER

The good MB2 server is also on Version 15 Build 775.38

CAS server is also on the same build.

Then there's an Edge Transport server that's Version 14.3 Build 123.4.
Jeff Glover

Hmmm. When you run Get-OrganizationConfig | fl, what do you see for MapiHttpEnabled? Or do you even see it?
Amit

@Joseph

For DAG issues, you need to focus on the cluster logs. Open the Failover Cluster Manager snap-in, check the cluster logs, and post them here.
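
If the snap-in is awkward to work with, the same information can be pulled from PowerShell. This is a minimal sketch, assuming the FailoverClusters module is available on a DAG member; the output folder is a hypothetical path.

```powershell
Import-Module FailoverClusters

# Show whether each DAG member is currently Up, Down, or Paused in the cluster.
Get-ClusterNode | Format-Table Name, State -AutoSize

# Hypothetical output folder for the generated logs.
$logFolder = "C:\Temp\ClusterLogs"
New-Item -ItemType Directory -Path $logFolder -Force | Out-Null

# Generate a cluster.log per node covering the last 24 hours (1440 minutes).
Get-ClusterLog -Destination $logFolder -TimeSpan 1440
```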
jman0 war

ASKER

I don't see "mapihttpenabled"
Jeff Glover

Then I have no idea why that health monitor is firing. You could try upgrading both servers to CU4 or later and see if it helps the issue. The current CU for Exchange is 10.
jman0 war

ASKER

Amit, can you give me more specifics about what Cluster Logs?
I can open Failover Cluster Manager.

I go to Nodes and then this server.
Critical Events show:

ID 1135
Cluster node 'S922-EXCH-MB1' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

ID 1561
The cluster service has determined that this node does not have the latest copy of cluster configuration data. Therefore, the cluster service has prevented itself from starting on this node.
Try starting the cluster service on all nodes in the cluster. If the cluster service can be started on other nodes with the latest copy of the cluster configuration data, this node will be able to subsequently join the started cluster successfully.

If there are no nodes available with the latest copy of the cluster configuration data, please consult the documentation for 'Force Cluster Start' in the failover cluster manager snapin, or the 'forcequorum' startup option. Note that this action of forcing quorum should be considered a last resort, since some cluster configuration changes may well be lost.

ID 1177
The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk.
Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
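
To see when these events occurred and whether they line up with the failovers, the System log can be filtered for the three event IDs above. This is a sketch that assumes the standard Microsoft-Windows-FailoverClustering provider name and a seven-day window chosen only for illustration.

```powershell
# Pull cluster events 1135, 1177, and 1561 from the last 7 days on the problem node.
# Event 1135 is normally logged by the surviving nodes, so it is worth repeating
# the query against the other DAG member as well.
Get-WinEvent -ComputerName "S922-EXCH-MB1" -ErrorAction SilentlyContinue -FilterHashtable @{
    LogName      = "System"
    ProviderName = "Microsoft-Windows-FailoverClustering"
    Id           = 1135, 1177, 1561
    StartTime    = (Get-Date).AddDays(-7)
} |
    Sort-Object TimeCreated |
    Format-Table TimeCreated, Id, MachineName -AutoSize
```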
ASKER CERTIFIED SOLUTION
Amit

THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
jman0 war

ASKER

VM's
Amit

Are these on same ESX or different?
jman0 war

ASKER

I'm not sure, but I think it's the same ESX.
Amit

OK, you basically need to focus on the network. The next time the issue occurs, run cluster validation; it will show you the problem and suggest what you need to do.
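
Validation can also be started from PowerShell. The sketch below assumes the FailoverClusters module and the two DAG member names from this thread; it limits the run to the Inventory, Network, and System Configuration categories and leaves out storage tests, since a DAG does not use shared cluster storage.

```powershell
Import-Module FailoverClusters

# Validate the two DAG members, restricted to non-storage test categories.
# The cmdlet writes a validation report and returns the report file.
Test-Cluster -Node "s922-exch-mb1", "s922-exch-mb2" `
    -Include "Inventory", "Network", "System Configuration"
```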
Jeff Glover

Since they are virtual machines, have you had any vMotion incidents? And sorry for earlier; I did not realize you had a separate CAS server. If you only have the Mailbox role installed, you will not have the MAPI stuff.
jman0 war

ASKER

Sorry, but I have now been informed that they are not on the same ESX.
They are in different datacenters.

I'll try the validation and do the network bit later.

Thanks for the help so far.
jman0 war

ASKER

OK, I ended up contacting the VM guys who would have set up this server.
They did some work on their backup solution and found some additional issues.
I'm not sure of the details.

But I do know that they did not run the Cluster Validation Manager tool.

I activated the DB copy on MB1 and so far it's persisted.
(less than 24 hours)

So maybe the VM guys fixed something, or maybe my turning off the Outlook Responder monitor did.
Amit

From your details above, it is a network issue. You might need to ask them what caused this issue or what they did to fix it.
jman0 war

ASKER

They replied that they fixed something with a backup solution: Avamar. They said the install wizard goes through and establishes items under the Exchange Failover Cluster Manager. Under Roles it has the 'Avamar Backup Client Role', which is assigned an IP address strictly for Avamar processes.

It seems this was not behaving correctly, so it was torn down and rebuilt using the wizard, and then they worked with EMC support.