Exchange 2013 - Cluster/Failover not working properly

Hi all,
First question here :) And an odd one!
Hope one of you guys saw this issue before.

I have an Exchange 2013 DAG with 3 servers (Windows Server 2012 R2). As far as I can tell, everything is set up fine (well, it has been for a few years now!), but very recently (about a month ago), when one of the servers goes down (say a reboot, or even an IIS restart!), all the Outlook clients disconnect! Even if the server does not hold any active mailbox databases.

This only happens for one server. When I reboot either of the other 2, failover happens as usual if the client's database is hosted on that particular server.

So, that one is a first for me, and as far as I can tell, everything is set up as it should be, and the config matches the other 2 servers.

Any pointers?

Many thanks,

Fred Marie
"all the Outlook clients disconnect! Even if the server does not have live mailboxes"

If the server holds only passive copies of the databases and only that server goes down, there should be no impact.

You need to check whether a database copy on that server had become active, for example through automatic mailbox database activation.

Also, if this is a CAS server, it is possible that your clients were connected to it and got disconnected when it rebooted, since CAS is responsible for proxying connections to the Mailbox server. In that case the disconnection would only last for a very short period.
Fred Marie (Author) Commented:
Hi Mahesh,

Thanks for your inputs.
I can confirm that this server does not hold an active database.
I do agree, usually the disconnection lasts for about 5 seconds max in our environment. But in this particular case, the clients stay disconnected for as long as the server is down,
whether from a server restart or an IIS reset/restart.

This does not make any sense to me I have to say! :(
Do you have HA set up for the CAS servers? If so, what kind: DNS round robin, HLB, or NLB?

If NLB or HLB is configured, you should not see this issue. With DNS round robin, however, you may see it if the TTL on the CAS server record is high: clients keep using the cached address of the failed server, so they cannot reconnect until that server comes back online.

To isolate the issue, on a client machine hold CTRL, right-click the Outlook icon in the taskbar notification area, and open Connection Status. There you will find the CAS server the client is connected to. If it is different from the failed server, reset IIS on the failed server (or reboot it) again and check whether the client is still impacted; it should not be.
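
One rough way to complement that isolation test from a client machine is to probe HTTPS (TCP 443) on each server and on the load balancer while the suspect server is down. The sketch below is only an illustration in Python: the host names in the example call are hypothetical placeholders, and a successful TCP connect only shows that something is listening on the port, not that Outlook can actually authenticate or reach a mailbox.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def probe_all(endpoints: dict[str, str], port: int = 443) -> dict[str, bool]:
    """Probe every labelled endpoint and return a label -> reachable mapping."""
    return {label: can_connect(host, port) for label, host in endpoints.items()}


# Example call (hypothetical names; replace with your own servers and VIP name):
# print(probe_all({
#     "server1": "exch1.domain.com",
#     "server2 (suspect)": "exch2.domain.com",
#     "server3": "exch3.domain.com",
#     "HLB VIP": "mail.domain.com",
# }))
```

If the VIP name stops answering whenever the suspect server is down while the other two servers still answer directly, that would suggest looking at the load balancer or DNS rather than at the DAG itself.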
Fred Marie (Author) Commented:
Hi Mahesh,

We are using DNS + HLB (Kemp). Good point, though; I haven't checked that and will have a look now, although no changes have been made that I am aware of.

For your second part, it does not matter which server my DB is live on: as soon as I take this particular box offline, all clients disconnect until it comes back. If I do that to the other 2 servers in the cluster, failover occurs just fine. In the connection status window, hitting Reconnect while the server is offline does nothing; it is as if, while this server is offline, Outlook loses all connectivity to the entire cluster... very odd!

I ran a failover cluster validation, and everything came back clear apart from the "usual" issues; nothing out of line with my historical data.

I am lost on that one I have to say!
Can you let me know which roles each of your servers hosts, including the DAG, and the number of database copies?
Also, what about the autodiscover and mail.domain.com records: which servers do they point to?
Also, I don't understand how Kemp and DNS are both in place.
Fred Marie (Author) Commented:
All 3 servers are:
ServerRole                      : Mailbox, ClientAccess

Each with 12 DB copies, all healthy.

The DAG has 3 nodes (server 1, 2 and 3), and is currently hosted on server #3. The issue is with server #2.

I will check the DNS, but I believe mail.domain.com points to the virtual IP on the Kemp and autodiscover points to all 3 IPs. I have to confirm; this is off the top of my head.

As for Kemp/DNS, this was done before my time here, but I believe HLB + DNS is the recommended setup from MS, but don't quote me on that! :)
autodiscover should also point to the HLB VIP.
What is your autodiscover internal URI?
If it points to the autodiscover name itself, that is a problem, because autodiscover resolves to all 3 servers instead of the HLB VIP.
The autodiscover internal URI should point to mail.domain.com, which points to the HLB VIP.
Both autodiscover and mail.domain.com should point to the HLB VIP; after that, it is the load balancer's responsibility to immediately redirect connections to another server if the first one fails.
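
A quick way to see what the two names actually resolve to is sketched below in Python. The names `mail.domain.com` and `autodiscover.domain.com` follow the placeholders used in this thread, and the VIP address in the example call is a made-up value; substitute your own. The check only looks at DNS as seen from the machine running it, so results can differ between resolvers.

```python
import socket


def resolve_ipv4(name: str) -> set[str]:
    """Return the set of IPv4 addresses the local resolver gives for name."""
    try:
        infos = socket.getaddrinfo(name, 443, family=socket.AF_INET,
                                   type=socket.SOCK_STREAM)
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def check_against_vip(resolved: dict[str, set[str]], vip: str) -> dict[str, str]:
    """Classify each name by comparing its addresses with the load balancer VIP."""
    verdicts = {}
    for name, addresses in resolved.items():
        if not addresses:
            verdicts[name] = "no addresses returned"
        elif addresses == {vip}:
            verdicts[name] = "VIP only"
        elif vip in addresses:
            verdicts[name] = "VIP plus other addresses"
        else:
            verdicts[name] = "bypasses VIP"
    return verdicts


# Example call (hypothetical VIP; names follow the thread's placeholders):
# names = ["mail.domain.com", "autodiscover.domain.com"]
# print(check_against_vip({n: resolve_ipv4(n) for n in names}, "10.0.0.50"))
```

A name that returns the three server addresses rather than the VIP would be reported as "bypasses VIP", which is the situation this comment warns about.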
Seth Simmons (Sr. Systems Administrator) Commented:
No comment has been added to this question in more than 21 days, so it is now classified as abandoned.

I have recommended this question be closed as follows:

Accept: Mahesh (https:#a42423438)

If you feel this question should be closed differently, post an objection and the moderators will review all objections and close it as they feel fit. If no one objects, this question will be closed automatically the way described above.

Experts-Exchange Cleanup Volunteer
Question has a verified solution.