Solved

Outlook stops connecting after putting Exchange server behind hardware load balancer.

Posted on 2013-06-25
Last Modified: 2013-07-09
Hi,
We are running Exchange 2010 SP2 Rollup 6 with a hardware load balancer from F5.
We have one site with 4 Exchange servers, each running HUB+CAS+MBX. The F5 acts as the load balancer for the CAS array and distributes connections to the CAS servers in round-robin fashion.
Yesterday multiple users (not all) started reporting that Outlook was stuck on "trying to connect". We checked all 4 CAS servers and found that one of them had no active connections. We asked our network team to remove that server from the load-balancing pool, and immediately afterwards everything started working.
That CAS server also hosts a few databases, which stayed mounted and healthy during that time.
After 24 hours of smooth operation we put that server back in the pool, and the same thing started happening again.

Need your expert comments and advice on this.


NOTE: I haven't explained the Edge role, as I don't think it's related to this issue.
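
To illustrate why only some users (not all) would be affected, here is a minimal sketch. It assumes plain round robin with no automatic removal of a failing member (for example because the load balancer's health probe still passes), and the names CAS1 to CAS4 are hypothetical placeholders, not the real server names.

```python
from collections import Counter
from itertools import cycle


def simulate_round_robin(servers, unhealthy, clients):
    """Assign each client to the next pool member in turn and count failures.

    servers:   ordered list of pool member names (hypothetical names)
    unhealthy: set of members that are in the pool but refuse client sessions
    clients:   number of client connections to distribute
    """
    assigned = Counter()
    failed = 0
    pool = cycle(servers)
    for _ in range(clients):
        target = next(pool)
        assigned[target] += 1
        if target in unhealthy:
            failed += 1
    return assigned, failed


if __name__ == "__main__":
    pool = ["CAS1", "CAS2", "CAS3", "CAS4"]
    assigned, failed = simulate_round_robin(pool, {"CAS4"}, clients=1000)
    print(dict(assigned))
    print(f"{failed} of 1000 clients land on the unhealthy member")
```

Under these assumptions a quarter of the clients are sent to the bad member, and removing it from the pool brings the failure count to zero, which matches the behaviour described above.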
Question by:pdixit1977
15 Comments
 
LVL 3

Expert Comment

by:zackmccracken
ID: 39276534
Hi..

Sorry if the questions or remarks seem a bit basic, but I'm assuming some things since I do not (and cannot) have all the details.

I think the clients somehow have a direct redirection towards the single server.
Once you put it behind the load balancer it seems unavailable to them.
Could you tell me if you are using the load balancer's DNS name in the clients' Outlook profiles?
I.e. cas.yourdomain.local instead of yourserver.yourdomain.local?

Second, is the load balancer passing traffic through to this server when it's behind the load balancer? (Is it also configured in the load balancer?)

Third, I suppose you have tried rebooting computers when the server is put behind the load balancer. In a smaller environment, with 2 Exchange 2010 servers in a DAG behind a Kemp load balancer, I have seen that when we had issues on one server and switched over, the clients needed a reboot. Perhaps this is also the case in your scenario. Moving the servers behind the load balancer might require the clients to reconnect (by rebooting, or by the users logging off and on), but I haven't experienced this often.
 

Author Comment

by:pdixit1977
ID: 39277886
zackmccracken:
Yes, we are using the DNS name of our load balancer (cas.yourdomain.local) in the clients' Outlook profiles.
Second, there is a pool created on the load balancer, and the IPs of all 4 of our CAS servers are listed in that pool.

And we are using a Cisco load balancer, not F5.
 
LVL 3

Expert Comment

by:zackmccracken
ID: 39278074
If you can eliminate the load balancer as the cause, then it seems the clients somehow keep connecting directly to the specific server instead of the CAS DNS name. I've had some issues with clients doing this, even though their profiles had been changed to connect to the CAS DNS name/load balancer.
After inspection we noticed direct connections from some clients.
The problem was a registry key still retaining the hostname of one of the servers.
After changing the registry key, those clients no longer had a problem and connected through the load balancer.

The key in question is this one. Could you check the values under it on a client that has had issues?

HKEY_CURRENT_USER\Software\Microsoft\Windows NT\CurrentVersion\Windows Messaging Subsystem\Profiles\1\13dbb0c8aa05101a9bb000aa002fc45a
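
A rough sketch of how that check could be scripted, assuming Python is available on the client. The profile name "1" is taken from the key path above; the server names passed in (exch01 and so on) are hypothetical, and the decoding is a guess that covers plain strings, UTF-16 and single-byte text, not a documented description of how Outlook stores these values.

```python
import sys


def decode_candidates(raw):
    """Return plausible text decodings of a registry value (str or bytes)."""
    if isinstance(raw, str):
        return [raw]
    if isinstance(raw, bytes):
        # Try wide and single-byte text; which one applies varies per value.
        return [raw.decode(enc, errors="ignore") for enc in ("utf-16-le", "latin-1")]
    return []


def find_server_references(values, server_names):
    """Map value name -> server names found in that value (case-insensitive)."""
    hits = {}
    for name, raw in values.items():
        texts = [text.lower() for text in decode_candidates(raw)]
        found = [s for s in server_names if any(s.lower() in t for t in texts)]
        if found:
            hits[name] = found
    return hits


def read_profile_values(profile="1"):
    """Read all values under the key quoted above (Windows only)."""
    import winreg  # imported lazily so the helpers above work on any platform

    path = "\\".join([
        r"Software\Microsoft\Windows NT\CurrentVersion",
        r"Windows Messaging Subsystem\Profiles",
        profile,
        "13dbb0c8aa05101a9bb000aa002fc45a",
    ])
    values = {}
    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, path) as key:
        value_count = winreg.QueryInfoKey(key)[1]  # number of values under the key
        for index in range(value_count):
            name, data, _value_type = winreg.EnumValue(key, index)
            values[name] = data
    return values


if __name__ == "__main__" and sys.platform == "win32":
    # Hypothetical individual server names; replace with the real ones.
    servers = ["exch01", "exch02", "exch03", "exch04"]
    for value_name, found in find_server_references(read_profile_values(), servers).items():
        print(value_name, "->", found)
```

Any value that still names an individual server rather than the CAS array name would be worth a closer look.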

 

Author Comment

by:pdixit1977
ID: 39279119
I think we are drifting away from the main issue.

My issue is: 4 CAS servers running behind a load balancer. Thousands of users started reporting that Outlook shows disconnected/trying to connect. We found one CAS with only 1-2 active connections; when we remove that server from the load balancer, everything starts working fine. We ran this server separately by adding a hosts file entry on a few affected users' machines (mapping the load balancer DNS name to the IP address of this CAS server), and everything works fine, but the issue recurs once we move it back behind the load balancer.

There are no suspicious logs or events on this server, so what should we check? Something must be wrong with this server only, as the other 3 CAS servers are working fine behind the load balancer, and the load balancer's health and config have already been verified with the vendor.
 

Author Comment

by:pdixit1977
ID: 39279144
As far as changes are concerned, only one change was made to the infrastructure: the Rollup 6 installation, just a day before this issue. However, the other servers have the same rollup installed and are running absolutely fine.
 
LVL 3

Expert Comment

by:zackmccracken
ID: 39279349
Could you look at the number of connections made to that one server, e.g. with TCPView (Sysinternals)? That way you can see whether the issue is caused by the server being reachable without the load balancer, or by some kind of conflict between the 4 CAS servers behind the load balancer and putting the 5th in the same position.
What I'm trying to say is: when the server is in its current place (not behind the load balancer), do the thousands of connections go to the 4 CAS servers, or do they go to that server (not via the load balancer)?

Another question: does the server (not behind the load balancer) have any critical databases mounted?
I suppose you also have a DAG?
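
If TCPView is not at hand, the same count can be approximated from saved `netstat -an` output. This is only a sketch: it assumes the usual Windows column layout (Proto, Local Address, Foreign Address, State), and the addresses and ports in any sample you feed it are your own.

```python
from collections import Counter


def count_established(netstat_output, local_port=None):
    """Count ESTABLISHED TCP connections per remote address.

    netstat_output: text produced by `netstat -an` on the CAS server
    local_port:     if given, only count connections to this local port
    """
    per_remote = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, UDP rows and anything without a State column.
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        if fields[3].upper() != "ESTABLISHED":
            continue
        local, remote = fields[1], fields[2]
        if local_port is not None and local.rsplit(":", 1)[-1] != str(local_port):
            continue
        # Drop the remote port so connections group by remote host.
        per_remote[remote.rsplit(":", 1)[0]] += 1
    return per_remote


if __name__ == "__main__":
    import sys

    # Usage: netstat -an > conns.txt, then pass conns.txt as the first argument.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for remote, count in count_established(handle.read()).most_common(20):
            print(f"{count:6d}  {remote}")
```

If the load balancer rewrites the source address (an assumption about its configuration), the connections it forwards will all show its own address as the remote host, so rows with individual client IPs would point to clients bypassing it.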
 

Author Comment

by:pdixit1977
ID: 39279837
As of now, all connections go to the 4 servers running behind the load balancer, because all clients look up mycas.mydomain.local, which is the DNS name of the load balancer. Only the users for whom we made a custom hosts file entry connect to this server.
Yes, this server hosts databases that are mounted all the time, no matter whether it is behind or in front of the load balancer.
 
LVL 3

Expert Comment

by:zackmccracken
ID: 39280922
Has your network team encountered any errors on the network when you put 'the one' behind the load balancer? (Clients or servers trying and failing to connect.)
Could you give some more information on the load balancer config for the CAS virtual services?
 

Author Comment

by:pdixit1977
ID: 39291710
No, there are no alerts on the network stack; they have got confirmation from Cisco.

Our conclusion so far is that this CAS is somehow not accepting more than 5-10 RPC connection requests. However, its databases keep working, because in that case the RPC connections go to the other CAS servers, and those CAS servers connect to it over SMTP/other protocols.

As far as changes are concerned, only Rollup 6 was installed.
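
One way to put a number on the "5-10 connections" observation is a simple probe that opens several TCP connections at once and counts how many succeed. This is a sketch under assumptions: the host name below is hypothetical, port 135 is the RPC endpoint mapper, and a successful TCP connect only shows that the transport accepts the session, not that the RPC Client Access service answers on top of it.

```python
import socket


def probe_connections(host, port, attempts=20, timeout=3.0):
    """Open `attempts` TCP connections, hold them open, and count the results.

    Returns (succeeded, failed). The sockets stay open until every attempt
    has been made, so the probe exercises concurrent sessions rather than
    one session at a time.
    """
    open_sockets = []
    failed = 0
    try:
        for _ in range(attempts):
            try:
                open_sockets.append(
                    socket.create_connection((host, port), timeout=timeout)
                )
            except OSError:
                # Refused, reset or timed out: count it and keep going.
                failed += 1
        return len(open_sockets), failed
    finally:
        for sock in open_sockets:
            sock.close()


if __name__ == "__main__":
    # Hypothetical host name; run once against the suspect CAS and once
    # against a healthy one, then compare the two results.
    ok, bad = probe_connections("exch04.mydomain.local", 135, attempts=20)
    print(f"succeeded={ok} failed={bad}")
```

If the suspect server tops out well below a healthy one in the same test, that would support the conclusion above and narrow the search to that server's own TCP/RPC limits.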
 
LVL 79

Expert Comment

by:David Johnson, CD, MVP
ID: 39295211
Let me see if I've got things straight.

Load Balancer (cas.yourdomain.local) -> round robins -> Exchange1
                                                     -> Exchange2
                                                     -> Exchange3

This works fine, but if the pool is
-> Exchange1
-> Exchange2
-> Exchange3
-> Exchange4
it breaks everything.

Are the databases on a shared storage pool that all Exchange servers access?

Is this supposition correct?
 

Author Comment

by:pdixit1977
ID: 39299698
ve3ofa:
Your supposition is absolutely correct. I don't know whether it is connected or not, but this has been happening since last week, just after Rollup 6 was installed on all Exchange servers.
 
LVL 3

Accepted Solution

by:
zackmccracken earned 250 total points
ID: 39300886
pdixit, I'm at a loss. Sorry for not being able to help you any further.
 

Author Closing Comment

by:pdixit1977
ID: 39311081
My issue is not resolved, but I appreciate the continued help on this. I raised the same question on some other portals but had no luck. Thanks.
