Outlook stops connecting after putting Exchange servers behind a hardware load balancer.

We are running Exchange 2010 SP2 Rollup 6 and a hardware load balancer from F5.
We have 1 site containing 4 Exchange servers, each of which is HUB+CAS+MBX. The F5 runs as the load balancer/CAS array and distributes connections to the CAS servers in round-robin fashion.
Yesterday we had an issue where multiple users (not all) started reporting that their Outlook was stuck on "trying to connect". We checked all 4 CAS servers and found that 1 server had no active connections. We asked our network team to remove that server from the load-balancing pool, and immediately after that everything started working.
That CAS server hosts a few databases, which were mounted and healthy during that time.
After 24 hours of smooth operation we put that server back in the pool and the same thing started happening.
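
To illustrate why only some users (not all) would be affected, here is a minimal sketch of strict round-robin distribution over a pool in which one member does not accept client connections. It assumes the load balancer keeps that member in rotation (for example because its health probe still passes); the server names and the client count are hypothetical.

```python
from collections import Counter
from itertools import cycle


def simulate_round_robin(pool, broken, clients):
    """Hand out clients to pool members in strict rotation and count outcomes."""
    rotation = cycle(pool)
    outcome = Counter()
    for _ in range(clients):
        server = next(rotation)
        # A client sent to a broken member never gets past "trying to connect".
        outcome["failed" if server in broken else "connected"] += 1
    return outcome


pool = ["CAS1", "CAS2", "CAS3", "CAS4"]  # hypothetical server names
result = simulate_round_robin(pool, broken={"CAS4"}, clients=1000)
print(result["connected"], "connected,", result["failed"], "failed")
```

With one bad member out of four, roughly a quarter of new connections land on it, which would match a partial outage that disappears as soon as that member is pulled from the pool.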

Need your expert comments and advice on this.

NOTE: I haven't explained EDGE as I think it's not related to this issue.
zackmccracken Commented:
pdixit, I'm at a loss. Sorry for not being able to help you any further.

Sorry if the questions or remarks seem a bit basic, but I'm assuming some things since I do not (and cannot) have all the details.

I think the clients somehow have a direct redirection towards the single server.
Once you put it behind the load balancer it seems unavailable to them.
Could you tell me if you are using the load balancer's DNS name in the clients' Outlook profiles?
I.e. cas.yourdomain.local instead of yourserver.yourdomain.local?

Second, is the load balancer passing traffic through to this server when it's behind the load balancer? (Is it also configured in the load balancer?)

Third, I suppose you have tried rebooting computers when the server is put behind the load balancer. In a smaller environment, with 2 Exchange 2010 servers in a DAG behind a Kemp load balancer, I have seen that when we had issues on one server and switched over, the clients needed reboots. Perhaps this is also the case in your scenario. Moving the servers behind the load balancer might require the clients to reconnect (by rebooting, or by the users logging off and on). But I haven't experienced this often.
pdixit1977 (Author) Commented:
Yes, we are using the DNS name (cas.yourdomain.local) of our load balancer in the clients' Outlook profiles.
Second, there is a pool created in the load balancer, and the IPs of all 4 of our CAS servers are listed in that pool.

And we are using a Cisco load balancer, not F5.
If you can eliminate the load balancer then...
it seems the clients somehow keep connecting directly to the specified server instead of the CAS DNS name. I've had some issues like this with clients, although their profiles had been changed to connect to the CAS DNS name/load balancer.
After inspection we noticed direct connections from some clients.
The problem was a registry key still retaining the hostname of one of the servers.
After changing the registry key, those clients no longer had a problem and connected through the load balancer.

The key in question is this one. Could you check the subkey values on a client that has had issues?

HKEY_CURRENT_USER\Software\Microsoft\Windows NT\CurrentVersion\Windows Messaging Subsystem\Profiles\1\13dbb0c8aa05101a9bb000aa002fc45a
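
One way to do that check without clicking through regedit is sketched below. It is only a rough sketch: it assumes Python is available on the client, it does not rely on any particular value name, and it simply reports every value under the key whose data mentions a given server name. Profile data is often stored as binary, so both single-byte and UTF-16 decodings are tried. The function names and the search string are hypothetical.

```python
def decode_value(data):
    """Return the text forms a registry value's data might take."""
    if isinstance(data, str):
        return [data]
    if isinstance(data, bytes):
        # Binary values may hold single-byte or UTF-16 encoded strings.
        return [data.decode("latin-1"), data.decode("utf-16-le", errors="ignore")]
    return []  # integers and other types cannot contain a hostname


def values_mentioning(values, needle):
    """Return the names of values whose data contains needle (case-insensitive)."""
    needle = needle.lower()
    hits = []
    for name, data in values:
        if any(needle in text.lower() for text in decode_value(data)):
            hits.append(name)
    return hits


def read_profile_values(profile_key_path):
    """Windows only: list (name, data) pairs of a key under HKEY_CURRENT_USER."""
    import winreg  # imported here so the helpers above also load on other systems
    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, profile_key_path) as key:
        value_count = winreg.QueryInfoKey(key)[1]
        return [winreg.EnumValue(key, i)[:2] for i in range(value_count)]
```

On an affected client this could be called as `values_mentioning(read_profile_values(path), "yourserver")`, where `path` is the key above without the `HKEY_CURRENT_USER\` prefix and `yourserver` is the short name of the suspect CAS server. Any hit would suggest the profile still points at that server directly.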
pdixit1977 (Author) Commented:
I think we are digressing from the main issue.

My issue is: 4 CAS servers running behind a load balancer. Thousands of users started reporting that their Outlook showed disconnected/trying to connect. We found one CAS with 1-2 active connections; when we removed that server from the load balancer everything started working fine. We ran this server separately by putting a hosts entry on a few affected users' machines (load balancer DNS name to the IP address of this CAS server) and everything works fine, but the issue recurs once we move it back behind the load balancer.

There are no suspicious logs or events on this server, so what should we check? Something is wrong with this server only, as the other 3 CAS servers are working fine behind the load balancer, and the load balancer's health and config have already been verified with the vendor.
pdixit1977 (Author) Commented:
As far as changes are concerned, only 1 change was made to the infrastructure: the Rollup 6 installation, just a day before this issue. However, the other servers also have the same rollup update and are running absolutely fine.
Could you look at the number of connections made towards that one server, e.g. with TCPView (Sysinternals)? This way you can see whether the issue is caused by it being available without the load balancer, or whether it is some kind of conflict between the 4 CAS servers behind the load balancer and putting the 5th in the same position.
What I'm trying to say is: when the server is in its current place (not behind the load balancer), do the thousands of connections go towards the 4 CAS servers, or do they go towards that server (not via the load balancer)?
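
If TCPView is not at hand, the same count can be approximated from saved `netstat -an` output on the server. The sketch below assumes the Windows column layout (Proto, Local Address, Foreign Address, State) and IPv4 addresses; the sample text and all addresses in it are made up for illustration.

```python
from collections import Counter


def count_established(netstat_output, local_port=None):
    """Count ESTABLISHED TCP connections per remote IP in `netstat -an` text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expect: Proto, Local Address, Foreign Address, State
        if len(fields) < 4 or fields[0].upper() != "TCP" or fields[3] != "ESTABLISHED":
            continue
        local, remote = fields[1], fields[2]
        if local_port is not None and local.rsplit(":", 1)[-1] != str(local_port):
            continue
        counts[remote.rsplit(":", 1)[0]] += 1
    return counts


# Hypothetical sample; on the server, paste in the real output instead.
SAMPLE = """
  Proto  Local Address          Foreign Address        State
  TCP    10.0.0.14:135          10.0.0.1:50211         ESTABLISHED
  TCP    10.0.0.14:135          10.0.0.1:50212         ESTABLISHED
  TCP    10.0.0.14:443          10.0.5.77:61000        ESTABLISHED
  TCP    10.0.0.14:443          0.0.0.0:0              LISTENING
"""
print(count_established(SAMPLE))
```

Whether connections that arrive through the load balancer show the load balancer's address or the client's own address depends on how the load balancer is configured, so the per-address counts need to be read with that in mind.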

Another question: does the server (not behind the load balancer) have any mounted critical DBs?
I suppose you also have a DAG?
pdixit1977 (Author) Commented:
As of now all connections go to the 4 servers running behind the load balancer, because all clients look up mycas.mydomain.local, which is the DNS name of the load balancer; only those users for whom we have made a custom hosts file entry are connecting to this server.
Yes, this server hosts databases which are running all the time, no matter whether it is behind or in front of the load balancer.
Has your network team encountered any errors on the network when you put 'the one' behind the load balancer (clients or servers trying and failing to connect)?
Could you give some more information on the load balancer config regarding the CAS virtual services?
pdixit1977 (Author) Commented:
No, there are no alerts on the network stack; they have got confirmation from Cisco.

Our conclusion so far is that our CAS is somehow not accepting more than 5-10 RPC connection requests. However, it works fine with its databases, because in that case the RPC connections go to the other CAS servers, and those CAS servers connect to it with SMTP/other protocols.
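
A crude way to probe that conclusion is to open a number of TCP connections to the suspect server, hold them open, and see where they stop succeeding. The sketch below works purely at the TCP level, so it can only show whether the server stops accepting sockets; it does not establish real RPC sessions and cannot confirm an RPC-level limit. The host name is hypothetical, port 135 (the RPC endpoint mapper) is used as an example, and the attempt count is arbitrary.

```python
import socket


def count_accepted_connections(host, port, attempts, timeout=3.0):
    """Open up to `attempts` TCP connections, hold them, return how many succeeded."""
    held = []
    try:
        for _ in range(attempts):
            try:
                held.append(socket.create_connection((host, port), timeout=timeout))
            except OSError:
                break  # refused or timed out: stop at the first failure
        return len(held)
    finally:
        # Always release the sockets so the probe leaves nothing behind.
        for conn in held:
            conn.close()


# Example call (hypothetical host name):
# print(count_accepted_connections("yourserver.yourdomain.local", 135, attempts=20))
```

Running the same probe against one of the healthy CAS servers gives a baseline to compare against.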

As far as changes are concerned, only Rollup 6 was installed.
David Johnson, CD, MVP (Owner) Commented:
Let me see if I've got things straight

Load Balancer (cas.yourdomain.local) -> round robins -> Exchange1
                                                     -> Exchange2
                                                     -> Exchange3

This works fine but if
Breaks everything..

Are the databases connected via a shared storage pool that all Exchange servers access?

Is this supposition correct?
pdixit1977 (Author) Commented:
Your supposition is absolutely correct. I don't know if this is connected or not, but this has been happening since last week, just after the installation of Rollup 6 on all Exchange servers.
pdixit1977 (Author) Commented:
My issue is not resolved yet, but I appreciate the continued help on this. I raised the same question on some other portals but had no luck. Thanks.