Exchange servers won't communicate via RPC

Hi there,

For the past couple of days, two of our Exchange servers have been unable to communicate with each other. There are three locations involved: a main office and two branch offices.
The main office has a Windows Server 2012 domain controller (DC1) and a Windows Server 2012 machine running Exchange 2016 (EX1).
[Image: terrible Visio drawing of the network (please excuse the horrible drawing)]
The first branch office has a single server that acts as both DC and Exchange 2013 server (B1); the second branch office is set up identically (B2). Note that there is no connection between B1 and B2.
Communication between B1 and the main office DC and EX1 is fine. B2, however, won't communicate properly with EX1. Pinging works, but "net use" and "net view" give us RPC errors in both directions:

>net use \\ex1

System error 64 has occurred.
The specified network name is no longer available.

>net view \\ex1

System error 53 has occurred.
The network path was not found.

The same tests from B2 to DC1 work fine, however.

Earlier in the day there were also problems communicating with DC1; we fixed this by running "netdom /resetpwd" on B2, but this hasn't helped with EX1.
Thinking the problem might be on EX1, we ran the same "netdom /resetpwd" on that server as well, but that didn't change anything. "netdom verify B2 /d:<domain>" reports that everything should be working correctly:

>netdom verify B2 /d:<domain>.local
The secure channel from B2 to the domain <domain>.LOCAL has been verified.  The connection is with the machine \\DC1.<domain>.LOCAL.

Each branch server points to itself as DNS server, and all relevant nslookup queries seem to return correct results.

There aren't many events in the Event Viewer that seem relevant to the situation or that have helped us so far.

What could be going wrong here, and what can we do to fix this?

Thanks very much in advance,
Kris
Kris Coady (IT Specialist) asked:
systechadmin (Consultant) commented:
Kindly check the DNS records and the NIC settings of the B2 Exchange server.
Kris Coady (IT Specialist, author) commented:
Thanks for the reply.
DNS records on all three servers are identical. nslookup also returns identical results on all servers when looking up EX1 (172.16.10.1) or B2 (10.152.0.7) from each location. As mentioned, pinging the devices also works fine from all locations. We've tried pointing B2 at DC1 as its DNS server (so that both B2 and EX1 use the same DNS), without any change.
We've made sure the "Register this connection's addresses in DNS" checkbox is enabled for all NICs in the network.
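One caveat on the nslookup results above: nslookup queries the DNS server directly, while "net use" and "net view" go through the operating system's resolver, which can also consult the hosts file and other name sources. A minimal sketch for checking the OS resolver against the addresses quoted in this thread follows. The short hostnames and the function name find_dns_mismatches are assumptions for illustration only.

```python
import socket

# Expected addresses as quoted in this thread; the short hostnames are assumptions.
EXPECTED = {"EX1": "172.16.10.1", "B2": "10.152.0.7"}


def find_dns_mismatches(expected, resolver=socket.gethostbyname):
    """Return {hostname: (expected_ip, actual)} for names that resolve wrongly or not at all."""
    mismatches = {}
    for name, expected_ip in expected.items():
        try:
            actual = resolver(name)  # uses the OS resolver, not a specific DNS server
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            actual = f"lookup failed: {exc}"
        if actual != expected_ip:
            mismatches[name] = (expected_ip, actual)
    return mismatches


# Example: run on each server in turn and compare the output.
# print(find_dns_mismatches(EXPECTED))
```

An empty result on every server would support the conclusion that name resolution is not the culprit.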
systechadmin (Consultant) commented:
Check the following Group Policy setting:

Network Security: Restrict NTLM: Outgoing NTLM traffic to remote servers

Also check whether the Server (LanmanServer) and Computer Browser services are running on the Exchange server.
Kris Coady (IT Specialist, author) commented:
The lanman (server) service is enabled.
Computer Browser service was disabled for some reason, we've enabled this now. But no change in the connections so far.
Another strange thing just occurred: we rebooted B2, after which at least one email from the queue on B2 went through to an external recipient. After that, communication stopped again. "net view" and "net use" still fail between EX1 and B2.

Since the reboot, Exchange has started logging some actual errors. On EX1 the errors are:
441 4.4.1 Error communication with target host: "Failed to connect. Winsock error code 10061, Win32 error code 10061."
and
441 4.4.1 Connection dropped due to ConnectionReset

On B2 the error is:
441 4.4.1 Error encountered while communicating with primary target IP address: "421 4.4.2 Connection dropped due to TimedOut." Attempted failover to alternate host, but that did not succeed.

PS. We're able to open a Telnet session on ports 25 and 2525 from B2 to EX1 without any trouble.
Kris Coady (IT Specialist, author) commented:
We're also seeing the following errors in the Exchange connection logfiles on the B2 server:
2018-02-21T09:19:01.490Z,08D57908FE73C606,SMTP,main-office,+,"SmtpRelayToRemoteAdSite 30c92e52-37ee-426e-8392-48496d8efbec;QueueLength=TQ=7;RN=1,RL=6;. "
2018-02-21T09:19:01.493Z,08D57908FE73C606,SMTP,main-office,>,EX1.<domain>.local[10.152.0.7]
2018-02-21T09:19:02.439Z,08D57908FE73C606,SMTP,main-office,>,Established connection to 10.152.0.7
2018-02-21T09:20:38.552Z,08D57908FE73C606,SMTP,main-office,-,Messages: 0 Bytes: 0 (Retry : Connection dropped due to TimedOut)
But I still believe the problem is related more to DNS/Directory Services and that when that is fixed Exchange will start functioning normally too.
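To see how often this pattern occurs across a whole logfile, a small parser can help. The following is only a sketch, assuming every line has the six comma-separated columns visible in the excerpt above (timestamp, session, source, destination, direction, description); the function name find_dropped_sessions is made up for this post.

```python
import csv
import re


def find_dropped_sessions(lines):
    """Return (session, destination, reason) for sessions that connected and then ended in a retry."""
    established = set()
    dropped = []
    for row in csv.reader(lines):
        # Skip blank, malformed and '#' header lines.
        if len(row) < 6 or row[0].startswith("#"):
            continue
        _timestamp, session, _source, destination, direction, description = row[:6]
        if description.startswith("Established connection"):
            established.add(session)
        elif direction == "-" and session in established:
            # The closing line carries the reason, e.g. "(Retry : Connection dropped due to TimedOut)".
            match = re.search(r"Retry\s*:\s*(.+?)\)", description)
            if match:
                dropped.append((session, destination, match.group(1)))
    return dropped


# Example (the path is a placeholder):
# with open("connectivity.log", newline="") as handle:
#     for entry in find_dropped_sessions(handle):
#         print(entry)
```

A session that reaches "Established connection" and then times out suggests that the TCP handshake succeeds but the traffic that follows does not get through, which is a different failure from a name lookup going wrong.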
Mahesh (Architect) commented:
RPC errors often happen because of either failed name resolution or blocked network ports.

Make sure that you allow any-to-any network communication between the site B2 DC/Exchange server and the main office AD/Exchange servers, and then check how it goes.

Once you have opened all network ports, reboot both servers at the B2 location and check again.
Kris Coady (IT Specialist, author) commented:
When pinging or running nslookup on the hostnames, we get the correct IPs back, so we believe that name resolution is working correctly.
Telnet between B2 and EX1 on port 135 (RPC) also opens perfectly. There are no firewalls between the servers.

We did just notice that it isn't possible to open a Telnet connection on port 25 (SMTP) from the main office to B2. This is possible to B1 (which is working correctly). Port 2525, however, does work from the main office to B2.
Telnet on port 25 to localhost on B2 itself works.
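For repeating these port tests from every location, a scripted check is quicker than opening Telnet sessions by hand, and it can tell a refused connection (Winsock error 10061) apart from a silent timeout. This is a rough sketch only: the function names are invented for this post, the hostname in the example is a placeholder, and port 445 is included on the assumption that "net use" and "net view" run over SMB.

```python
import socket


def probe_port(host, port, timeout=5.0):
    """Classify one TCP connection attempt as 'open', 'refused', 'timeout' or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"  # target actively rejected the connection (Winsock 10061)
    except socket.timeout:
        return "timeout"  # no answer at all, e.g. packets dropped along the path
    except OSError:
        return "error"    # anything else: lookup failure, unreachable network, ...


def probe_ports(host, ports, timeout=5.0):
    """Probe several ports on one host and return {port: status}."""
    return {port: probe_port(host, port, timeout) for port in ports}


# Example (hostname is a placeholder):
# print(probe_ports("B2", [25, 135, 445, 2525]))
```

A "refused" on port 25 would point at the B2 server itself, while a "timeout" would suggest something between the two sites is dropping the traffic.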
Mahesh (Architect) commented:
If no firewall exists between the servers, then how come Telnet is blocked on port 25?

Check whether all Exchange services are running on the Exchange server at B2.
Also check whether Telnet to itself on TCP 25 works on the B2 Exchange server.
Kris Coady (IT Specialist, author) commented:
All Exchange services on B2 are running fine (we've tried restarting the transport services).
Opening Telnet on port 25 from B2 itself and from other servers at the same branch office works fine. It just doesn't work from DC1 or EX1 at the main office. We're not sure whether this is related to the fact that "net view" from EX1 to B2 also fails even though RPC is accessible.
[Screenshot: Exchange services on B2]
PS. I double-checked the firewalls and anti-virus software; these are all disabled at both locations.
Mahesh (Architect) commented:
In that case TCP 25 is getting blocked somewhere. Are you sure that no firewall exists in between?
Also check the AV software on the B2 Exchange server.

Can you try Telnet from EX1 to the B2 Exchange server's IP on TCP 25?
Kris Coady (IT Specialist, author) commented:
100% sure. We've manually disabled both the Windows firewalls and the Symantec Endpoint Protection software to make sure.
We also just rebooted the Cisco 2900 router at the B2 location and let it re-establish the IPsec connection, to make sure nothing was being blocked on that side. Again, no difference.

Neither DC1 nor EX1 can access port 25 on B2.
All local devices at B2 can access the port, which is very strange.
Kris Coady (IT Specialist, author) commented:
Status update: Approximately 30 mins ago the Exchange servers started communicating with each other again flawlessly. We're not sure what changed as we weren't changing anything at the time that it started to function again.
If we find the cause or something else changes I'll update this post.

PS. It's still not possible to Telnet from main office to B2 on port 25 but I believe port 2525 is more important for the communication between Exchange servers.
Mahesh (Architect) commented:
An Exchange 2016 mail server must be able to communicate on TCP 25, as there is no separate Hub Transport server role available.
Kris Coady (IT Specialist, author) commented:
Cause found: whenever the WiFi at the branch office gets switched on, the synchronization problems immediately start again. Switch it off and everything fixes itself in no time.

The interesting thing is that the WiFi is physically completely separate from the servers' network, but it does share the same (poor) internet connection.
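For anyone wanting to confirm a correlation like this with numbers, one option is to time repeated TCP connections across the link while the WiFi is on and again while it is off, then compare the two summaries. The sketch below is an illustration under assumptions, not something we ran at the time; the function names, the target host and the sampling interval are all made up.

```python
import socket
import time


def time_connect(host, port, timeout=5.0):
    """Return the seconds needed to open a TCP connection, or None if it failed."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:  # covers refused connections, timeouts and lookup failures
        return None


def summarize(samples):
    """Summarize a list of time_connect() results."""
    succeeded = [s for s in samples if s is not None]
    return {
        "attempts": len(samples),
        "failures": len(samples) - len(succeeded),
        "worst_seconds": max(succeeded) if succeeded else None,
    }


# Example (host, port and interval are placeholders): sample 20 times, 30 seconds apart.
# samples = []
# for _ in range(20):
#     samples.append(time_connect("EX1", 2525))
#     time.sleep(30)
# print(summarize(samples))
```

A clearly higher failure count or worst-case connect time with the WiFi switched on would back up the idea that the shared internet connection is the bottleneck.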

Thanks for all the help and suggestions.