Windows Server unable to resolve external DNS values until a reboot.

- Once rebooted, it resolves names and behaves normally for several days, then stops resolving external names again. I can still get into the server via RDP from another machine on the local network, and have performed the following troubleshooting:

Reviewed the event logs for Critical, Error, and Warning entries. No critical items; the errors related to DNS are as follows:

- DNS Server Event ID: 4015 "The DNS server has encountered a critical error from the Active Directory. Check that the Active Directory is functioning properly. The extended error debug information (which may be empty) is "". The event data contains the error."
  * Have reviewed the DNS settings:
    *  Domain controller points to the PDC then to itself to resolve DNS.
    *  Perform a ping to "" - RESPONSE: Ping request could not find host Please check the name and try again.  |  ping the IP: get a response no problem, the IP is resolved by another machine that got the IP from a response to

- Ran from an elevated prompt: dcdiag /test:DNS /e /v >LOGFILE. The results on the affected server, DC01, are as follows:
  Domain   Auth   Basc   Forw   Del    Dyn    RReg   Ext
  DC00     PASS   WARN   FAIL   FAIL   PASS   PASS   n/a
  DC01     PASS   WARN   FAIL   FAIL   PASS   PASS   n/a

- Ran the same command on server DC00 and got the following results:
  Domain   Auth   Basc   Forw   Del    Dyn    RReg   Ext
  DC00     PASS   WARN   PASS   FAIL   PASS   PASS   n/a
  DC01     PASS   WARN   PASS   FAIL   PASS   PASS   n/a

After a reboot of server DC01, the results of dcdiag /test:DNS are as follows:
  Domain   Auth   Basc   Forw   Del    Dyn    RReg   Ext
  DC00     PASS   PASS   PASS   PASS   WARN   PASS   n/a
  DC01     PASS   PASS   PASS   PASS   WARN   FAIL   n/a

On server DC00 the results are as follows:
  Domain   Auth   Basc   Forw   Del    Dyn    RReg   Ext
  DC00     PASS   PASS   PASS   PASS   PASS   PASS   n/a
  DC01     PASS   PASS   PASS   PASS   PASS   FAIL   n/a
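
Comparing four of these summary tables by eye is error-prone. A minimal Python sketch for listing every non-passing cell, assuming the table has been pasted in as plain text in the same layout as above (the function names `parse_dcdiag_summary` and `non_passing` are made up for this illustration and are not part of dcdiag):

```python
def parse_dcdiag_summary(text):
    """Return {server: {test: result}} from a pasted dcdiag DNS summary table."""
    # Split on any run of whitespace, so uneven column spacing does not matter.
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    tests = header[1:]  # skip the first column label ("Domain")
    return {row[0]: dict(zip(tests, row[1:])) for row in rows}


def non_passing(summary):
    """List (server, test, result) for every cell that is WARN or FAIL."""
    return [
        (server, test, result)
        for server, results in summary.items()
        for test, result in results.items()
        if result in ("WARN", "FAIL")
    ]


# Table reported from DC01 before the reboot, copied from the post.
BEFORE_REBOOT_DC01 = """
Domain  Auth  Basc  Forw  Del   Dyn   RReg  Ext
DC00    PASS  WARN  FAIL  FAIL  PASS  PASS  n/a
DC01    PASS  WARN  FAIL  FAIL  PASS  PASS  n/a
"""

if __name__ == "__main__":
    for item in non_passing(parse_dcdiag_summary(BEFORE_REBOOT_DC01)):
        print(item)
```

Run against the before and after tables, this makes the pattern easier to see: from DC01 the Forw and Del tests fail for both servers until the reboot, and afterwards only RReg fails for DC01.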

Checked both servers to ensure the DNS properties are identical:
- NICs point to the other server as the primary and to itself as the secondary.
- Zone properties for all zones are Secure only for Dynamic updates
  Type: Active Directory-Integrated
  Replication: All DNS servers in this domain
  Name Servers: are just the single IP of each server.

If I wait a few days, DC01 will stop resolving external names again. Both servers are Windows Server 2012 R2; DC00 is physical and DC01 is virtual. Both systems are up to date with NIC drivers and Windows updates. This has been going on for some time. One of the first things addressed was an Active Directory replication issue; that has been resolved, but the external DNS resolution problem has lingered. Any thoughts on what I might have missed above?
Brian Shoemaker, Director of Operations / IT Services, Asked:

Are the servers up to date and are you using forwarders?
I would suggest running a packet capture on DC01 when it is working and then again once it starts failing.  This will allow you to see where it is sending the resolution requests.

Since it seems they both point to each other as DNS servers, you can also run the command:

nslookup x.x.x.x

Change x.x.x.x to:
     "the IP address of DC00"
     "the IP address of DC01"

And see which one(s) work and which one(s) fail.  The is Google public DNS resolver.  If this works, this shows you are getting out of your network and to the Internet.  If this fails, that means something is blocking you from getting to the Internet, which should not be the case as you have already shown you can ping outside of your network.
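
The same per-server check can be scripted so it is easy to repeat when the problem returns. This is a rough Python sketch, not a replacement for nslookup: it builds a single A-record query by hand and sends it over UDP port 53 to whichever server address you give it. The function names are made up for this example, and the server addresses are yours to fill in.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def parse_rcode(response):
    """Return the RCODE (low 4 bits of the flags field) from a DNS response."""
    (flags,) = struct.unpack("!H", response[2:4])
    return flags & 0x000F


def probe(server_ip, name, timeout=2.0):
    """Send one UDP query to server_ip:53; return the RCODE, or None on failure."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server_ip, 53))
            data, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and connection resets
            return None
    return parse_rcode(data)
```

Calling `probe(server_ip, "example.com")` once per server returns 0 when the server answered without error, another RCODE when it answered with an error, and `None` when nothing came back within the timeout, which is the case that matches the "DNS request timed out" symptom.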
Brian Shoemaker, Director of Operations / IT Services, Author Commented:
Additional Notes:
- Forwarders are in place for the ISP's DNS Servers
From an elevated prompt:
- nslookup
DNS request timed out.
         timeout was 2 seconds.
DNS request timed out.
         timeout was 2 seconds.
DNS request timed out.
         timeout was 2 seconds.
DNS request timed out.
         timeout was 2 seconds.
*** Request to unKnown timed-out

Still able to ping

****** After Reboot ******
DNS request time out.
         timeout was 2 seconds.
Server:   UnKnown

Non-authoritative answer:
Addresses:   2607:f8b0:4005:80a::2004

Server:    localhost
Address:  ::1

Non-authoritative answer:
Addresses:   2607:f8b0:4005:80a::2004

This is a fun one!!!

Timing out on this "nslookup" but being able to ping implies that something is blocking udp port 53 traffic.

Not sure what type of corporate firewall you have, but if possible you want to enable logging to see if the request is getting to the firewall.

If it is not, then you need to start picking various points within your network where you can do packet captures.  You may want to run one on the Windows server that is having the problem to see whether the request is actually leaving that host.
Brian Shoemaker, Director of Operations / IT Services, Author Commented:
User: giltjr
- The firewall is not blocking port 53; once the server reboots all things resume normally. I have found an issue with RPC but the restart feature of the RPC services is greyed out.
DrDave242, Senior Support Engineer, Commented:
Have you tried using root hints or different forwarders to see if you get the same result?
Brian Shoemaker, Director of Operations / IT Services, Author Commented:
DrDave242: validated root hints from and changed forwarders to Google's; still no change. Within 24 to 72 hours a reboot of the affected server is required to restore external name resolution, then the clock starts all over again.
I agree that it most likely is not the corporate firewall, but it is possible if it has IPS and for some reason has determined that it needs to block for a while.

It can also still be something local to the server.  By doing packet captures hopefully you can isolate where the traffic is stopping.

What problem did you find with RPC?
DrDave242, Senior Support Engineer, Commented:
I'm with giltjr; a packet capture on the affected server while the issue is happening may be quite useful. At the very least, it'll show whether the server is sending out DNS requests that are being blocked somewhere or not sending them out at all for some reason.
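
Because the failure only shows up every few days, it may help to record exactly when resolution stops so the capture can be started (or an already running capture trimmed) around that moment. A small Python sketch of such a watcher, assuming it runs on the affected server; the names `can_resolve` and `watch` are invented for this example. It uses the operating system's resolver, so a cached answer could hide a failure for a while.

```python
import socket
import time
from datetime import datetime


def can_resolve(name, resolver=socket.getaddrinfo):
    """Return True if the resolver returns at least one address for name."""
    try:
        return bool(resolver(name, None))
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def watch(name, interval_seconds=60, resolver=socket.getaddrinfo,
          max_checks=None, sleep=time.sleep, log=print):
    """Poll name resolution and log each change between working and failing."""
    last_state = None
    checks = 0
    while max_checks is None or checks < max_checks:
        state = can_resolve(name, resolver)
        if state != last_state:
            # Only log transitions, so the output stays short over several days.
            log(f"{datetime.now().isoformat(timespec='seconds')} "
                f"{name}: {'OK' if state else 'FAILED'}")
            last_state = state
        checks += 1
        sleep(interval_seconds)
```

Started with something like `watch("example.com")`, it prints one timestamped line when resolution first works and another when it first fails, which gives a time window to look at in the capture and the event logs.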
Brian Shoemaker, Director of Operations / IT Services, Author Commented:
This issue is so intermittent that there are long delays between chances to review it... Finally had the issue come back again. This time: Warning, DFSR Event ID: 5014:
- The DFS Replication service is stopping communication with partner %SERVERNAME% for replication group Domain System Volume due to an error. The service will retry the connection periodically.
  * Additional Information: Error 1723 (The RPC Server is too busy to complete this operation) 04/08/2019 8:44:01
Also: Warning; NETLOGON ID 5773
- The following DNS server that is authoritative for the DNS domain controller locator records of this domain controller does not support dynamic DNS updates   | DNS server IP address:   / Returned Response Code (RCODE): 4  | Returned Status Code 9004

Performed the following: checked whether HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\DNS\Parameters\AllowUpdate was in place; the value did not exist, so I added it as a REG_DWORD with value = 1.

Last issue related to the errors on the server: DNS-Server-Service Event ID: 140
- The DNS server could not initialize the remote procedure call (RPC) service. If it is not running, start the RPC service or reboot the computer. The event data is error code.
  * RPC Service is running.

Have now installed Microsoft Message Analyzer; will do a packet inspection once it does it again.  Please note this has sometimes taken up to 2 weeks to come back.
- No changes are done to the server.
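
For reference, the two numbers in the NETLOGON 5773 warning above can be decoded. RCODE values are defined in RFC 1035, and the Windows DNS status codes in the 9001 to 9005 range are 9000 plus the RCODE. A small Python lookup, with the function name `describe` chosen just for this illustration:

```python
# RCODE values from RFC 1035, section 4.1.1.
RCODES = {
    0: "NOERROR (no error)",
    1: "FORMERR (format error)",
    2: "SERVFAIL (server failure)",
    3: "NXDOMAIN (name error)",
    4: "NOTIMP (not implemented)",
    5: "REFUSED (refused)",
}

# Windows system error codes that mirror those RCODEs (9000 + RCODE).
WINDOWS_DNS_STATUS = {
    9001: "DNS_ERROR_RCODE_FORMAT_ERROR",
    9002: "DNS_ERROR_RCODE_SERVER_FAILURE",
    9003: "DNS_ERROR_RCODE_NAME_ERROR",
    9004: "DNS_ERROR_RCODE_NOT_IMPLEMENTED",
    9005: "DNS_ERROR_RCODE_REFUSED",
}


def describe(rcode, status):
    """Return a readable description of an RCODE / Windows status pair."""
    return (RCODES.get(rcode, "unknown RCODE"),
            WINDOWS_DNS_STATUS.get(status, "unknown status"))


if __name__ == "__main__":
    # Values reported in the NETLOGON 5773 warning.
    print(describe(4, 9004))
```

So the DNS server that NETLOGON tried to update answered "not implemented" to the dynamic update request, which is consistent with the wording of the warning itself.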
Based on that last set of messages, it looks like the DNS issue may be a symptom, not the problem.  If you have not already, you will want to look at the event logs on the partner server whose name appears in the message.

It is possible that it is having a performance problem and can't respond fast enough, or that there is a network issue causing a delay in the response to RPC calls to the partner.

If you search on "Event ID 5014, Error: 1726" a couple of the posts seem to point to a network issue and most of them talk about changing some TCP options.  Although it is not clear if this should be done on the partner or the server getting the error.  Maybe it needs to be done on both.
Brian Shoemaker, Director of Operations / IT Services, Author Commented:
No solution found; the band-aid is to schedule a server reboot at 4am every day...
