Windows Server 2008 DNS - randomly failing to resolve a non-authoritative domain name
Posted on 2011-10-13
I'm encountering an odd problem and hoping for some direction in troubleshooting.
We have 4 name servers on two different networks (let's say, ns1/ns2 on one network, and ns1/ns2 on a different network).
These name servers are Active Directory servers, and other than one DNS zone that they host, they act as caching-only DNS servers.
They can all resolve non-authoritative domains just fine (from within our network, of course; external lookups are not permitted).
The issue is a recurring problem with one specific .org domain name.
The last time it occurred, it was only one of our 4 name servers that would fail with the following error:
example: (using a fake domain here)
nslookup thedomain.org ns1.nameserver.com
*** UnKnown can't find thedomain.org: Server failed
We couldn't find any cause for this, so we restarted the DNS service on that particular name server and then it started to work. The other 3 name servers were providing a result for this query.
Today, the issue has come up again, and this time it's 3 out of 4 name servers that now cannot resolve this domain.
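To compare all of the name servers in one pass, a rough sketch like the following could be used. It is only an illustration under some assumptions: the server names in `SERVERS` are placeholders (not our real hosts), `nslookup` is assumed to be on the PATH, and the `classify` helper is a hypothetical heuristic that just looks for the well-known nslookup messages such as "Server failed" in the output.

```python
import subprocess

# Placeholder server names (assumption); substitute the real name servers.
SERVERS = ["ns1.nameserver.com", "ns2.nameserver.com"]


def classify(output):
    """Map raw nslookup output to a short status string (heuristic only)."""
    text = output.lower()
    if "server failed" in text:
        return "SERVFAIL"
    if "non-existent domain" in text:
        return "NXDOMAIN"
    if "timed out" in text:
        return "TIMEOUT"
    # Nothing that looks like a known error message was found.
    return "OK"


def check(domain, servers):
    """Run nslookup for the domain against each server and classify the result."""
    results = {}
    for server in servers:
        try:
            proc = subprocess.run(
                ["nslookup", domain, server],
                capture_output=True,
                text=True,
                timeout=30,
            )
            # nslookup may write its error lines to stderr, so inspect both streams.
            results[server] = classify(proc.stdout + proc.stderr)
        except subprocess.TimeoutExpired:
            results[server] = "TIMEOUT"
    return results


if __name__ == "__main__":
    for server, status in check("thedomain.org", SERVERS).items():
        print(server, status)
```

Running this on a schedule and logging the output would at least show when each server starts failing, which might help correlate the failures with something else on the network.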
The one that does work properly gives us a result of (example):
I've scanned through the event logs on the primary DNS server but am not finding anything to explain this. The DNS events in the log are mostly informational; there are no warnings, errors, or critical alerts at all.
As restarting the DNS service resolved the issue last time, it doesn't seem to be a configuration problem; otherwise it shouldn't work at all. The other odd thing is that this time 3 of our 4 name servers are failing to provide results for this domain only. Any other non-authoritative domain I query produces results.
Could it be some issue at the authoritative DNS servers? I did run the domain through DNS Stuff and the report showed no issues with its name servers, other than "NS agreement on SOA Serial #", but there were no other errors.
Any guidance in tracking this down would be appreciated.
(Edit - just a quick update - I cleared the cache just now on one of the name servers that couldn't resolve this domain, and that resolved the issue. But why is this occurring, and what can we do to prevent it from recurring?)