Solved

Root DNS problem

Posted on 2009-05-05
Last Modified: 2012-05-06
We're having a strange problem with DNS. On the user side, it appears as a temporary inability to get anywhere on the web, even to an internal website. IE and Firefox just hang, and a page refresh doesn't help; restarting the browser often does, as does waiting 5-10 minutes and trying again.

On the server side, I'm seeing event 4521 every 3 minutes, with the detail:
"The DNS server encountered error 9002 attempting to load zone . from Active Directory. The DNS server will attempt to load this zone again on the next timeout cycle. This can be caused by high Active Directory load and may be a transient condition."

This is a Windows 2003 Small Business Server, SP2. We run our own DNS server internally, with the server pointed at itself (via its own IP address, as recommended for 2003) and no secondary DNS server listed. The DNS server is configured with forwarders (we use OpenDNS to limit non-work activities).

I've already been to eventid.net and tried the various suggestions there. I'm unable to create a '.' zone; any attempt to do so produces a zone creation error. There is no '.' zone already in evidence. I've tried the sequence in KB articles M298148 and M323380 for removing the '.' zone, with no results. I've even followed the suggestion in KB M294328 on how to reinstall a dynamic DNS Active Directory zone, rebuilding our DNS server entirely, with no change.

I know there was another server in this domain at some point; it had Exchange on it, and when I took over I had to (carefully) remove evidence of it from Active Directory, because the prior sysadmin just ripped it out physically without a graceful demotion and removal. I'm guessing something similar happened to the DNS, since the problem reappeared as soon as I got the DNS service rebuilt.

Oh, and just for kicks, I tried configuring the DNS server without forwarders, just to check; no luck, same errors and sporadic failures on the user side.  I have one user who is pointed at another, external, DNS server; he has none of the sporadic failures.

Any suggestions gratefully received; I'm really tearing my hair out on this one.
Question by:qcsboise
 
LVL 5

Accepted Solution

by:
Member_2_4708244 earned 500 total points
ID: 24312437
Have you tried running dcdiag? It's part of the Support Tools for Server 2003, so you will need to download and install them from Microsoft (they're free).

Then run dcdiag from the command line "dcdiag /fix /v >>c:\dcdiag.txt"

Then review the txt file for any errors and fix them as needed.
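For a long verbose report, a small script can pull out just the failures. The sketch below is an illustration, not part of the original advice: it assumes the report uses dcdiag's usual English summary lines (a row of dots, the server or domain name, then "passed test" or "failed test" and the test name). The server name SERVER1 and the sample text are hypothetical.

```python
import re

# Assumed shape of a dcdiag summary line (English output), for example:
#   "......................... SERVER1 failed test KccEvent"
RESULT_LINE = re.compile(
    r"^\s*\.+\s+(?P<target>\S+)\s+(?P<result>passed|failed)\s+test\s+(?P<test>\S+)\s*$"
)


def failed_tests(report_text):
    """Return (target, test_name) pairs for every 'failed test' summary line."""
    failures = []
    for line in report_text.splitlines():
        match = RESULT_LINE.match(line)
        if match and match.group("result") == "failed":
            failures.append((match.group("target"), match.group("test")))
    return failures


def load_report(path):
    """Read a saved report such as c:\\dcdiag.txt, tolerating odd characters."""
    with open(path, "r", errors="replace") as handle:
        return handle.read()


# Hypothetical excerpt; a real report contains many more tests and details.
SAMPLE_REPORT = """
      Starting test: Connectivity
         ......................... SERVER1 passed test Connectivity
      Starting test: KccEvent
         ......................... SERVER1 failed test KccEvent
"""

for target, test in failed_tests(SAMPLE_REPORT):
    print(target, "failed", test)
```

To use it on a real report, pass the result of load_report(r"c:\dcdiag.txt") to failed_tests, then look up the detail lines for each failed test in the text file itself.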
 
LVL 38

Expert Comment

by:ChiefIT
ID: 24314120
The inability to reach the intranet and/or internet comes down to the client's inability to contact the server. It may be trying to find the old server that no longer exists. My first question would be: which preferred DNS servers are being passed down to the clients? This is done through DHCP.

DHCP passes down the preferred DNS servers to the clients. So, one of two things could be happening. First, you may have a rogue DHCP server (like a router or mass storage device) that is handing out a bad DNS server address to the clients. If a rogue DHCP server is handing out an outside server as the preferred DNS server, you may not get domain services internally, but you should still get external DNS to the internet. The second possibility is your own server acting as the DHCP server. Under the DHCP snap-in >> Scope Options, you may have listed as a preferred DNS server an old server that no longer exists. Your client then periodically tries to contact that server and can't find it. The client may time out on its DNS query, you may find that the client can't contact any domain server or other client, and you will periodically lose the internet.
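One way to check what a client actually received is to run "ipconfig /all" on it and compare the listed DNS servers against the server you expect. The sketch below is only an illustration of that comparison: it assumes English-language ipconfig output saved as text, and every address in it is hypothetical.

```python
import re

# Assumed English "ipconfig /all" layout: a "DNS Servers . . . :" line carries
# the first address, and any further servers follow on indented lines.
DNS_HEADER = re.compile(r"^\s*DNS Servers[ .]*:\s*(?P<addr>\S+)\s*$")
CONTINUATION = re.compile(r"^\s+(?P<addr>\d{1,3}(?:\.\d{1,3}){3})\s*$")


def dns_servers(ipconfig_text):
    """Collect every DNS server address listed in ipconfig /all output."""
    servers = []
    in_dns_block = False
    for line in ipconfig_text.splitlines():
        header = DNS_HEADER.match(line)
        if header:
            servers.append(header.group("addr"))
            in_dns_block = True
            continue
        continuation = CONTINUATION.match(line) if in_dns_block else None
        if continuation:
            servers.append(continuation.group("addr"))
        else:
            in_dns_block = False
    return servers


def unexpected_servers(servers, expected):
    """Return the listed servers that are not in the expected set."""
    return [server for server in servers if server not in expected]


# Hypothetical client output; 192.168.16.2 stands in for the SBS server.
SAMPLE_OUTPUT = """
        DHCP Server . . . . . . . . . . . : 192.168.16.2
        DNS Servers . . . . . . . . . . . : 192.168.16.2
                                            192.168.16.9
        Lease Obtained. . . . . . . . . . : (elided)
"""

found = dns_servers(SAMPLE_OUTPUT)
print("Listed DNS servers:", found)
print("Unexpected:", unexpected_servers(found, {"192.168.16.2"}))
```

Any address reported as unexpected is a candidate for either a stale scope option or a second device answering DHCP requests; the "DHCP Server" line in the same ipconfig output shows which device issued the lease.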

Other than that, you could check your DNS root. Under the DNS snap-in, do you see any folders greyed out?
 
LVL 38

Expert Comment

by:ChiefIT
ID: 24314148
By the way, if you have a rogue DHCP server, you will want to stop it from providing DHCP and let your server handle that task. Otherwise, the router or mass storage device that is providing DHCP will also provide DNS. The problem with that is that the rogue device will not hold the DNS service (SRV) records of the domain controller, so domain services go down.

Author Comment

by:qcsboise
ID: 24333063
Dinga, that was great advice.  The dcdiag highlighted another error of which I'd been unaware, an 1801 error from the Knowledge Consistency Checker.  Armed with the 4521 AND the 1801 errors, along with proposed solutions courtesy of EventID.net, I was able to resolve the issue and stop the events.  Not completely sure it's done yet; we'll test further with the office tomorrow, but things look great right now.
 
LVL 5

Expert Comment

by:Member_2_4708244
ID: 24334023
Both those tools are invaluable resources for troubleshooting.

Let me know if it's solved the browsing issue.
 

Author Comment

by:qcsboise
ID: 24424954
It didn't solve the browsing issue, but that appears to be related to server activity levels in addition to the DNS issues. It did completely solve the DNS errors, and for that I thank you!

-Matthew