

DNS Issues

Posted on 2014-02-26
Medium Priority
Last Modified: 2014-02-28
Hi. I have a puzzling issue. I manage a network of 250+ computers, about 80% of which are wireless. Recently, computers have been unable to connect to the Internet, reporting that DNS is the issue. I never seem to have a problem connecting to the wireless network, and I can connect to other computers internally, but the Internet is the problem.

The strange thing is that when I reboot the servers and restart the switches (with limited clients logging in), everything seems to operate without a problem. When more computers are powered on and log in, the DNS issues start showing up again, and before you know it, everything is down. I have tried to nail this down to a particular switch or group of computers, but there is no rhyme or reason: it doesn't matter which group of computers, switch, or access point I try. After enough computers join, we have DNS issues.

I have used nslookup, and when the issue is occurring I get "timeout" messages and it doesn't show the name of the DNS server (which is our domain controller). When I restart the DNS Server service a few times, it seems to kick back in and work for a while until enough stations are back on, and then we're out again.

I guess I'm looking for any suggestions that could help me identify the problem. I do have a second DC on the network with DNS installed, but I'm not sure whether it is correctly set up to act as a "backup" DNS server if the main DNS server is stopped. This could possibly help as well. Thanks for listening; any help is appreciated. - Bill
Question by:William Larkin
LVL 36

Expert Comment

ID: 39890475
Usually, without much configuration, DNS on the second DC should already be up and running. Use nslookup on a client, pointing it at this backup DNS server, and see whether it resolves both internal and external names correctly. If so, add it as a second DNS server in your DHCP settings. With less load on the main DNS server, it may fail less often.
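If nslookup is not handy, the same check can be scripted. The following is a minimal Python sketch, not a full resolver: it builds a single A-record query by hand and sends it straight to one named server, so the client's configured DNS is bypassed. The function names `build_query` and `server_answers` are made up for illustration, and the server address and hostname you pass in are your own.

```python
import socket
import struct

QUERY_ID = 0x1234  # arbitrary ID, used to match the reply to our query


def build_query(hostname, query_id=QUERY_ID):
    """Build a minimal DNS query packet asking for an A record."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def server_answers(server_ip, hostname, timeout=2.0):
    """Return True if server_ip replies to the query with RCODE 0 (no error)."""
    query = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable servers
            return False
    if len(reply) < 12:
        return False
    reply_id, flags = struct.unpack("!HH", reply[:4])
    return reply_id == QUERY_ID and (flags & 0x000F) == 0
```

Calling `server_answers` once with an internal name and once with an external name, against the second DC's address, mirrors the two nslookup checks described above.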
LVL 57

Assisted Solution

giltjr earned 1268 total points
ID: 39890779
Are the host names you can't resolve internal or external to your network?

If external, it sounds like your DNS servers are having problems connecting to the Internet to resolve names.

On your DNS servers, do you configure forwarders or do you rely on the root hints?

Either way, I would double-check your Internet connection.

Author Comment

by:William Larkin
ID: 39891417
Our DNS is working (internal and external) and I can use nslookup to check without fail. BUT when I start turning on our lab computers (it's a school), DNS stops working for everyone. I've thought of viruses, etc., but the crazy thing is that it doesn't matter which lab (we have 3) is started: the whole network loses Internet connectivity. When I turn those lab computers off, the Internet and DNS work again. There are approximately 40 computers other than the lab computers that continue to work properly without issues.

I know this is a lot of reading, but I'm stumped. I'd pay to have someone help me resolve this issue. It's kinda hard to describe everything here, including what I have already done to troubleshoot. Thanks.

LVL 57

Accepted Solution

giltjr earned 1268 total points
ID: 39891614
Question. You state you lose both DNS and Internet. Do you really lose Internet? Meaning, if you know the IP address of a web site on the Internet, say one of Google's, and you enter just the IP address, can you get to Google?

If you can NOT get to Google just by IP address, then you can ignore the DNS issue for now. The DNS issue is because you lose Internet.

If you can get to Google, then the problem is DNS and may not have anything to do with Internet connectivity.
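The by-IP versus by-name test can be sketched in Python as follows. This is only an illustration of the decision logic: the helper names are made up, the external IP address is something you supply yourself, and `can_resolve` goes through the operating system's resolver, so a locally cached answer can make DNS look healthier than it is.

```python
import socket


def can_reach_ip(ip, port=80, timeout=3.0):
    """Try a TCP connection to a known external IP address; no DNS involved."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def can_resolve(name):
    """Ask the operating system's resolver to look up a host name."""
    try:
        socket.gethostbyname(name)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def diagnose(ip_reachable, name_resolves):
    """Apply the reasoning above to the two test results."""
    if not ip_reachable:
        return "no Internet connectivity; DNS failures are a side effect"
    if not name_resolves:
        return "connectivity is fine; the problem is DNS"
    return "both connectivity and DNS are working"
```

In use, you would pass the results of `can_reach_ip` (with an external address you know) and `can_resolve` (with any external host name) to `diagnose` while the outage is happening.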

Is it possible that one of the LAB computers has a duplicate IP address?

I would suggest that you turn on the LAB computers one by one and check for DNS/Internet connectivity after each one is turned on.   My guess is that it is one specific one that is causing the problem.

You may need to run a packet capture (I suggest Wireshark for this) to look for "things."

"Things" would include whether the MAC address for the DNS server or your default router/gateway changed after one of the LAB computers was turned on.
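One way to spot such a change without reading a full capture is to compare the ARP table before and after powering on a lab machine. The Python sketch below assumes you have already collected two IP-to-MAC snapshots as dictionaries (for example by copying them from `arp -a` output); how the snapshots are gathered is left out, and the function name and addresses are hypothetical.

```python
def changed_macs(before, after, watched_ips):
    """Return {ip: (old_mac, new_mac)} for watched IPs whose MAC changed.

    before/after map IP address strings to MAC address strings.
    MACs are compared case-insensitively, treating '-' and ':' alike.
    """
    def normalize(mac):
        return mac.lower().replace("-", ":")

    changes = {}
    for ip in watched_ips:
        old = before.get(ip)
        new = after.get(ip)
        # Only report when both snapshots have an entry and they differ
        if old and new and normalize(old) != normalize(new):
            changes[ip] = (old, new)
    return changes
```

A changed MAC for the DNS server's or gateway's IP right after a particular lab computer boots would point at that computer as a likely duplicate-address culprit.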
LVL 18

Assisted Solution

Akinsd earned 368 total points
ID: 39891679
I think you are creating a DoS issue all by yourself. Denial of Service is what attackers use to shut down or impair a service. In this case it would work by sending more DNS queries than the DNS server can handle, so the server gets bogged down and freezes.

You seem to have identified part of the problem being your lab computers.

By the way, the backup DNS does nothing until the primary DNS fails or shuts down (i.e., does not respond to queries). To use both at the same time, you need to make one the primary for some PCs and the other the primary for the rest.

In this case, I would make the backup DNS the primary for the lab PCs.

Also, check the performance monitor on your DNS server; you may need to increase its memory in addition to other optimizations (cleanup, defragmentation, etc.).
LVL 27

Assisted Solution

skullnobrains earned 364 total points
ID: 39892863
if the above DoS theory is correct, you may want to check your firewall. firewalls are very often configured to handle dns with so-called persistent udp sessions. these sessions know nothing about the protocol and last until their timeout (10-60 seconds is common), meaning each dns query blocks a port for that long.
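To get a feel for how quickly this adds up, here is a rough back-of-the-envelope estimate in Python. It assumes every query opens its own session (a fresh source port each time) and that the session lives for the full timeout; the client count, query rate, and timeout in the example are illustrative numbers, not measurements from this network.

```python
def estimated_sessions(clients, queries_per_client_per_sec, session_timeout_sec):
    """Steady-state number of open UDP sessions on the firewall.

    Each query holds one session for the whole timeout, so the number of
    concurrent sessions is roughly (arrival rate) x (session lifetime).
    """
    return clients * queries_per_client_per_sec * session_timeout_sec


# Illustrative only: 250 clients, 2 queries per second each, 30-second timeout
print(estimated_sessions(250, 2, 30))
```

If the result is anywhere near the firewall's session or flood threshold, a lab full of machines powering on at once could plausibly tip it over.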

Author Closing Comment

by:William Larkin
ID: 39895669
Hi. Thanks for ALL the comments. Everyone was a help in this matter, and I used your suggestions to look into things and troubleshoot the issues. It looks like the main cause of the problem was in fact persistent UDP sessions (broadcast storm), which I had no clue about until I contacted our firewall vendor, who discovered the problem via a remote session. It seems that many UDP broadcasts are being transmitted and the firewall is dropping the "flood" of packets. When the threshold was raised, the Internet started flowing again. Also, as was mentioned in the comments above, I never lost connectivity. I just couldn't use a browser to get to a website because DNS wasn't resolving anything. I could always ping an external site, such as our ISP's DNS servers. Lastly, there is a program that we use here at the school that seems to be the source of these broadcasts, so on to my next challenge: stopping the storm. Thanks again! - Bill
LVL 57

Expert Comment

ID: 39895806
One simple way (at least in theory it's simple) is to create a small routing-only subnet between your internal network and your firewall using another L3 device.

That way your firewall only sees traffic that is either going to or coming from the Internet.  It will never see traffic that stays on your internal network, which means it will never see the broadcasts.
LVL 27

Expert Comment

ID: 39896213
Lastly, there is a program that we use here at the school that seems to be the source for these broadcasts

if you give details, we may be able to help further

for example, when it comes to dns sessions (not your case, but it might give you a few hints), common ways to make things better include
- use upstream resolvers (probably not applicable to your app)
- use a fixed source port (at least you have one session per remote host:port tops)
- do not use stateful firewalling for this specific traffic (make it old school: one rule for the outgoing packets and another for the answer)
- change the session duration (unfortunately this usually applies to all udp traffic and not rule by rule)
dumber, dirty, hacky workarounds include
- send a port unreachable from the dns server when you receive the answer (easy in dns because the answer is always the last packet of the session)... AND accept the packet nevertheless: dns works and the port unreachable instructs the firewall to kill the udp session. most firewalls will recycle the port at once or after a much shorter grace time.

note that if the sessions are actually internal traffic, @giltjr gave you a much better way to work things out.

if the lan segments are mostly open to each other, setting the inter-lan traffic to be accepted by default will (if supported by your firewall) most likely not create sessions for any traffic that you don't allow explicitly... and you can always create a final rule that blocks everything except udp on that specific port back and forth. if you're not a security freak, this is also workable lan-wan in some cases


Question has a verified solution.
