William Larkin asked:

DNS Issues

Hi. I have a puzzling issue. I manage a network of 250+ computers, about 80% of them wireless. Recently, computers have been unable to connect to the internet, and the error reported is a DNS problem. I never seem to have trouble connecting to the wireless network, and I can reach other computers internally; only the internet is affected.

The strange thing is that when I reboot the servers and restart the switches (with only a few clients logged in), everything works without a problem. As more computers are powered on and log in, the DNS issues start showing up again, and before you know it everything is down. I have tried to pin this down to a particular switch or group of computers, but there is no rhyme or reason: it doesn't matter which group of computers, switch, or access point I try. Once enough computers join, we have DNS issues.

When the problem is occurring, nslookup gives me "timeout" messages and also doesn't show the name of the DNS server (which is our domain controller). When I restart the DNS Server service a few times, it seems to kick back in and work for a while until enough stations are back on, and then we're out again.

I'm looking for any suggestions that could help me identify the problem. I do have a second DC on the network with DNS installed, but I'm not sure it is set up correctly to act as a backup DNS server if the main one stops. Fixing that might help as well. Thanks for listening; any help is appreciated. - Bill
Kimputer

Usually without much configuration, the second DC should already be up and running. Just use nslookup on clients, pointing to this backup DNS, and see if it resolves correctly (both internal and external addresses). If so, add this as a second DNS server in your DHCP settings. Maybe now with less load on the main DNS server, the server fails less often.
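
If nslookup is not handy on a client, the same check can be scripted. The following is only a minimal sketch: it hand-builds a standard DNS A-record query and sends it over UDP to one specific server, reporting a timeout instead of hanging. The function names are made up for illustration, and any server addresses you pass in are your own; nothing here comes from the asker's setup.

```python
import socket
import struct

QUERY_ID = 0x1234  # arbitrary ID used to match the reply to our query


def build_query(hostname: str, query_id: int = QUERY_ID) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE = 1 (A record), QCLASS = 1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def check_dns_server(server_ip: str, hostname: str, timeout: float = 2.0) -> str:
    """Ask one specific DNS server for `hostname` and summarize what came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(hostname), (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return f"error: {exc}"
    if len(reply) < 12:
        return "malformed reply"
    reply_id, flags, _questions, answers = struct.unpack("!HHHH", reply[:8])
    if reply_id != QUERY_ID:
        return "reply ID does not match query"
    # The low 4 bits of the flags are the response code (0 means no error).
    return f"rcode={flags & 0x000F}, answers={answers}"
```

Calling `check_dns_server(<backup DC address>, <some internal name>)` and then again with an external name, against both DCs, gives a quick side-by-side of which server answers and which one times out.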
giltjr (SOLUTION)

This solution is only available to members of Experts Exchange.

William Larkin (ASKER)

Our DNS is working (internal and external), and I can check it with nslookup without fail. BUT when I start turning on our lab computers (it's at a school), DNS stops working for everyone. I've thought of viruses, etc., but the crazy thing is that it doesn't matter which lab (we have 3) is started: the whole network loses Internet connectivity. When I turn those lab computers off, the Internet and DNS work again. There are approximately 40 computers other than the lab computers that will continue to work properly without issues.

I know this is a lot of reading, but I'm stumped. I'd pay to have someone help me resolve this issue. It's kind of hard to describe everything here, including what I have already done to troubleshoot. Thanks.

Hi. Thanks for ALL the comments. Everyone was a help in this matter, and I used your suggestions to look into things and troubleshoot the issues.

It looks like the main cause of the problem was in fact persistent UDP sessions (a broadcast storm), which I had no clue about until I contacted our firewall vendor and the problem was discovered via a remote session. It seems that many UDP broadcasts are being transmitted and the firewall is dropping the "flood" of packets. When the threshold was raised, the internet started flowing again.

Also, as was mentioned in the comments above, I never lost connectivity. I just couldn't use a browser to get to a website because DNS wasn't resolving anything. I could always ping an external site, such as our ISP's DNS servers.

Lastly, there is a program that we use here at the school that seems to be the source of these broadcasts. So, on to my next challenge: stopping the storm. Thanks again! - Bill
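
To make the threshold effect concrete, here is a rough simulation. It is a sketch under loud assumptions: the real firewall's flood-protection logic is not described in this thread, so the sliding one-second window, the class name `FloodThreshold`, and all packet rates below are invented for illustration. The point is only that DNS queries share the UDP budget with the broadcasts, so a low limit starves DNS while non-UDP traffic such as ping is unaffected.

```python
from collections import deque


class FloodThreshold:
    """Drop packets once `limit` have been accepted within the last `window` seconds."""

    def __init__(self, limit: int, window: float = 1.0):
        self.limit = limit
        self.window = window
        self.arrivals = deque()  # timestamps of accepted packets

    def allow(self, now: float) -> bool:
        # Forget accepted packets that have aged out of the window.
        while self.arrivals and now - self.arrivals[0] >= self.window:
            self.arrivals.popleft()
        if len(self.arrivals) >= self.limit:
            return False  # over the threshold: packet is dropped
        self.arrivals.append(now)
        return True


def dns_queries_passed(limit: int, broadcasts: int = 500, dns_queries: int = 10) -> int:
    """Simulate one second of mixed UDP traffic; count DNS queries that get through."""
    events = [(i / broadcasts, "broadcast") for i in range(broadcasts)]
    events += [((j + 0.5) / dns_queries, "dns") for j in range(dns_queries)]
    gate = FloodThreshold(limit)
    passed = 0
    for when, kind in sorted(events):
        # Every packet is checked against the same UDP budget.
        if gate.allow(when) and kind == "dns":
            passed += 1
    return passed
```

Comparing `dns_queries_passed(100)` with `dns_queries_passed(1000)` shows the behaviour Bill describes: with the higher limit every DNS query gets through, while with the lower one most of them are dropped along with the broadcasts.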
One simple way (at least in theory it's simple) is to create a small routing-only subnet between your internal network and your firewall using another L3 device.

That way your firewall only sees traffic that is either going to or coming from the Internet.  It will never see traffic that stays on your internal network, which means it will never see the broadcasts.
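
The idea can be sketched with Python's `ipaddress` module. The subnets below are hypothetical placeholders (the thread never gives the real addressing), and the function only models the basic rule that a router does not forward broadcasts or traffic whose destination is on the local LAN.

```python
import ipaddress

# Hypothetical addressing, for illustration only.
LAN = ipaddress.ip_network("10.10.0.0/16")          # internal network
TRANSIT = ipaddress.ip_network("192.168.255.0/30")  # routing-only subnet
# A /30 has exactly two usable hosts: one for each end of the transit link.
ROUTER_UPLINK, FIREWALL_INSIDE = TRANSIT.hosts()
LIMITED_BROADCAST = ipaddress.ip_address("255.255.255.255")


def reaches_firewall(dst: str) -> bool:
    """Return True if a packet from a LAN host to `dst` is routed on to the firewall."""
    addr = ipaddress.ip_address(dst)
    # Broadcasts stay on the LAN segment; the L3 device does not forward them.
    if addr == LIMITED_BROADCAST or addr == LAN.broadcast_address:
        return False
    # Unicast traffic to another internal host never leaves the LAN.
    if addr in LAN:
        return False
    # Everything else is routed across the transit subnet to the firewall.
    return True
```

With this layout only the last case ever crosses the transit link, so the firewall's session table and flood counters would see Internet-bound traffic only.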
"Lastly, there is a program that we use here at the school that seems to be the source for these broadcasts"

If you give details, we may be able to help further.

For example, when it comes to DNS sessions (not your case, but it might give you a few hints), common ways to make things better include:
- use upstream resolvers (probably not applicable to your app)
- use a fixed source port (so you have at most one session per remote host:port)
- do not use stateful firewalling for this specific traffic (make it old school: one rule for the outgoing packets and another for the answers)
- change the session duration (unfortunately this usually applies to all UDP traffic rather than rule by rule)

Dumber, dirty, hacky workarounds include:
- send a port unreachable from the DNS server when you receive the answer (easy in DNS because the answer is always the last packet of the session) AND accept the packet nevertheless: DNS works, and the port unreachable instructs the firewall to kill the UDP session. Most firewalls will recycle the port at once or after a much shorter grace time.
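
As an illustration of the fixed-source-port point, here is a small loopback demo. It is only a sketch: the function name is made up, and the "fixed" socket lets the OS pick its port once (a real client would bind to a specific port number of your choosing). It shows that one bound socket presents a single source port, whereas a fresh socket per request presents a new port each time, which a stateful firewall would track as separate sessions.

```python
import socket


def source_ports_seen(count: int, reuse_socket: bool) -> set:
    """Send `count` datagrams to a loopback listener; return the source ports it saw."""
    opened = []
    seen = set()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.settimeout(2.0)
        target = listener.getsockname()
        shared = None
        if reuse_socket:
            # One socket bound once: every datagram leaves from the same port.
            shared = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            shared.bind(("127.0.0.1", 0))
            opened.append(shared)
        try:
            for _ in range(count):
                if shared is not None:
                    sender = shared
                else:
                    # Fresh socket per request: the OS assigns a new ephemeral port.
                    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
                    opened.append(sender)
                sender.sendto(b"query", target)
                _, (_, port) = listener.recvfrom(64)
                seen.add(port)
        finally:
            for sock in opened:
                sock.close()
    return seen
```

Comparing the size of `source_ports_seen(5, reuse_socket=True)` with `source_ports_seen(5, reuse_socket=False)` shows the difference in how many distinct host:port pairs the far side (or a firewall in between) has to track.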

note that if the sessions are actually internal traffic, @giltjr gave you a much better way to work things out.

If the LAN segments are mostly open to each other, setting inter-LAN traffic to be accepted by default will (if supported by your firewall) most likely not create sessions for any traffic that you don't allow explicitly. You can always create a final rule that blocks everything except UDP on that specific port, back and forth. If you're not a security freak, this is also workable LAN-to-WAN in some cases.
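
The "one rule out, one rule back, then block the rest" idea can be sketched as a stateless first-match rule list. This is not any particular firewall's syntax: the `Rule` and `Packet` types are invented for illustration, and `APP_PORT` is a hypothetical stand-in for whatever port the school's program actually uses.

```python
from typing import NamedTuple, Optional

APP_PORT = 5000  # hypothetical port of the broadcasting application


class Packet(NamedTuple):
    proto: str
    src_port: int
    dst_port: int


class Rule(NamedTuple):
    action: str
    proto: Optional[str] = None     # None means "match anything"
    src_port: Optional[int] = None
    dst_port: Optional[int] = None

    def matches(self, pkt: Packet) -> bool:
        checks = (
            (self.proto, pkt.proto),
            (self.src_port, pkt.src_port),
            (self.dst_port, pkt.dst_port),
        )
        return all(want is None or want == got for want, got in checks)


RULES = [
    Rule("accept", proto="udp", dst_port=APP_PORT),  # outgoing packets
    Rule("accept", proto="udp", src_port=APP_PORT),  # the answers coming back
    Rule("drop"),                                    # final rule: block everything else
]


def decide(pkt: Packet, rules=RULES) -> str:
    """Return the action of the first matching rule; no session state is kept."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action
    return "drop"
```

Because each packet is judged on its own headers, nothing is added to a session table, which is the whole point of going stateless for this traffic. The trade-off is that the return rule accepts any UDP packet with that source port, solicited or not.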