Solved

DNS / DHCP issues

Posted on 2012-03-29
Last Modified: 2012-03-30
We have about 60 devices on our network.  About 2 weeks ago we started having intermittent issues with about 10 devices - they would lose internet connections.  Originally we thought it was an issue with a switch.  But after moving the devices to another switch the problems continued.  These devices are in 2 different buildings.  We have isolated the problem to dns/dhcp - not sure which is really causing the issue.  I don't know what is happening or how to fix it but when the issue occurs I'm finding an error message for dhcp that says IP lease for XX has been denied by the DHCP server 192.168.1.1.  Right off the bat - the server IP is wrong.  I don't know why it is suddenly using 1.1.  In addition if you run nslookup it also can't find the correct server - it says non-existent domain.

Some of the pc's having an issue get their IP address from dhcp, some have a reservation and some have a static IP.  It doesn't seem to matter which - the same devices go down no matter how their IP is given out.  Again - it is always the same 10 or so devices - everyone else is fine.  And it is happening in 2 different buildings.

Any ideas?
Question by:cindyfiller
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 37783588
What kind of network devices are you using in the buildings with trouble?

The 192.168.1.1 sounds like a default IP for a commodity (read: home use) network NAT/firewall type device.  These typically run DHCP servers by default.  You may have a rogue DHCP server on your network, which would result in all the problems you're describing.

Perhaps a brief description of your network infrastructure would help...
List the sites
List the IP-aware networking hardware at each site (switches, routers, wireless access points, firewalls)
Indicate the IP subnet of each site
Indicate the preferred DHCP server
Indicate the preferred DNS servers

This might help me help you isolate the problem.
 

Author Comment

by:cindyfiller
ID: 37783621
I was thinking the same thing (that something else is trying to do dhcp) but not sure how to track it down.  Our 2 buildings are connected via fiber.  We are all on the same subnet.  We are behind a Sonicwall firewall.  We have 6 switches in this building and 3 in the other.  We have one server that does dhcp and two that do dns.  (We originally had 2 doing dhcp but when we installed vmware our vendor recommended just have one server do the dhcp.)  The primary server doing dhcp and dns is a new install (nov) of windows server 2008 and is used just for active directory, dhcp and dns.  

Hopefully this answers all of your questions.
 
LVL 6

Accepted Solution

by:
netjgrnaut earned 500 total points
ID: 37783714
Has anyone added wireless to the network lately?  That's what I'd be looking for...  Do you have wireless at all?

Clients in both buildings are being impacted, then?  You mentioned two buildings in your first post, and say you only have two buildings in your last post... so I'm making an assumption here.

What do the 10 (or so) devices with intermittent problems have in common?

Normally, a Windows DHCP server will detect a rogue DHCP server on the wire and log errors to that effect.

Let's look at how you resolve the issues on a case-by-case basis.  When a client is in an error state, you should run "ipconfig /all" (in a cmd window) and take note of the IP address of the DHCP server and the DNS server, as well as the IP address assigned to the host itself.

Then, you can "arp -a" and note the MAC addresses of the DHCP and DNS server IP(s).

If your network switches are managed, you can then start looking for these MAC addresses in the switch tables.  This will lead you to the port on which the rogue is connected.
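For anyone wanting to script the "arp -a" step, here is a minimal sketch that turns saved Windows "arp -a" output into an IP-to-MAC lookup, so the MAC of a suspect server address can be searched for in the switch tables. The sample text and the addresses in it (other than the .1 firewall) are made up for illustration, and the parser assumes the usual three-column Windows layout.

```python
import re

# One entry line of Windows `arp -a` output: IP address, MAC address, entry type.
ARP_LINE = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"([0-9A-Fa-f]{2}(?:-[0-9A-Fa-f]{2}){5})\s+(\w+)\s*$"
)


def parse_arp_table(text):
    """Return {ip: mac}, with MACs normalised to lowercase colon notation."""
    table = {}
    for line in text.splitlines():
        match = ARP_LINE.match(line)
        if match:
            ip, mac, _entry_type = match.groups()
            table[ip] = mac.lower().replace("-", ":")
    return table


# Hypothetical capture; the MACs and the .10 server address are invented.
SAMPLE = """
Interface: 192.168.68.57 --- 0x2
  Internet Address      Physical Address      Type
  192.168.68.1          00-11-22-33-44-01     dynamic
  192.168.68.10         00-11-22-33-44-0A     dynamic
"""

if __name__ == "__main__":
    for ip, mac in parse_arp_table(SAMPLE).items():
        print(ip, mac)
```

The header and "Interface:" lines are skipped because they do not match the IP/MAC/type pattern. The normalised MAC is the value to look for in the managed switch's address table.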

If you post more specific info from a client in error, I can perhaps help more.  Post the output of ipconfig /all when it's failing nslookup etc.

Knowing your working IP subnet would help.  Obviously it's not 192.168.1.0/24...?  Or is it...?

Specific IP addresses of your DHCP and DNS servers, as well as the LAN port of the SonicWALL (which I assume is the default gateway handed out by DHCP) would also help in troubleshooting.
 

Author Comment

by:cindyfiller
ID: 37783864
We are all on 192.168.68 and provide ip's in the range from 1 - 256.  1 is the firewall.  All of the pc's that are having issues are windows xp machines.  Most of our devices are xp - but it is odd that only the xp machines are having the issue.  I have been doing ipconfigs and thought those looked ok but in hindsight I was checking the IP and the computer name so I may not have even looked at that.  I could still ping the pc's having the issue.  And I have been checking the mac addresses when I did the ipconfig and have been doing nbtstat -A to confirm device names assigned to that IP address.

We have 3 wireless routers - inexpensive Cisco ones.  They've been in place for about a year.  However... now that you asked about this it brought back something that happened right before all of these issues started.  The one router was moved from one location to another in this building temporarily so another conference room could have wireless access.  It was short term use so we just unplugged the printer from the jack and plugged the router in that jack.  (I gave those routers a reserved IP because they were originally grabbing an IP that was already in use.)  

Unknown to me someone wanted to use the printer so they plugged the printer into the router.  I just found out about that 10 minutes ago.  I had the student move that router about 10 days ago back to its original location.  When he plugged the printer back in, the printer itself had the IP of 192.168.1.1...  we went back and entered the correct info for the printer and it has been working fine since.  That router is currently plugged in and I would love to pull it but a consultant is here right now using that wireless connection...  I hope to pull it in about 30 minutes when he takes a break.  

We are in 2 buildings only and the issue is in both buildings... more so in the other building than this one which is strange since the wireless device was here.

I can send you some of the dhcp logs - was looking at those, too, if you like but thought the wireless may be the issue.
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 37784012
A couple quick notes - assuming your subnet mask is 255.255.255.0, 192.168.68.256 is not a valid address.  Your DHCP scope should be handing out 1-254 *at most* - and if so, should have .1 excluded, as well as all your static server IP addresses.  I'm sure that's what you meant... just wanted to be sure.  ;-)
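The address arithmetic can be checked with Python's standard ipaddress module. This is only a sketch under the same assumption as above (a 255.255.255.0 mask on 192.168.68.0); the helper name is made up.

```python
import ipaddress

# Assumes the /24 mask discussed above.
NETWORK = ipaddress.ip_network("192.168.68.0/24")

# hosts() leaves out the network (.0) and broadcast (.255) addresses.
HOSTS = list(NETWORK.hosts())


def is_usable_host(addr):
    """True if addr is a well-formed IPv4 address that a host on NETWORK can use."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False  # e.g. 192.168.68.256: an octet cannot exceed 255
    return (
        ip in NETWORK
        and ip != NETWORK.network_address
        and ip != NETWORK.broadcast_address
    )


if __name__ == "__main__":
    print(HOSTS[0], HOSTS[-1], len(HOSTS))
    print(is_usable_host("192.168.68.256"))
```

The first and last usable hosts come out as .1 and .254, which is why a scope on this subnet can hand out 254 addresses at most before any exclusions.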

Do all of the XP machines that experience this problem have wireless adapters?  That's what I'd be looking for (I think).

I would review all of your access points to ensure that DHCP server is *not* enabled on any of them.  If one went through a factory default reset (intentional or otherwise), it may have enabled the internal DHCP server.  What model of Cisco wireless routers?  And (out of curiosity) why routers?  Why not just access points?  Typically routers will have a different subnet for the wireless clients than the IP subnet of the wired interface.  It's possible to circumvent this (I've done it plenty of times, on the Cisco Valets) - but it's prone to error.

Let's back up... (though I think we're on to something with the wireless/rogue access point).  What's the actual failure at the client level?  Please indicate which of the following statements are *false*...

XP client loses Internet connection.  (We know this because the next link clicked returns a "page not found" error.)
At the XP client, running IP config shows an IP address of 192.168.68.x
At the XP client, running IP config shows the correct DHCP server address
At the XP client, running IP config shows the correct DNS server address(es)
At the XP client, running IP config shows the correct default gateway address
At the client, I can ping an internal server by IP address
At the server (or another client), I can ping the 192.168.68.x address of the affected client
At the client, I can ping the server by name *without* including the domain
At the client, I can ping the server by name *including* the domain (FQDN)
At the client, I can ping the Internet by IP (try 8.8.8.8)
At the client, I can ping the Internet by name (FQDN - try ns1.google.com)
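One way to keep track of this checklist per client is to record each statement as true, false, or not yet checked, then list what failed and what is still unknown. The sketch below does only that bookkeeping; the short keys are made-up labels, and it does not run ipconfig or ping itself.

```python
# The checklist statements in the order given above; keys are invented labels.
CHECKS = [
    ("local_ip", "client has a 192.168.68.x address"),
    ("dhcp_server", "ipconfig shows the correct DHCP server"),
    ("dns_servers", "ipconfig shows the correct DNS server(s)"),
    ("gateway", "ipconfig shows the correct default gateway"),
    ("ping_server_ip", "client can ping an internal server by IP"),
    ("ping_client_ip", "server can ping the client by IP"),
    ("ping_server_short", "client can ping the server by short name"),
    ("ping_server_fqdn", "client can ping the server by FQDN"),
    ("ping_inet_ip", "client can ping the Internet by IP"),
    ("ping_inet_name", "client can ping the Internet by name"),
]


def summarize(results):
    """Split the checklist into failed and not-yet-checked statements.

    `results` maps a check key to True, False, or None (unknown).
    Keys missing from `results` count as unknown.
    """
    failed = [text for key, text in CHECKS if results.get(key) is False]
    unknown = [text for key, text in CHECKS if results.get(key) is None]
    return failed, unknown


if __name__ == "__main__":
    # Hypothetical answers for one client.
    example = {key: True for key, _ in CHECKS}
    example["ping_inet_ip"] = False
    example["dns_servers"] = None
    failed, unknown = summarize(example)
    print("FAILED:", failed)
    print("UNKNOWN:", unknown)
```

Keeping the unknowns visible matters here, because an unchecked DHCP or DNS server entry can hide the very thing being hunted.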

The DHCP events point to something, but they may not have anything to do with your client problem.  It might just be the printer...
 
LVL 11

Expert Comment

by:sighar
ID: 37784052
Run DHCPloc.exe which tells you right away if you've got a rogue DHCP server on your hands - it really sounds like it.

You'll find the utility on the Windows Server 2003 CD in the Support Tools folder.

See here for ways to use it: http://technet.microsoft.com/en-us/library/cc778483(v=ws.10).aspx
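For background on what a rogue-DHCP check involves at the protocol level: a client broadcasts a DHCPDISCOVER, and every DHCP server on the segment that answers is a candidate. The sketch below is not DHCPloc and makes no claim about how that tool is implemented. It only builds a minimal DHCPDISCOVER payload following the RFC 2131 layout and compares a set of responding servers against an allow-list. The MAC, the 192.168.68.10 server address, and the function names are assumptions for illustration; actually sending the packet (UDP port 68 to 67, usually needing admin rights) is left out.

```python
import os
import struct

MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the DHCP options field


def build_discover(mac, xid=None):
    """Build a minimal DHCPDISCOVER payload for the given client MAC."""
    if xid is None:
        xid = int.from_bytes(os.urandom(4), "big")
    chaddr = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,          # op: BOOTREQUEST
        1,          # htype: Ethernet
        6,          # hlen: length of an Ethernet MAC
        0,          # hops
        xid,        # transaction id, echoed back in replies
        0,          # secs
        0x8000,     # flags: ask servers to reply by broadcast
        b"\0" * 4,  # ciaddr
        b"\0" * 4,  # yiaddr
        b"\0" * 4,  # siaddr
        b"\0" * 4,  # giaddr
        chaddr,     # struct pads this to 16 bytes with zeros
        b"",        # sname (64 zero bytes after padding)
        b"",        # file (128 zero bytes after padding)
    )
    # Option 53 (message type), length 1, value 1 = DHCPDISCOVER; then end option.
    options = b"\x35\x01\x01" + b"\xff"
    return header + MAGIC_COOKIE + options


def unauthorized_servers(seen, authorized):
    """Return the responding server IPs that are not on the allow-list."""
    return sorted(set(seen) - set(authorized))


if __name__ == "__main__":
    packet = build_discover("00:11:22:33:44:55")
    print(len(packet), "byte DHCPDISCOVER")
    # 192.168.68.10 stands in for the real DHCP server, whose IP was not posted.
    print(unauthorized_servers({"192.168.68.10", "192.168.1.1"}, {"192.168.68.10"}))
```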
 

Author Comment

by:cindyfiller
ID: 37784128
Wanted to provide some quick answers and will do more in a bit.  The router we bought is a linksys E1000... it was what a vendor recommended to provide us with a wireless access point.  Don't know why that vs something else!

XP client loses Internet connection.  (We know this because the next link clicked returns a "page not found" error.)  I can answer some of the questions below - but not all...  

At the XP client, running IP config shows an IP address of 192.168.68.x  YES
At the XP client, running IP config shows the correct DHCP server address  ??
At the XP client, running IP config shows the correct DNS server address(es)  ??
At the XP client, running IP config shows the correct default gateway address YES
At the client, I can ping an internal server by IP address  YES
At the server (or another client), I can ping the 192.168.68.x address of the affected client  YES
At the client, I can ping the server by name *without* including the domain  YES
At the client, I can ping the server by name *including* the domain (FQDN)  YES
At the client, I can ping the Internet by IP (try 8.8.8.8)  NO
At the client, I can ping the Internet by name (FQDN - try ns1.google.com)  NO

We have found that the internet goes down and for some of the people outlook continues to work.  We are on exchange 2010 and all of the pc's have office 2007.  Some of the people can still access some other apps we have internally - but others cannot.  In thinking about it those that had a static ip or reserved ip were more likely to successfully access our other apps than those that were getting the ip from dhcp.  

More pieces are falling in place!
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 37784177
A-ha! (Perhaps.)

Are you saying that Internet connectivity is interrupted for both static and DHCP assigned clients?

That would rule out DHCP as the source of the problem...

What model of Sonicwall do you have?
 

Author Comment

by:cindyfiller
ID: 37784218
Yes - some of the computers got their ip from dhcp - some had reservations.  I went back and added static ip, dhcp,dns to those having an issue with the hopes it would help - but they are still going down.  At least some of them are...

We have an older sonicwall - 3060.
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 37784259
Now we're getting somewhere!

Reservations are still DHCP-assigned addresses.  Static means configured by hand directly on the client. Sounds like you have some of both.

Have you added exclusions to your DHCP server for all the static IPs? If not, the DHCP server may try to allocate those same addresses to dynamic clients - causing duplicate IPs to appear on the network.
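A quick way to audit this is to compare the hand-configured addresses against the scope range and the exclusions already entered. The sketch below assumes you can list those three things yourself; the function name and every address in the example are hypothetical, and exclusions are treated as single addresses for simplicity even though a DHCP server may store them as ranges.

```python
import ipaddress


def statics_at_risk(pool_start, pool_end, exclusions, static_ips):
    """Return static IPs that sit inside the DHCP pool without an exclusion.

    Any address returned could also be leased to a dynamic client,
    which would put a duplicate IP on the network.
    """
    start = ipaddress.ip_address(pool_start)
    end = ipaddress.ip_address(pool_end)
    excluded = {ipaddress.ip_address(e) for e in exclusions}
    risky = []
    for addr in static_ips:
        ip = ipaddress.ip_address(addr)
        if start <= ip <= end and ip not in excluded:
            risky.append(str(ip))
    return risky


if __name__ == "__main__":
    # Hypothetical scope: .1 (firewall) and .10 (a server) are already excluded,
    # but two PCs were given static addresses by hand.
    print(
        statics_at_risk(
            "192.168.68.1",
            "192.168.68.254",
            exclusions=["192.168.68.1", "192.168.68.10"],
            static_ips=["192.168.68.10", "192.168.68.57", "192.168.68.120"],
        )
    )
```

Each address the function reports needs either an exclusion on the DHCP server or a move back to DHCP on the client.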

I have a suspicion about the SonicWALL - but I need to do a little research first...

 

Author Comment

by:cindyfiller
ID: 37784337
I did not add exclusions for the static ip's on the pc's.  I have done that for all of the servers and the switches - but not pc's.  

We did just get back into the setup for that wireless router and it was set to do dhcp.  Don't know when/how it was changed... but I do believe that was the issue.  It'll be interesting to see over the next day or two if that was the problem.  I'm going to keep it disconnected for now.  I wish I could access them via the ip address - can't.  I'll do some reading to see if it can be done.

We will be moving into a new building in about 6 months.  We'll be replacing the sonicwall with another firewall at that time.
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 37784350
I was wrong about the SonicWALL - there was a model that limited the maximum number of concurrent outbound sessions, much like the Cisco ASA 5505 does.  But yours isn't it.

You could have a combination of issues.  I would either add exclusions for the PCs with static IPs, or set them back to DHCP, to ensure that you don't have duplicate IPs as one aspect of your issue.

Keeping the rogue DHCP server off the network is a Good Thing, too.  But from what you describe, that can't be the only thing that's going on...

Hopefully I've given you enough things to look out for, as well as some additional info to capture when/if you have this problem again (having made the two changes that we've discussed).

Good luck!
 

Author Comment

by:cindyfiller
ID: 37784378
Appreciate your help and the speed in replying.  I'll award the points tomorrow... want to just track the situation for a day to see if things are stable - in case I have to ask some more questions!
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 37784382
No worries!  I've got a slow couple of days at work this week.  Plenty of time to give back...
 

Author Comment

by:cindyfiller
ID: 37784411
I'm going to toss out an unrelated question.  When I was looking at the dhcp logs I noticed that the ipads and iphones were grabbing IP's.  I was surprised - especially the iphones.  Didn't realize they would connect to our network... guess it shouldn't surprise me.
 

Author Comment

by:cindyfiller
ID: 37784458
BTW, I do have the ipconfig of someone who was having issues earlier today.  The ip address, subnet, default gateway, wins and dhcp server were all correct.   The DNS server was incorrect - it had 192.168.1.1.   Now I'm wondering if we have resolved the issue!
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 37784481
Sure. There's wifi built into both. I take it your end users know your WPA PSK (or... gasp... WEP key)? If you broadcast your SSID, the default setting of the iStuff is to prompt the user to join the network.

Another possible source of IP duplication. The more devices looking for DHCP allocation, the more quickly the subnet in your scope will recycle (hand out previously leased addresses once the lease timer is up).

Your wireless network name isn't LINKSYS by any chance? That could explain a lot about the "declined ... 192.168.1.1" entries in your DHCP log...
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 37784510
If the DHCP server was correct, and the DNS was assigned that way, you need to review your DHCP scope and server settings.

You should check that client to see if there's a static DNS server in addition to the "get address from DHCP" setting.

The incorrect DNS server must be coming from one of these two places. Unless your rogue DHCP server is misconfigured *and* shares the same IP as your "real" server.  That would cause other, site wide issues (assuming that your DHCP server provides services other than DHCP).
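To spot this kind of mismatch quickly across several clients, saved "ipconfig /all" output can be parsed and the DHCP and DNS server entries compared with the values you expect. This is a rough sketch that assumes the single-adapter, XP-style "Label . . . : value" layout; the sample text is invented, and apart from 192.168.1.1 and the .1 gateway the addresses in it are placeholders.

```python
import ipaddress


def _is_ip(text):
    """True if text is a well-formed IP address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def parse_ipconfig(text):
    """Return {label: [values]} from `ipconfig /all` output for one adapter."""
    fields = {}
    current = None
    for line in text.splitlines():
        left, sep, right = line.partition(" : ")
        if sep:
            # "DNS Servers . . . . ." -> "DNS Servers"
            current = left.strip(" .")
            fields.setdefault(current, [])
            if right.strip():
                fields[current].append(right.strip())
        elif current and _is_ip(line.strip()):
            # Continuation line, e.g. a second DNS server on its own line.
            fields[current].append(line.strip())
        else:
            current = None
    return fields


def unexpected(fields, label, expected):
    """Return values under `label` that are not in the expected set."""
    return [value for value in fields.get(label, []) if value not in expected]


# Hypothetical capture from an affected client.
SAMPLE = """
Ethernet adapter Local Area Connection:

        Dhcp Enabled. . . . . . . . . . . : Yes
        IP Address. . . . . . . . . . . . : 192.168.68.57
        Subnet Mask . . . . . . . . . . . : 255.255.255.0
        Default Gateway . . . . . . . . . : 192.168.68.1
        DHCP Server . . . . . . . . . . . : 192.168.68.10
        DNS Servers . . . . . . . . . . . : 192.168.1.1
                                            192.168.68.11
"""

if __name__ == "__main__":
    parsed = parse_ipconfig(SAMPLE)
    print(unexpected(parsed, "DNS Servers", {"192.168.68.10", "192.168.68.11"}))
```

If the DHCP server entry is as expected but an unexpected DNS server shows up, that points back at the two places named above: the scope/server options, or a static DNS entry on the client.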
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 37784516
A thought... When confronted with this situation, ping all the addresses shown in the ipconfig output - then do an arp -a. Check this against your actual server MAC addresses. Might show you something.
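The comparison step can be sketched as follows, assuming you keep a small inventory of the real servers' MAC addresses and have the client's ARP entries as an IP-to-MAC mapping. All names, IPs, and MACs below are placeholders.

```python
def normalise_mac(mac):
    """Lowercase a MAC and use colons, so 00-11-22 and 00:11:22 compare equal."""
    return mac.lower().replace("-", ":")


def mac_mismatches(observed, inventory):
    """Return [(ip, expected_mac, seen_mac)] where the client's ARP entry
    for a known server IP does not match the inventory.

    IPs missing from `observed` are skipped: no ARP entry means the
    client has not talked to that address recently, not that it is wrong.
    """
    mismatches = []
    for ip, expected in inventory.items():
        seen = observed.get(ip)
        if seen is not None and normalise_mac(seen) != normalise_mac(expected):
            mismatches.append((ip, normalise_mac(expected), normalise_mac(seen)))
    return mismatches


if __name__ == "__main__":
    # Hypothetical inventory of the real servers and what one client reports.
    inventory = {
        "192.168.68.1": "00-11-22-33-44-01",
        "192.168.68.10": "00-11-22-33-44-0A",
    }
    observed = {
        "192.168.68.1": "00:11:22:33:44:01",
        "192.168.68.10": "00:aa:bb:cc:dd:ee",
    }
    for ip, expected, seen in mac_mismatches(observed, inventory):
        print(f"{ip}: expected {expected}, client sees {seen}")
```

Pinging each address first, as suggested above, is what populates the client's ARP cache so there is something to compare.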
 

Author Closing Comment

by:cindyfiller
ID: 37788935
This was an excellent exchange of information/troubleshooting steps.  It turned out that the wireless router we were using somehow had been changed to provide dhcp.  Once I pulled it off the network everything has been stable.  I appreciate this person's troubleshooting skills so much!