SusanSB

asked on

NSLOOKUP normal, but certain machines not resolving anyway

I am in the process of migrating my users from an NT network to a Win2003 network. Those users who are still on the NT network point to three corporate DNS servers outside our facility and system, all of which have private IPs. Recently, users on the NT network started calling me to report they were offline. When I sit down at their machines, I can see they are online but not resolving: I can ping by number but not by name, and can browse by number but not by name. If I run NSLOOKUP, I get responses from the DNS servers, but I still cannot ping by name or load web pages by name. Rebooting does not change anything. The DNS folks seem to think it is at my end, but I am having trouble finding what is unique to the particular users having the problem (about 8 out of 75), and why it started when it did, as nothing had changed. The rest of my users are not having this problem even though they all point to the same DNS servers.

No events are pointing to any DNS problems on the computers.  How often it happens varies.  A couple are off nearly all the time, others do it every week or two. Each is unique and frequency varies.  They pop online at odd times, no discernible time or activity causes them to start resolving.

I installed ethereal on one machine and ran pings and NSLOOKUPs while it was resolving and again while it was not, here is the result:

Resolving:  I see queries go out and responses come back for pings (Standard query A and Standard response from the DNS with the IP), then the four ICMP ping requests and responses, and I see NSLOOKUPs that look normal (Standard query A, Standard response with the IP).  I also see tons of Standard query PTR for internal private IPs, which appear to go on all the time, and responses of "No such name".

Not resolving:  When I ping, I see the queries go out (Standard query A), but nothing comes back. It tries each DNS in the list and gets no response, then no ICMPs.  NSLOOKUPs look as though everything is peachy.  I see Standard query A go out and Standard Responses come back with the correct IP.  I also still see the Standard query PTR for internal private IP's and Standard responses "No such Name".  

It turns out that locations all over the country are having the exact experience I am having, just a few users at each, and their descriptions are identical to my problem. NSLOOKUPS always work, even when there is clearly no resolving going on for pings or web browsing.  I added a registry setting to increase the DNS query timeout and an adapter timeout, but it has made no difference.  If I migrate one of these users to the Win2003 network, where they point to our local DNS that uses the corporate DNS's as forwarders, the clients no longer have a problem.  However, some of these machines have to stay on the NT network for now and I am getting nowhere.  All the clients are Windows XP, all patched and with current virus definitions.  The DNS folks seem to be hanging us out to dry - not their problem.  I am plumb out of ideas.  I have tried IPCONFIG to renew their DHCP numbers, set fixed IPs on them, flushed their DNS caches and reinstalled networking on them.  I have found nothing that sets these machines apart from the 60+ that are working normally.  
mav7469

I know this is going to sound strange.. I am guessing that you are not running WINS.  Or, make sure that your DHCP is set to NOT use Netbios.  By default, DNS in an AD environment will try to use Netbios (WINS) first because that is how it authenticates to the domain.  Give that a try and let us know.

Good Luck

Mav
Try appending the windows domain name to the end of the host name.

IE: ping host.domain.local instead of just plain old ping host
SusanSB

ASKER

You are correct that I am not running WINS, but none of the affected machines are in the Active Directory either.  They are logging on to an NT network that is not part of the Active Directory, and the DHCP is on NT.  When I do log one of these onto the AD and point it to the local DNS (with the off-site DNS servers as forwarders), the problem goes away.  The DHCP is set to not use NetBIOS.  Am I understanding you correctly?
SusanSB

ASKER

The pinging is to any external site - say, time.nist.gov.  If the machine is not resolving and I ping time.nist.gov, the ping fails after a long pause.  If I ping 192.43.244.18, I get replies.  I determined from the packet capture that the long pause before the fail seems to be because it is querying the DNS's in order and getting no DNS responses.  Am I answering the question you are asking?
Flush the DNS cache on your local servers and on the workstations having issues.
ASKER CERTIFIED SOLUTION
Ken Conradie
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Are you restricting access to the 2K3 DNS to ONLY members of that domain?  Otherwise, why not just set the DHCP server to hand out the address for your internal DNS (with forwarders)?

Are there possibly any restrictions on the DNS servers that wont resolve for you?
SusanSB

ASKER

Since I am committed to trying to fix this at the client level using the external DNS servers, I tried conradie's suggestion first: restarting the DNS Client Service.  I had two machines drop this week, and the result was the same on both:  restarting the DNS Client Service made them immediately start resolving.  I set both machines to disable the service, and now have to wait to see if they continue to resolve, which is excruciating when the problem happens so sporadically. With either of these clients, two weeks could pass before I get the dreaded call.  Or not.

A second location has done the same with two clients, and had the same experience, so we are both waiting with bated breath and, in the interim, also researching the implications of disabling the service.  What is the downside of disabling the DNS Client Service in WinXP?  MS says "Note The overall performance of the client computer decreases and the network traffic for DNS queries increases if the DNS resolver cache is deactivated." (http://support.microsoft.com/kb/318803/en-us).  That certainly sounds ominous...
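For anyone who wants to script the restart-and-disable step described above, here is a minimal sketch. It assumes the DNS Client service is registered under the service name `Dnscache` (its usual name on Windows XP) and that the script runs with administrator rights. The helper names `restart_commands`, `disable_commands` and `run_all` are made up for this illustration.

```python
import platform
import subprocess

# Assumption: "Dnscache" is the service name behind the "DNS Client" display name.
SERVICE = "Dnscache"


def restart_commands():
    """Commands that stop and then start the DNS Client service."""
    return [["net", "stop", SERVICE], ["net", "start", SERVICE]]


def disable_commands():
    """Commands that stop the service and keep it from starting at boot."""
    # sc.exe expects "start=" and its value as separate arguments.
    return [["net", "stop", SERVICE], ["sc", "config", SERVICE, "start=", "disabled"]]


def run_all(commands):
    """Run each command in order; do nothing and return False off Windows."""
    if platform.system() != "Windows":
        return False
    # A list (not a generator) so every command runs even if an earlier one fails.
    results = [subprocess.call(cmd) for cmd in commands]
    return all(code == 0 for code in results)
```

Calling `run_all(restart_commands())` corresponds to the quick test, and `run_all(disable_commands())` to leaving the service off while waiting to see whether the problem returns.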
The only downside is that you will no longer cache lookups.  Every name that needs resolution will make a call to DNS.  So if you browse to EE, close your browser and then go to EE again, you will make a DNS call both times, rather than only the first.

I wouldn't go so far as to say "ominous", it'll add a little delay while you wait for a DNS resolution, and the added network traffic of more DNS calls.
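To make the trade-off concrete, here is a toy sketch of a resolver with and without a cache. It is not how the Windows DNS Client service is implemented; `CachingResolver` and `fake_dns` are hypothetical names, and the address returned is a documentation placeholder.

```python
import time


class CachingResolver:
    """Toy resolver that optionally remembers answers for `ttl` seconds."""

    def __init__(self, upstream, ttl=60.0, enabled=True):
        self.upstream = upstream      # function that performs the real lookup
        self.ttl = ttl
        self.enabled = enabled        # False mimics a disabled client cache
        self.upstream_calls = 0
        self._cache = {}

    def resolve(self, name):
        now = time.monotonic()
        if self.enabled:
            hit = self._cache.get(name)
            if hit is not None and hit[1] > now:
                return hit[0]         # answered from cache, no DNS traffic
        self.upstream_calls += 1      # one query goes to the DNS server
        address = self.upstream(name)
        if self.enabled:
            self._cache[name] = (address, now + self.ttl)
        return address


def fake_dns(name):
    """Stand-in for a DNS server; always returns a documentation address."""
    return "192.0.2.10"
```

Resolving the same name twice costs one upstream call with the cache enabled and two with it disabled, which is the extra delay and traffic being described.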
Exactly, aseusainc. Working in IT, I often want to make sure that I am using the latest changes in DNS and am not using cached info, so I keep the service disabled on my workstation for this reason. It's pretty far from ominous.
And... because nslookup goes directly to the DNS server and ignores the client cache, you see normal behaviour on the clients when using nslookup. With only 75 total users on your network, even if you disabled the service on ALL of them, I doubt it would cause any noticeable traffic problems.
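The difference between the two lookup paths can be checked side by side. The sketch below is only an illustration: `direct_query` hand-builds a plain A-record query and sends it straight to a server over UDP, roughly what nslookup does, while `system_resolve` goes through the operating system's resolver, which is the path ping and browsers take (on Windows that normally passes through the DNS Client service). The function names are invented for this example, and the server address is something you supply yourself.

```python
import random
import socket
import struct


def build_query(name, qid):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Flags 0x0100 = recursion desired; one question, no other records.
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def direct_query(name, server, timeout=3.0):
    """Query `server` directly over UDP; True if a reply with our ID arrives."""
    qid = random.randint(0, 0xFFFF)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name, qid), (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # includes timeouts
            return False
    return len(reply) >= 2 and struct.unpack("!H", reply[:2])[0] == qid


def system_resolve(name):
    """Resolve through the OS resolver path; True if any address comes back."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False
```

On an affected machine one would expect `direct_query("time.nist.gov", <your DNS server IP>)` to return True while `system_resolve("time.nist.gov")` returns False, matching the nslookup-works-but-ping-fails symptom.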
Sorry for the string of posts.... One more "permanent fix" idea and I will shut up... : )
I have seen a TCP reset fix this permanently for at least one machine. Check out the link below for instructions.

http://support.microsoft.com/default.aspx?scid=kb;en-us;299357

SusanSB

ASKER

Hey conradie - so far, so good on stopping the DNS client.  So far, so good.  I will save the "permanent fix" for the first machine to come up doing it again.  It has only been a week now, so I may be back later with the stubborn ones.  But so far, so good.  Another location tried it too, with user machines that were off nearly all the time, and so far his clients are on too.

Thanks to all of those who contributed!
Glad to hear it. Thanks for the points!