crcsupport

DNS request steps, and a problem resolving only one external DNS hostname

I'm really struggling to work out why this happens.

DNS servers: two local Windows Server 2003 SP2 DNS servers (FS1 and FS2).
Each workstation is configured to use FS1 as its primary DNS server and FS2 as its secondary DNS server.

I have a problem looking up one specific hostname with nslookup: 'service101-us.mimecast.com'.

When I run nslookup for the hostname from the console of either DNS server, it resolves with no problem.

But when I run nslookup from any workstation, it fails. If I bypass the primary DNS server by typing 'nslookup - FS2' and then look up the hostname, it resolves successfully.

So the problem seems to be FS1: somehow it can't resolve this specific hostname. It happens daily, and I have to restart the DNS service on FS1 to make the problem go away.

Does anyone know why FS1 can't resolve just this one external DNS hostname until the DNS service is restarted?
crcsupport

ASKER

> service101-us.mimecast.com
Server:  FS1.mydomain.local
Address:  192.xxx.xxx.xxx

DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.
*** Request to FS1.mydomain.local timed-out
>
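The comparison described in the question (FS1 times out, FS2 answers) can also be repeated from a workstation without nslookup. The following is only a rough sketch using the Python standard library: it sends a single A query over UDP to one named server and reports the response code or a timeout. The function names `build_query` and `probe` are made up for this example, and the 2-second timeout simply mirrors the nslookup output above.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query packet for `name` (qtype 1 = A record)."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE followed by QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)


def probe(server, name, timeout=2.0):
    """Send one A query to `server`; return 'rcode=N', 'timeout' or an error string."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return "error: %s" % exc
    if len(reply) < 12:
        return "short reply"
    # The low 4 bits of the flags word are the RCODE (0 = NOERROR, 3 = NXDOMAIN).
    flags = struct.unpack("!H", reply[2:4])[0]
    return "rcode=%d" % (flags & 0x000F)
```

Calling `probe()` once with the address of FS1 and once with the address of FS2 (both masked in this thread, so they have to be filled in) for 'service101-us.mimecast.com' should show whether only FS1 fails at that moment.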
SOLUTION
Mike Roe
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
No forwarders are configured, only root hints.
SOLUTION
SOLUTION
Thanks. Since I restarted DNS, it works again. I'll wait a day or two, and when the problem occurs I'll try the options you guys suggested. It usually happens again in less than two days.
I'll keep this thread open till then.
SOLUTION
That's a good idea, gt2847c. I'll do that.
Below is the nslookup debug output from a lookup while resolution is working.

> set debug
> service101-us.mimecast.com
Server:  fs1.mydomain.local
Address:  192.xxx.xxx.xxx

------------
Got answer:
    HEADER:
        opcode = QUERY, id = 6, rcode = NXDOMAIN
        header flags:  response, auth. answer, want recursion, recursion avail.
        questions = 1,  answers = 0,  authority records = 1,  additional = 0

    QUESTIONS:
        service101-us.mimecast.com.MYDOMAIN.LOCAL, type = A, class = IN
    AUTHORITY RECORDS:
    ->  mydomain.local
        ttl = 3600 (1 hour)
        primary name server = fs1.mydomain.local
        responsible mail addr = hostmaster
        serial  = 11665
        refresh = 900 (15 mins)
        retry   = 600 (10 mins)
        expire  = 86400 (1 day)
        default TTL = 900 (15 mins)

------------
------------
Got answer:
    HEADER:
        opcode = QUERY, id = 7, rcode = NXDOMAIN
        header flags:  response, auth. answer, want recursion, recursion avail.
        questions = 1,  answers = 0,  authority records = 1,  additional = 0

    QUESTIONS:
        service101-us.mimecast.com.MYDOMAIN.LOCAL, type = AAAA, class = IN
    AUTHORITY RECORDS:
    ->  mydomain.local
        ttl = 3600 (1 hour)
        primary name server = fs1.mydomain.local
        responsible mail addr = hostmaster
        serial  = 11665
        refresh = 900 (15 mins)
        retry   = 600 (10 mins)
        expire  = 86400 (1 day)
        default TTL = 900 (15 mins)

------------
------------
Got answer:
    HEADER:
        opcode = QUERY, id = 8, rcode = NOERROR
        header flags:  response, want recursion, recursion avail.
        questions = 1,  answers = 1,  authority records = 0,  additional = 0

    QUESTIONS:
        service101-us.mimecast.com, type = A, class = IN
    ANSWERS:
    ->  service101-us.mimecast.com
        internet address = 207.211.31.80
        ttl = 288 (4 mins 48 secs)

------------
Non-authoritative answer:
------------
Got answer:
    HEADER:
        opcode = QUERY, id = 9, rcode = NOERROR
        header flags:  response, want recursion, recursion avail.
        questions = 1,  answers = 0,  authority records = 1,  additional = 0

    QUESTIONS:
        service101-us.mimecast.com, type = AAAA, class = IN
    AUTHORITY RECORDS:
    ->  mimecast.com
        ttl = 86388 (23 hours 59 mins 48 secs)
        primary name server = dns01.mimecast.com
        responsible mail addr = root.mimecast.com
        serial  = 111540
        refresh = 10800 (3 hours)
        retry   = 900 (15 mins)
        expire  = 604800 (7 days)
        default TTL = 86400 (1 day)

------------
Name:    service101-us.mimecast.com
Address:  207.211.31.80

>
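In the working trace above, the first two queries are for service101-us.mimecast.com.MYDOMAIN.LOCAL and get NXDOMAIN from FS1 itself; only then is the bare name sent. That looks like the client appending its DNS suffix before trying the name as typed. Here is a simplified model of that ordering as a Python sketch. The function name `candidate_names` is made up for illustration, and the real Windows resolver has more rules than this.

```python
def candidate_names(name, search_list):
    """Return the names a client would try, in order, for `name`.

    Simplified model of the trace: a name with a trailing dot is treated
    as fully qualified and used as is; otherwise each search suffix is
    appended first and the name as typed is tried last.
    """
    if name.endswith("."):
        return [name]
    with_suffix = ["%s.%s." % (name, suffix.rstrip(".")) for suffix in search_list]
    return with_suffix + [name + "."]


if __name__ == "__main__":
    for candidate in candidate_names("service101-us.mimecast.com", ["mydomain.local"]):
        print(candidate)
```

Typing the hostname with a trailing dot in nslookup ('service101-us.mimecast.com.') skips the suffixed attempts, which keeps the debug output down to the queries that actually matter here.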
SOLUTION
Isn't it normal to get a non-authoritative answer? Our AD-integrated DNS server holds records for our internal domain only. Any other request has to go out via the root hints, and that comes back as a non-authoritative answer.
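The "auth. answer" text in the nslookup header lines corresponds to the AA bit in the DNS header flags word. As an illustration only, the sketch below decodes that 16-bit word using the bit positions from RFC 1035; the function name `decode_flags` is made up, and the two sample values are simply the flag combinations shown in the trace (authoritative NXDOMAIN for the suffixed name, non-authoritative NOERROR for the real one).

```python
def decode_flags(flags):
    """Decode the 16-bit flags word of a DNS header (RFC 1035, section 4.1.1)."""
    return {
        "response": bool(flags & 0x8000),             # QR
        "authoritative": bool(flags & 0x0400),        # AA
        "truncated": bool(flags & 0x0200),            # TC
        "recursion_desired": bool(flags & 0x0100),    # RD
        "recursion_available": bool(flags & 0x0080),  # RA
        "rcode": flags & 0x000F,                      # 0 = NOERROR, 3 = NXDOMAIN
    }


if __name__ == "__main__":
    # response + auth. answer + want recursion + recursion avail. + NXDOMAIN
    print(decode_flags(0x8583))
    # response + want recursion + recursion avail. + NOERROR (no AA bit)
    print(decode_flags(0x8180))
```

FS1 is authoritative only for mydomain.local, so the AA bit is set on the NXDOMAIN replies for the suffixed name and clear on the answer it fetched by recursion.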
SOLUTION
Hmm... That may explain the problem, because it happens daily at some point. After I restart the DNS server, I resolve the hostname manually using nslookup. The Exchange server that has the problem resolving this hostname either never contacts the DNS server, or outgoing email never triggers a lookup of the recipient mail server's hostname (service101-us.mimecast.com).

The initial problem was that the recipient mail server locks out our outgoing mail. I found that this is because they run greylisting on their spam filter. With further research, I found there's a known glitch between Exchange servers older than 2010 and greylisting.

I thought I had fixed the problem by modifying the registry key 'GlitchRetry' on the Exchange server, but it seems there is still a problem on the DNS side.

I'd like to test quickly by flushing the DNS cache, but to be safe, I'll let you know tomorrow. :)
Again, 3 emails were stuck on our Exchange server, and nslookup for the recipient mail server was timing out. This time I cleared the cache on DNS server FS1 and then forced a connection on the Exchange server, and the emails went through.

I read an online article about stale entries stuck in the cache preventing proper DNS resolution.

Tomorrow I'll see whether clearing the cache helped in the long run.
SOLUTION
I ran ipconfig /flushdns on the Exchange server; it didn't help.
I checked again this morning and the problem still exists, so a stale record in the DNS server's cache doesn't seem to be the culprit.
I've added Google DNS 8.8.8.8 as a forwarder and will see what happens tomorrow.
If this doesn't work, my last option may be to add a manual entry for Mimecast to the hosts file. I spoke to Mimecast support, and they don't seem to have noticed this problem. I don't think I'm the only one having it.
Maybe others who use Exchange 2003 to send email to Mimecast, and the recipients expecting email from Exchange 2003 senders, don't really care when the emails go through. But our client, who uses the Mimecast email service, calls us if they don't receive an email within 10 minutes of our sending it. The emails stuck in the queue usually go through after about 40 minutes.
SOLUTION
ASKER CERTIFIED SOLUTION
I don't see any errors in the Exchange log; it just shows the normal event sequence for the email stuck in the queue:
1019, 1020, 1031, 1033, 1034.

It doesn't show why the email got stuck in the queue and stays active.

I forgot to run the DNS debug when I saw the problem.
I've added the forwarder, so I'll wait until Monday morning to see whether email gets stuck again, and this time I mustn't forget to run it.
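Since the failure is intermittent and easy to miss, one option is to leave a small watcher running that probes FS1 at intervals and logs the moment resolution flips between working and failing. This is only a sketch under assumptions: `udp_probe` and `watch` are made-up names, the one-minute interval is arbitrary, and a NOERROR reply is treated as "working" without inspecting the answer section.

```python
import socket
import struct
import time


def udp_probe(server, name, timeout=2.0):
    """Return True if `server` answers an A query for `name` with NOERROR."""
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    # Header with recursion desired and one question, then QTYPE=A, QCLASS=IN.
    packet = (struct.pack("!HHHHHH", 0x4242, 0x0100, 1, 0, 0, 0)
              + qname + struct.pack("!HH", 1, 1))
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (server, 53))
            reply, _ = sock.recvfrom(4096)
        except OSError:  # includes timeouts
            return False
    # Byte 3 is the low half of the flags word; its low 4 bits are the RCODE.
    return len(reply) >= 12 and (reply[3] & 0x0F) == 0


def watch(check, rounds, interval=60.0, sleep=time.sleep, log=print):
    """Call `check()` `rounds` times and log only when the result changes."""
    last = None
    changes = []
    for i in range(rounds):
        ok = check()
        if ok != last:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log("%s resolution %s" % (stamp, "OK" if ok else "FAILED"))
            changes.append(ok)
            last = ok
        if i < rounds - 1:
            sleep(interval)
    return changes
```

For example, `watch(lambda: udp_probe(fs1_address, "service101-us.mimecast.com"), rounds=1440)` would cover roughly a day at the default interval, where `fs1_address` stands for the real (masked) address of FS1. The first FAILED timestamp is the moment to capture the nslookup debug output.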
It looks like it's resolved. I haven't seen any email stuck for 3 days. I found a news item saying that Mimecast had a DNS server problem around mid-May, but that was in the UK. I don't know whether it somehow affected our DNS server's resolution when no forwarder was configured.

Thank you all!!!