  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 467

DNS Failover

I have multiple DNS servers, and my DHCP scope options point clients to two of them. The other day I took one of those DNS servers offline for maintenance, thinking the second would pick up the slack while the first was down. It didn't. I'm wondering why the second server didn't continue serving the zones. My SOA expiration is set to 1 day, so shouldn't the secondary have continued to answer requests for my zones for that long? I'd like to be able to bring either server down for maintenance without having to worry about this again. These are internal servers answering internal requests.
Asked by: jrobison
1 Solution
 
jrobison (Author) commented:
I should also mention that my DNS servers are set up as Active Directory-Integrated.
 
Rich Weissler (Professional Troublemaker^h^h^h^h^hshooter) commented:
Restating the problem to make certain I understand: Server1 and Server2 are both Windows Domain Controllers, both run DNS, and both serve Active Directory-Integrated zones. Your client machines receive the IP addresses of both DNS servers via DHCP. You had Server1 down for maintenance.

At that point, what should have happened is that all the workstations tried to connect to Server1 for DNS resolution, timed out, and then tried the second DNS server. The timeout before going to the second server is noticeable.
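The expected client behavior described above (try the first server, time out, then try the next) can be sketched in Python. This is a simplified model only: it does not reproduce the actual timing or retry schedule of the Windows resolver, and the function names, IP addresses, and host name are hypothetical placeholders.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def resolve_with_failover(
    name: str,
    servers: Sequence[str],
    query: Callable[[str, str, float], Optional[str]],
    timeout: float = 1.0,
) -> Tuple[Optional[str], List[str]]:
    """Try each DNS server in list order; return (answer, servers_tried)."""
    tried: List[str] = []
    for server in servers:
        tried.append(server)
        try:
            answer = query(server, name, timeout)
        except TimeoutError:
            # No response in time: fall through to the next configured server.
            continue
        return answer, tried
    # Every configured server timed out.
    return None, tried


def fake_query(server: str, name: str, timeout: float) -> Optional[str]:
    """Hypothetical stand-in for a real UDP query; 10.0.0.1 plays the downed server."""
    if server == "10.0.0.1":
        raise TimeoutError(f"{server} did not answer within {timeout}s")
    return "10.0.0.50"


if __name__ == "__main__":
    print(resolve_with_failover(
        "intranet.example.local", ["10.0.0.1", "10.0.0.2"], fake_query))
```

In this model the lookup still succeeds with the first server down, but only after the timeout on that server has elapsed, which is the noticeable delay mentioned above.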

If that doesn't seem to have happened, start with the obvious from one of the workstations. Confirm with ipconfig /all that both DNS servers are configured. Use 'nslookup', 'server ipaddressofServer2', 'ServerNameToLookup' to make certain that server is responding as you expect.

If my restatement of the problem is incorrect however... then I'm barking up the wrong tree.  In which case, please help me understand what happened.
 
jrobison (Author) commented:
Both servers are DCs and both are running 2008 R2. Both servers are configured for AD-integrated zones. The DHCP scopes are configured with both IPs. The server I took down for maintenance was set as the primary in the DHCP scopes (well, it was listed first, so it was the users' primary).

Everything resolves perfectly including dcdiag tests.  

I know you can modify the TTL on the clients' DNS cache, but that shouldn't be necessary. The clients should query the secondary DNS server when the primary doesn't respond. I'm pretty sure how long that takes is determined by some type of client-side resolver algorithm.

What happened to me was that 5 minutes after I took my primary DNS server down, I started getting calls from users saying they couldn't get to specific "Intranet" web pages or cname alias records for some of our SQL databases.

I don't understand why that happened. Maybe the TTL on the clients' DNS cache was the culprit. All I know is that the minute I brought the other server back up, everything started working. What's the point of having AD-integrated zones if the fault tolerance doesn't work?

Maybe I have something misconfigured?
 
jrobison (Author) commented:
Doesn't IE have a default DNS TTL of 30 minutes, or is that everything prior to IE 7? That could be the reason my users couldn't get to our internal web site.
 
Rich Weissler (Professional Troublemaker^h^h^h^h^hshooter) commented:
Double checking --
First, the intranet web pages, and cname alias records were in the same zone as your Active Directory?

Second, have you had a chance to use NSLookup to confirm that the second DNS server is able to resolve the addresses if you point directly to it?

(And I apologize, it's a low probability answer, but might be worth checking...)  Confirm as well that the DHCP Server is handing out the correct IP address for your second DNS server.

The reason I ask, TTL doesn't really matter if you still have a DNS server that is authoritative for the zone, and if the zone is Active Directory-Integrated, your second DNS server should be authoritative... TTL only matters for records in another DNS server's cache.

(And, no... browsers don't keep a separate DNS cache.)
 
jrobison (Author) commented:
Razmus,

Yes, the intranet web pages and the cname alias records are in the same zone. Yes, nslookup resolves everything perfectly when you specify the second server. And yes, my DHCP scopes are configured with the correct IP settings and my users are receiving the correct information.

I hate to disagree with you, but IE does have a DNS cache timeout and the default setting is 30 minutes.  However, I'm not really concerned about that. I'd just like to know that if one of my DNS servers goes down and it happens to be the one that's set as the primary in my DHCP scope that my users will still be able to resolve everything and in a relatively short time frame.

 
Rich Weissler (Professional Troublemaker^h^h^h^h^hshooter) commented:
I apologize.  You are correct... IE does have a separate DNS cache.

Just to double check: you state that the "intranet web pages and the cname alias records are in the same zone."  I also wanted to confirm that this zone is "the same zone as your Active Directory", that is, that the zone replicated as Active Directory-Integrated is the zone that holds your intranet web pages and cname alias records.

In NSLookup, do a 'set norecurse', then do a search for your intranet pages... double check that it doesn't report back to you name servers or a 'Non-authoritative answer'.
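As a rough illustration of what that check looks at on the wire, the Python sketch below builds a non-recursive A-record query (RD bit cleared, the equivalent of 'set norecurse') and tests the AA (authoritative answer) bit in the response header. The function names are hypothetical, the server address and host name must be supplied by you, and it handles only the header flags, not full response parsing.

```python
import socket
import struct


def build_query(name: str, recurse: bool = False, query_id: int = 0x1234) -> bytes:
    """Build a DNS A-record query; recurse=False leaves the RD bit cleared."""
    flags = 0x0100 if recurse else 0x0000  # RD is bit 8 of the flags word
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # Encode the name as length-prefixed labels terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def is_authoritative(response: bytes) -> bool:
    """Return True if the AA bit (0x0400) is set in a DNS response header."""
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & 0x0400)


def ask(server: str, name: str, timeout: float = 2.0) -> bool:
    """Send a non-recursive query to one specific server and report the AA bit."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(512)
    return is_authoritative(response)
```

Calling `ask()` with the second DNS server's IP address and an intranet host name would be expected to return True if that server answers authoritatively for the zone.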
 
druth2 commented:
If your workstations are on different vlans than your DNS servers, you could check to make sure that you have an ip-helper address set up for the second DNS server (Cisco switches anyway - not sure about others).  It's a vlan setting on the switch.
 
jrobison (Author) commented:
We just went through an onsite ADRAP and I discussed this with the tech. His resolution was to power down the primary DNS server and force the Secondary to take over the primary duties.
 
jrobison (Author) commented:
We went through an onsite AD health check and this was the tech's suggested resolution.