I have a Master DNS server running on Windows 2008 and a Slave DNS server running BIND on Linux.
The slave successfully collects the zone files from the master server, and can serve DNS lookup requests fine.
However, the Master DNS server recently went down and I found that the slave also stopped working. When I ran a test on the slave server using nslookup, it returned:
server can't find www.domain.com: SERVFAIL
The zone TTL is very low (2 minutes) to ensure we can switch to a backup server quite quickly. In the event that the Master server goes down, how can I ensure that the slave server keeps running, even when the zone files become old?
Here are the settings from Windows:
And a copy of the zone file from the slave server:
$ORIGIN .
$TTL 120 ; 2 minutes
domain.com IN SOA ns5.domain.com. hostmaster.domain.com. (
        2012012061 ; serial
        120 ; refresh (2 minutes)
        120 ; retry (2 minutes)
        120 ; expire (2 minutes)
        120 ; minimum (2 minutes)
        )
$TTL 86400 ; 1 day
        NS ns5.domain.com.
        NS sip2.domain.com.
$TTL 120 ; 2 minutes
        A **.222.**.254
        MX 10 spam3.domain.net.
        MX 10 spam4.domain.net.
$ORIGIN domain.com.
autodiscover A **.46.**.17
backupmx A **.222.**.224
ftp A **.222.**.254
mail A **.222.**.254
my A **.222.**.254
ns1 A **.222.**.254
ns2 A **.222.**.51
ns5 A **.222.**.254
ns6 A **.222.**.51
pop3 A **.222.**.254
sip1 A **.222.**.126
$TTL 86400 ; 1 day
sip2 A **.250.**.36
$TTL 120 ; 2 minutes
sipprovisioning A **.222.**.254
sipserver A **.222.**.126
smtp A **.222.**.254
webmail A **.222.**.254
webserver1 A **.222.**.254
www A **.222.**.254
The goals are...
1) To keep TTL at 2 minutes to ensure we can make instant changes to our domains.
2) To ensure the slave server keeps zone files for at least 2 days when the primary server is offline.
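Usually, the lower the TTL, the more frequent the hits on your server.
The slave stopped answering because expire is set to 120 seconds: once it has been unable to reach the master for that long, it discards the zone and returns SERVFAIL. Refresh should be 3600, retry 600, and expire 30 days (2592000 seconds).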
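To make the effect of the expire timer concrete, here is a rough Python sketch. It is a simplified model, not how BIND is implemented: it assumes the secondary keeps answering until expire seconds have passed since its last successful contact with the master. The names SoaTimers and secondary_still_serving are made up for illustration, and keeping minimum at 120 in the suggested values is an assumption, since no value for it was suggested above.

```python
from dataclasses import dataclass

DAY = 86400  # seconds in one day


@dataclass(frozen=True)
class SoaTimers:
    """SOA timer fields, all in seconds (hypothetical helper type)."""
    refresh: int
    retry: int
    expire: int
    minimum: int


def secondary_still_serving(timers: SoaTimers, outage_seconds: int) -> bool:
    """Simplified model: the zone stays valid on the secondary until
    `expire` seconds have passed without reaching the master."""
    return outage_seconds < timers.expire


# Timers from the zone file posted in the question.
current = SoaTimers(refresh=120, retry=120, expire=120, minimum=120)

# Suggested timers; minimum=120 is an assumption, not part of the suggestion.
suggested = SoaTimers(refresh=3600, retry=600, expire=30 * DAY, minimum=120)

two_days = 2 * DAY
print(secondary_still_serving(current, two_days))    # False
print(secondary_still_serving(suggested, two_days))  # True
print(suggested.expire)                              # 2592000
```

With the posted timers the zone is gone after two minutes of master downtime, while a 30-day expire comfortably covers the two-day goal. Record TTLs are separate from these timers, so they can stay at 120.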
DanJourno
ASKER
Arnold, how would those settings ensure that any changes are updated instantly?
We need a low TTL to ensure that any ip changes are relatively quick to ensure we can switch over to a backup application server when necessary.
It's Windows 2008 Web Edition. No AD.
DanJourno
ASKER
Papertrip, I'll read through that link and post any questions.
Thanks
Dan
Basically, SERVFAIL means "server failure", which often points to a configuration error.
To check your named.conf file, run the following command:
named-checkconf /etc/named.conf
Does it show any error?
Judging by the serial number, which is still 2012012061, your slave zone file has stopped updating.
You can find the cause and the error by checking the DNS log file.
I'm also attaching an /etc/named.conf file which could give you an idea: named.conf.txt
arnold
In your Windows DNS configuration there should be a Notify option on the Zone Transfers tab. It notifies the slave that a change has occurred, which triggers a refresh on the slave.
You should set the TTL per host rather than for the entire zone.
For the SOA timers: refresh should be 3600, retry 600, and expire 30 days (2592000 seconds).
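As a rough illustration of the per-host TTL idea, the Python sketch below prints zone-file lines in which only the record that must fail over quickly carries an explicit 120-second TTL, while the others inherit the $TTL default. The helper render_record is hypothetical, and the 192.0.2.x addresses are documentation placeholders, not the real addresses from the zone.

```python
from typing import Optional

DEFAULT_TTL = 86400  # zone-wide default, one day


def render_record(name: str, rtype: str, rdata: str,
                  ttl: Optional[int] = None) -> str:
    """Format one resource record; an explicit TTL goes between
    the owner name and the class."""
    fields = [name]
    if ttl is not None:
        fields.append(str(ttl))
    fields += ["IN", rtype, rdata]
    return " ".join(fields)


# (name, type, placeholder address, per-record TTL or None for the default)
records = [
    ("www", "A", "192.0.2.10", 120),
    ("sip2", "A", "192.0.2.20", None),
]

lines = [f"$TTL {DEFAULT_TTL}"]
lines += [render_record(n, t, d, ttl) for n, t, d, ttl in records]
print("\n".join(lines))
```

The output is a $TTL 86400 line followed by "www 120 IN A 192.0.2.10" and "sip2 IN A 192.0.2.20", so only www is cached for two minutes.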