DNS failover not working as expected query

Hi all,

Thanks in advance. I have been working on a really strange issue today and last night. Here's the rundown. Basically, we have two HP-UX 11i DNS servers; let's call them 192.168.1.03 and 192.168.1.04. Each of our UNIX servers had its resolv.conf file configured as:
search domain.local
nameserver 192.168.1.04
nameserver 192.168.1.03

We then had a major hardware failure on 192.168.1.04 and were not able to bring it back online. After that we saw DNS delays when resolving hostnames, but resolution was still working via the secondary name server, 192.168.1.03. To fix the delay we edited the resolv.conf file and changed the order of the name servers to:
search domain.local
nameserver 192.168.1.03
nameserver 192.168.1.04

This fixed the delay when doing nslookups but we still had issues with some applications including SAP. Our application servers were giving unknown host errors when trying to communicate with each other even though ping and nslookups from the OS were working without any problem.

We then commented out the 192.168.1.04 line in resolv.conf on one of the servers with the issue. This still did not help. In the end, the only workaround was to reboot that server with the 192.168.1.04 line commented out in resolv.conf, and this fixed the issue. Later we were able to get the 192.168.1.04 DNS server back online, and this resolved the issue on the remaining servers, which had not been rebooted but had the resolv.conf file configured as follows:

search domain.local
nameserver 192.168.1.03
nameserver 192.168.1.04

I guess what I am wondering is why the servers/applications seemed unable to failover to the secondary DNS server even though at the OS level the failover was working correctly. Does anyone have any thoughts/comments on how to configure this so that failover would work as we would like and prevent any outages in case of a single DNS failure?

Please accept my apologies for the length of this post and thanks for taking the time to read it.

PS. All local hosts files contain only the localhost entry, and all the nsswitch.conf files were set to dns (noservercontinue) files.
mass2612 Asked:
 
SysExpert Commented:
Unfortunately, a lot of applications cache the DNS results for the servers they talk to and do not flush or update them frequently.

There is not a lot you can do except learn how to flush the local cache manually for each application, or shut down and restart the application itself.
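To make the caching behaviour concrete, here is a minimal Python sketch of what an application-side DNS cache might look like. The class name `CachingResolver`, the 300 second TTL and the `flush` method are all hypothetical illustrations, not how SAP or any specific application does it. The point is only that once an answer (or a failure path) is held inside the process, changes at the OS level are not consulted until the entry expires or is flushed.

```python
import socket
import time


class CachingResolver:
    """Hypothetical application-side cache: answers are reused until they expire."""

    def __init__(self, ttl_seconds=300.0, lookup=socket.gethostbyname,
                 clock=time.monotonic):
        self._ttl = ttl_seconds
        self._lookup = lookup      # the real lookup, which goes to the OS resolver
        self._clock = clock
        self._cache = {}           # hostname -> (address, expiry time)

    def resolve(self, hostname):
        now = self._clock()
        entry = self._cache.get(hostname)
        if entry is not None and entry[1] > now:
            return entry[0]        # served from cache; DNS is not consulted
        address = self._lookup(hostname)
        self._cache[hostname] = (address, now + self._ttl)
        return address

    def flush(self, hostname=None):
        """Drop one entry, or the whole cache when no hostname is given."""
        if hostname is None:
            self._cache.clear()
        else:
            self._cache.pop(hostname, None)
```

An application with no equivalent of `flush` leaves a restart as the only way to clear the cache, which matches what was seen here.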

I hope this helps !
 
peakpeak Commented:
Sounds like a cache issue. Does the server reread the resolv.conf file after it's been changed? Maybe you only need to restart the DNS daemon, not the machine.
 
mass2612 Author Commented:
That's the real question. I've read throughout the HP ITRC forums that you should simply be able to update the resolv.conf file and it will be re-read automatically, but it seems that it's not. I'm not sure whether particular processes read the file only when they start and therefore don't see the update, or whether there is something else I am missing.
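The "read only at startup" theory can be sketched in a few lines of Python. This is an illustration only, assuming a process that parses resolv.conf itself; the class names `StartupOnlyConfig` and `ReloadingConfig` are made up for the example and do not describe the HP-UX resolver library.

```python
import os


def read_nameservers(path):
    """Return the nameserver addresses listed in a resolv.conf-style file."""
    servers = []
    with open(path) as handle:
        for line in handle:
            fields = line.split("#", 1)[0].split()
            if len(fields) >= 2 and fields[0] == "nameserver":
                servers.append(fields[1])
    return servers


class StartupOnlyConfig:
    """Reads the file once, the way a long-running process might at startup."""

    def __init__(self, path):
        self.nameservers = read_nameservers(path)


class ReloadingConfig:
    """Re-reads the file whenever its modification time changes."""

    def __init__(self, path):
        self._path = path
        self._mtime = None
        self._servers = []

    @property
    def nameservers(self):
        mtime = os.stat(self._path).st_mtime_ns
        if mtime != self._mtime:
            self._servers = read_nameservers(self._path)
            self._mtime = mtime
        return self._servers
```

A process behaving like `StartupOnlyConfig` keeps using the old server order until it is restarted, while a short-lived command started after the edit picks up the new file straight away.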

 
peakpeak Commented:
Do a test with your secondary DNS server: change the conf file and see if it responds to the new value. If not, restart the daemon and test again.
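One way to test each name server on its own, independent of resolv.conf ordering, is to send a query straight to it. Below is a rough Python sketch using only the standard library. The function names `build_query` and `probe`, the 2 second timeout and the fixed query id are assumptions chosen for the example; it only checks that some reply with a matching id comes back, not that the answer is correct.

```python
import socket
import struct


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of hostname."""
    # Header: id, flags (recursion desired), 1 question, no other records
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)   # QTYPE=A, QCLASS=IN


def probe(server, hostname, timeout=2.0):
    """Return True if server sends any reply with a matching query id in time."""
    packet = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:            # covers timeouts and unreachable servers
            return False
    return reply[:2] == packet[:2]
```

Calling `probe` once per address listed in resolv.conf shows which servers are actually answering, which separates "the server is down" from "the client is not failing over".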
 
mass2612 Author Commented:
Yes, testing from the OS level works straight away, but the applications seem not to read the change until after a reboot. If I modify resolv.conf and then run nslookup hostname, the update is seen right away, but any running application or new application fails.
 
peakpeak Commented:
Well, nslookup is an application, albeit small. What applications are not working?
 
peakpeak Commented:
Yep, caching is good at times; at others it doesn't save the time it was meant to. Microsoft uses a LOT of caching, most of it probably good, but like you I've met cases where you need to "push a button".
 
mass2612 Author Commented:
Hi both - I agree pretty much with what you say. I think the only way for me to have this fail over correctly would be to place DNS into a clustered package and use the clustered IP on the clients, so that the apps that cache the DNS config at startup would not require a restart.

Thanks again.