ssbn628
asked on
Sendmail quits working when NS1 locks up
Hi,
We have had problems keeping our name server (an Intel box running Red Hat 8) up and running, so the provider of our T1 lines is now doing our DNS. How can I take our name server offline without shutting down Sendmail? Sendmail is running on a Sun E-450 with Solaris 8, and even now, if the name server locks up, Sendmail quits. I would like to disconnect the bad Intel box but can't, because Sendmail quits on the Sun box. Is there a list of files on the Solaris unit that I can edit to add the new DNS info and take the old out? This was configured by another person who is gone now, and we're left knowing hardly anything about Solaris or Red Hat.
Change /etc/resolv.conf on the Solaris box to point only to your ISP's DNS server and it will stop using the local name server.
ASKER
Thanks I will try that now!
ASKER
Well, I removed the name server pointing to 192.168.0.101 (the Intel box) and Sendmail quit working. Are there other files I need to edit?
When you removed "nameserver 192.168.0.101" did you replace that with "nameserver ISP-DNS-IP"? And after making the change did you check for proper operation by trying an "nslookup www.sun.com" or similar?
Since it sounds like this mail server is probably behind a firewall, does the firewall permit DNS queries on ports 53/TCP and 53/UDP from the Solaris box?
ASKER
I just took the IP number out. Do I need to enter nameserver isp-dns-ip in place of the line I removed? Here is the original resolv.conf file:
nameserver 65.174.128.131
nameserver 65.174.128.212
nameserver 192.168.0.101
domain battleswireless.com
I took out the nameserver 192.168.0.101 line.
With that data in resolv.conf, simply removing "nameserver 192.168.0.101" will keep the Solaris box from attempting to use your Linux DNS server. Given the order of the nameservers in resolv.conf, your Solaris box would have gone to 65.174.128.131 first, tried 65.174.128.212 only if the first name server was unavailable, and tried the Linux box only if both of the other name servers failed to respond.
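To make the ordering concrete, here is a small Python sketch that reads the resolv.conf text you posted and lists the name servers in the order they would be tried, before and after dropping the Linux box. The function names are my own, and this is only a model of the file format: it ignores timeouts, retries and every directive other than nameserver.

```python
def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses in the order they appear in the file."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Only lines of the form "nameserver <address>" matter here.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def drop_nameserver(resolv_conf_text, address):
    """Return the file text with the "nameserver <address>" line removed."""
    kept = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver" and fields[1] == address:
            continue  # skip the server being retired
        kept.append(line)
    return "\n".join(kept) + "\n"


# The resolv.conf contents quoted earlier in this thread.
ORIGINAL = """\
nameserver 65.174.128.131
nameserver 65.174.128.212
nameserver 192.168.0.101
domain battleswireless.com
"""

if __name__ == "__main__":
    print(parse_nameservers(ORIGINAL))
    print(drop_nameserver(ORIGINAL, "192.168.0.101"), end="")
```

Run against a copy of the file rather than /etc/resolv.conf itself, and keep a backup of the original before changing anything.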
Now I'm wondering if your current problem is a result of the Linux DNS server having records that map hostnames to private IPs. The external DNS servers would not have that data, and the lack of it could be a problem for Sendmail. With the Linux server in resolv.conf, what do you get if you do an nslookup on the mail server's hostname and IP?
ASKER
Here is what I got.
Output of:
nslookup -q=A mail.battleswireless.com ns1.worldnet.att.net
Note: nslookup is deprecated and may be removed from future releases.
Consider using the `dig' or `host' programs instead. Run nslookup with
the `-sil[ent]' option to prevent this message from appearing.
*** Invalid option: q=A
Server: ns1.worldnet.att.net
Address: 204.127.129.1#53
Name: mail.battleswireless.com
Address: 63.165.126.153
ASKER
Here is an nslookup from the Sun box:
# nslookup mail.battleswireless.com
Server: ns1.netlogic.net
Address: 65.174.128.131
Non-authoritative answer:
Name: mail.battleswireless.com
Address: 192.168.0.100
The key to the problem is what is shown at the bottom of your last comment. Your Linux DNS server says that mail.battleswireless.com has the IP 192.168.0.100, and I'll bet that an 'nslookup 192.168.0.100' is answered only by the Linux DNS server, which will say that 192.168.0.100 -> mail.battleswireless.com. That explains why Sendmail stops when you take down the Linux DNS server. In that case Sendmail "can't find itself" because the external DNS servers don't have data for any of the private networks.
This is a classic problem with servers inside private, NAT'ed networks. There are two possible solutions. One is to run a DNS server inside the firewall that maps hostnames to private IPs and forwards requests from inside clients for non-local hosts to an outside DNS server. The other is to ensure that all inside hosts are set up with Fully Qualified Domain Names, and that the hosts file on each inside machine has records mapping the private IPs to hostnames.
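As a rough illustration of the second option, this Python sketch shows what a hosts file record gives you: a forward lookup (name to IP) and a reverse lookup (IP to name) that need no DNS server at all. The sample hosts text is made up for illustration from the name and address seen in this thread, and the helper names are my own. It is a simplified model of the file format, not how the Solaris resolver is actually implemented.

```python
def parse_hosts(hosts_text):
    """Parse hosts-file text into a list of (ip, [names]) entries, in file order."""
    entries = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        fields = line.split()
        if len(fields) >= 2:
            entries.append((fields[0], fields[1:]))
    return entries


def forward_lookup(entries, name):
    """Return the IP of the first entry listing this name, or None."""
    for ip, names in entries:
        if name in names:
            return ip
    return None


def reverse_lookup(entries, ip):
    """Return the first name listed for this IP, or None."""
    for entry_ip, names in entries:
        if entry_ip == ip:
            return names[0]
    return None


# Illustrative hosts file: the FQDN comes first so the reverse lookup returns it.
SAMPLE_HOSTS = """\
127.0.0.1 localhost
192.168.0.100 mail.battleswireless.com mail
"""

if __name__ == "__main__":
    entries = parse_hosts(SAMPLE_HOSTS)
    print(forward_lookup(entries, "mail.battleswireless.com"))
    print(reverse_lookup(entries, "192.168.0.100"))
```

The point of listing the fully qualified name first is that the reverse lookup then returns the FQDN rather than a short alias.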
On the Solaris box what does 'hostname' return? And what does /etc/hosts have for 192.168.0.100?
Ordinarily a Red Hat system should be extremely reliable if it is running on good hardware and has been properly maintained. What sort of problem is this system having?
ASKER
hostname returns "mail". The file I found that I could read for hosts was hostname.hme0 (Ethernet card?); it has mail.battleswireless.com
What is in /etc/hosts for that hostname?
ASKER
This is the reply from cat hosts:
127.0.0.1 localhost
192.168.0.100 mail.battleswireless.com loghost
#
ASKER CERTIFIED SOLUTION
ASKER
Here is what is in the /etc/net/ticlts hosts file now:
#ident "@(#)hosts 1.2 92/07/14 SMI" /* SVr4.0 1.2 */
# RPC Hosts
mail mail
Do I just delete all that and put mail.battleswireless.com?
Replace each "mail" with "mail.battleswireless.com". You'll find that ticlts/hosts & ticots/hosts will be of the same form.
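If it helps to see the edit spelled out, here is a small Python sketch of the same substitution applied to the file contents you posted. The function name is my own and the tab used to rejoin the fields is an assumption about spacing. It swaps only whole fields that equal the short name and leaves comment lines alone.

```python
def qualify_hosts(text, short_name, fqdn):
    """Replace each whole field equal to short_name with fqdn, keeping comments."""
    out = []
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            out.append(line)  # leave comment lines untouched
            continue
        fields = [fqdn if field == short_name else field for field in line.split()]
        out.append("\t".join(fields))  # assumption: tab-separated fields
    return "\n".join(out) + "\n"


# The ticlts hosts file contents quoted earlier in this thread.
TICLTS_HOSTS = '''\
#ident "@(#)hosts 1.2 92/07/14 SMI" /* SVr4.0 1.2 */
# RPC Hosts
mail mail
'''

if __name__ == "__main__":
    print(qualify_hosts(TICLTS_HOSTS, "mail", "mail.battleswireless.com"), end="")
```

For a three-line file a text editor is just as quick; either way, keep a copy of the original first.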
ASKER
OK, I got all the steps done, and after rebooting, the Sun box now reports its hostname as mail.battleswireless.com, as it should. I tried disconnecting the DNS server by removing the Ethernet cable from the server. Sendmail kept working, and if you are already logged in it delivers mail as it should. We then halted the DNS box, and mail was still delivered if you already had your email program (Outlook Express) up and running. But when we halted the DNS box and then closed and reopened Outlook Express, it couldn't find mail.battleswireless.com. Bummer!
I ran an nslookup from the Sun box and got the following:
Server: ns1.netlogic.net
Address: 65.174.128.131
Non-authoritative answer:
name: mail.battleswireless.com
address: 192.168.0.100
The problem here is that your local DNS server is the only thing that can provide private IPs for the machines in your network in response to queries for those names. Until the Sun box was correctly configured, Sendmail would fail to find its hostname at startup if the Linux DNS server was down. When Sendmail starts, it attempts a reverse lookup of the IP(s) it is listening on to find its hostname. With the Linux DNS running that worked, but with it down that failed because the external name servers don't have that data. After the Solaris config was corrected, the reverse lookup worked because the data was in the hostname and hosts files.
Now the problem has simply moved to the client. It is failing because there's no DNS server available that equates mail.battleswireless.com to 192.168.0.100. That can be solved with local hosts file records on each client, but that will quickly become a mess if you have a number of machines on this internal network. The solution there is to run a local DNS server that has the private data, which is what you were doing with the Linux box.
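Here is a toy Python model of what the client is going through. It assumes the client checks its own hosts file first and then asks its configured DNS servers in order, taking the first reachable server's answer as final. The server dictionaries, their contents and the function name are all hypothetical stand-ins for your Linux box and the ISP's server; I don't know how your clients are actually configured.

```python
def resolve(name, hosts, dns_servers):
    """Look in the local hosts mapping first, then ask the first reachable DNS server."""
    if name in hosts:
        return hosts[name]
    for server in dns_servers:
        if not server["up"]:
            continue  # an unreachable server cannot answer, so try the next one
        # The first server that responds gives the final answer, even if it is "not found".
        return server["records"].get(name)
    return None


NAME = "mail.battleswireless.com"

# Hypothetical stand-ins: the internal server knows the private address, the ISP's does not.
LINUX_UP = {"up": True, "records": {NAME: "192.168.0.100"}}
LINUX_DOWN = {"up": False, "records": {NAME: "192.168.0.100"}}
ISP = {"up": True, "records": {}}

if __name__ == "__main__":
    print(resolve(NAME, {}, [LINUX_UP, ISP]))    # internal DNS answers
    print(resolve(NAME, {}, [LINUX_DOWN, ISP]))  # nobody has the private record
    print(resolve(NAME, {NAME: "192.168.0.100"}, [LINUX_DOWN, ISP]))  # hosts entry fills the gap
```

The third case is the hosts file workaround; it works, but every client needs its own copy of the entry, which is why a local DNS server scales better.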
Earlier you stated that the reason for changing name servers was due to problems with the Linux DNS server. What were those problems?