Solved

"service named start" starts 5 named processes?

Posted on 2004-08-11
Last Modified: 2010-04-20
Hi
I've noticed some strange behaviour on my WBL machine with respect to named and sendmail. This morning I started to see lots of sendmail errors rejecting mail because the sender address could not be resolved. I then checked my /etc/resolv.conf and found that all the entries ("domain xxx.xxx", "nameserver 127.0.0.1", "nameserver my.sec.dns", "nameserver some.other.dnsserver") had disappeared, and only 127.0.0.1 remained.
After fixing that and restarting named, I saw that it spawned 4 or 5 named processes. Sendmail started to work again, but later in the day it again started to moan; this time, when users on the server tried to send out, it said the sender address does not exist.

I checked my /etc/named.conf, it seems to be ok, and restarting it leaves no errors in the /var/log/messages, only shows all the notifies sent...

Any ideas, thoughts?
Question by:psimation
12 Comments
 
LVL 40

Accepted Solution

by:
jlevie earned 90 total points
ID: 11778455
If you are running a copy of named the only entries in resolv.conf should be:

search your.domain.tld
nameserver 127.0.0.1
nameserver 123.4.5.6      (only if you have a secondary server for your domain at 123.4.5.6)

You need to make sure that your named is properly configured and will answer queries. One way to do that is to explicitly query the server with 'host a-host.a-domain.tld 127.0.0.1'. I'd suggest trying both local hostnames within your domain and external hosts.
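
Since the original problem began with entries going missing from /etc/resolv.conf, it may help to check that file mechanically before testing named itself. The following is only a minimal sketch, assuming a POSIX shell with awk available; the function names list_nameservers and local_named_first are made up for illustration and are not part of any standard tool.

```shell
#!/bin/sh
# Print the nameserver addresses from a resolv.conf-style file, in file order.
# (list_nameservers is an illustrative name, not a standard utility.)
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1"
}

# Succeed only if the first nameserver entry is the local named at 127.0.0.1.
local_named_first() {
    [ "$(list_nameservers "$1" | head -n 1)" = "127.0.0.1" ]
}
```

It could be used as: local_named_first /etc/resolv.conf || echo "127.0.0.1 is not the first nameserver". This checks only the order of the nameserver lines, not whether named actually answers.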
 
LVL 51

Assisted Solution

by:ahoffmann
ahoffmann earned 35 total points
ID: 11781171
> ..  it again started to moan
And what did it complain about? Can you please give examples?
 
LVL 17

Author Comment

by:psimation
ID: 11781418
Hi ahoffmann

Well, sendmail basically just said it could not resolve the sender address (which is a local account that has entries in named.conf and the zone files, the works, i.e. a properly configured virtual domain on localhost).

It's as if named "stopped" but there are no signs of named being stopped or anything...
 
LVL 51

Expert Comment

by:ahoffmann
ID: 11781474
> ..  sendmail basically just said it could not resolve the sender address
Do you mean that sendmail cannot find an MX record for the sender address?
 
LVL 17

Author Comment

by:psimation
ID: 11781574
Hi ahoffmann

The sendmail logs do not show anything as specific as that; they only state "sender address not resolvable". Whether that is because sendmail cannot find a valid MX or not, I'm not sure. However, as stated, those "unresolvable" sender domains all have correctly configured DNS records. The same server that runs sendmail also runs named, all those local domains are configured on that same server, and when I run dig @127.0.0.1 xxx.xxx any, it resolves...
 
LVL 51

Expert Comment

by:ahoffmann
ID: 11781601
Can you do a
  dig <senderdomain> mx
 
LVL 17

Author Comment

by:psimation
ID: 11782034
Yes, I can do it now, BUT it most probably could NOT have done it at the time sendmail gave the errors. I restarted sendmail and named when I first noticed the problems, and that seemed to fix it. Unfortunately, I still don't know why it happened in the first place, nor why named seems to be spawning more processes than normal. This is what prompted me to believe that the problem might lie with named; yet, as I stated, I cannot see anything wrong with the configuration or any reason for the system to start 5 processes instead of just one. There are no errors logged when restarting named, so I don't know where to look for any problems...

Also, the main problem for me is that this could happen again at any time for all I know, causing my sendmail to basically reject all e-mails, and my users getting on my mammaries... ;)

 
 
LVL 40

Expert Comment

by:jlevie
ID: 11786222
Do the multiple instances of named persist, or do they exit after a few minutes? Do you have one or more (how many?) secondaries?
 
LVL 17

Author Comment

by:psimation
ID: 11786575
Hi Jim
They persist. I only have one secondary; however, I have already started to think along those lines. The secondary still runs on RH7.0 using bind 8.??, while the box in question is the one I re-installed with WBL + all the latest updates. The zone files are identical to what they were, and so is the syntax used in named.conf, but I'm not sure if that can cause problems. If the primary cannot send notifies to the secondary, surely that will not cause the primary to "malfunction"?
Today I saw again at one stage that the "primary" was unable to look up some domains, but that was while there were known network issues from the ISP with international bandwidth. However, what happened was this: I did a "test" dig to an intl. site and the response was from 127.0.0.1 that the query timed out. After I restarted named, the same query worked, this time using the secondary nameserver to do the lookup since 127.0.0.1 could not find it. (The secondary server is located on a different network that did not have a network problem at the time.) It's as if the nameserver goes into limbo or freezes for some reason and does not even try any further servers listed in resolv.conf...
 
LVL 40

Expert Comment

by:jlevie
ID: 11786886
> I did a "test" dig to an intl. site and the response was from 127.0.0.1 that the query timed out.

Okay, that's what should happen if your Internet link is having problems. The local named is responding, it just can't get any data from remote servers.

> After I restarted named, the same query worked, this time using the secondary nameserver to do the lookup since 127.0.0.1 could not find it.

That indicates to me that after the restart the local copy of named was not responding to requests at all, so dig tried the second name server listed in resolv.conf, which worked.

Using 'host', as I described earlier, is a better test, because specifying the name server IP restricts the query to exactly that server.
 
LVL 17

Author Comment

by:psimation
ID: 11787771
Sorry, Jim, I missed you there. By "Using 'host', as described earlier...", did you mean "search your.domain.tld"? If so, I did make that change. In any event, I thought the reason for having more than one entry in resolv.conf is exactly for redundancy, i.e., if one cannot resolve, try the next in the list? (I admit, I don't know much about DNS...) But you say the response was actually correct, in that the name stays unresolved even though an alternate nameserver in the list would be able to resolve it?
You don't mention anything about the difference in BIND versions; can I deduce that it would not play a role in this?
 
LVL 40

Expert Comment

by:jlevie
ID: 11788276
I'm talking about executing a 'host a-host.a-domain.tld 127.0.0.1' command on WBL to check the operation of the local name server that is supposed to be listening at 127.0.0.1. In a like manner 'host nother-host.a-domain 123.4.5.6' can be used to check to see if your secondary (at 123.4.5.6) is resolving. In each case the query will go to exactly and only the IP specified for the name server.
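
The per-server check described above could be wrapped in a small loop. This is a rough sketch, assuming a POSIX shell and the standard 'host' utility; check_servers and the HOST_CMD override variable are illustrative names introduced here, not existing tools. Note that "FAILED" covers both a server that does not respond and one that responds with a negative answer, since 'host' exits non-zero in either case.

```shell
#!/bin/sh
# HOST_CMD is an illustrative override hook; it defaults to the real `host` utility.
HOST_CMD=${HOST_CMD:-host}

# Query one hostname against each listed name server in turn.
# Every query goes only to the server given, so resolv.conf fallback cannot hide a dead server.
check_servers() {
    name=$1
    shift
    for server in "$@"; do
        if "$HOST_CMD" "$name" "$server" >/dev/null 2>&1; then
            echo "$server: OK"
        else
            echo "$server: FAILED"
        fi
    done
}
```

For example: check_servers a-host.a-domain.tld 127.0.0.1 123.4.5.6 prints one status line per server, using the placeholder addresses from this thread.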

Listing two or more name servers in resolv.conf is for redundancy. If, and only if, the first name server does not respond at all (as in a timeout while connecting to the name server), the second will be tried, and so forth. With that in mind, take another look at my previous comment.
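
To make that fallback rule concrete, here is a toy model of it in shell. It is a simplification of the rule as stated above, not a description of any particular resolver library: real implementations add retries and may treat certain error replies differently. The function name server_used and the outcome labels (answer, nxdomain, timeout) are hypothetical.

```shell
#!/bin/sh
# Toy model: arguments are the outcomes per name server, in resolv.conf order.
# Each outcome is one of the hypothetical labels: answer, nxdomain, timeout.
# Prints the 1-based index of the server whose reply is used, or "none".
server_used() {
    i=1
    for outcome in "$@"; do
        case $outcome in
            timeout) ;;                    # no reply at all: move on to the next server
            *) echo "$i"; return 0 ;;      # any reply, even a negative one, ends the search
        esac
        i=$((i + 1))
    done
    echo none
    return 1
}
```

Under this model, a first server that replies "not found" is final, while a first server that times out hands the query to the second.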

As far as the resolver libraries are concerned, it shouldn't matter whether they are issuing the query to a Bind 8 or 9 server. However, one shouldn't try to use the zone files from a Bind 8 server 'as-is' on a Bind 9 server. At the least, the zone files all need a $TTL directive and the localhost definition needs to be changed. And of course named.conf needs to be updated a bit.
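
One quick way to find zone files that still lack a $TTL directive is sketched below. It assumes a POSIX shell with grep and that the directive, when present, starts at the beginning of a line; has_ttl_directive and zones_missing_ttl are illustrative names, and it checks only for the presence of the directive, not the other changes mentioned above.

```shell
#!/bin/sh
# Succeed if the zone file contains a $TTL directive at the start of a line.
has_ttl_directive() {
    grep -qi '^\$TTL[[:space:]]' "$1"
}

# Print each zone file (given as arguments) that has no $TTL directive.
zones_missing_ttl() {
    for zone in "$@"; do
        has_ttl_directive "$zone" || echo "$zone"
    done
}
```

It would be run against wherever the zone files live, for example zones_missing_ttl /var/named/*.zone (that path is an assumption; the thread does not say where the zone files are kept).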

Question has a verified solution.
