Solved

Named (BIND) 'spontaneously' dying

Posted on 2004-09-28
Last Modified: 2012-06-21
Our named instance keeps dying, at seemingly random times.
We had this problem a while ago (maybe 2 months), so after some advice from user groups etc. I made some changes to the zone configs.

That didn't work at first and it still died randomly, but about a day after those changes were made, it stayed alive.

Two months later, it's happening again.

THIS time though, I can see something in messages that may be of help. That doesn't mean it wasn't there the first time; I'm very new to a LOT of things in Linux, so it's VERY possible I overlooked it before. (The user groups were the ones that informed me of the EXISTENCE of messages, for example!)

We have been getting around the problem by restarting named whenever it goes down
(service named restart), and this time round I wrote a Perl script to restart it every 30 mins through a cron job. (It dies anywhere from every 10 mins to every couple of hours.)
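
A restart on a fixed timer bounces named even when it is healthy. A gentler variant restarts it only when no named process is found. This is a minimal sketch in shell (not the Perl script mentioned above, which isn't shown), assuming `pidof` and the `service` command are available as on a stock Red Hat install. `watch_named`, `CHECK_CMD` and `RESTART_CMD` are hypothetical names made up for the example.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: restart named only when it is not running.
# CHECK_CMD and RESTART_CMD are illustrative variables; they can be overridden
# so the logic can be tried out without touching a real service.
CHECK_CMD=${CHECK_CMD:-"pidof named"}
RESTART_CMD=${RESTART_CMD:-"service named restart"}

watch_named() {
    # The check command succeeds (exit status 0) when named is alive.
    if $CHECK_CMD >/dev/null 2>&1; then
        echo "named is running"
    else
        echo "named is down, restarting"
        $RESTART_CMD
    fi
}
```

Saved as a script with a final `watch_named` call, this could be run from cron every few minutes instead of an unconditional restart. It only papers over the crash; it does not fix whatever makes named exit.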

So -

We are running
- Redhat 7.3 (we can't upgrade redhat sorry)
- BIND 9.2.0

This is the part of messages that seems to show 'why' it's dying, but as if I can decipher it ;)

Sep 27 15:14:53 linux01 named[24776]: message.c:809: REQUIRE(*rdataset == ((void *)0)) failed
Sep 27 15:14:53 linux01 named[24776]: exiting (due to assertion failure)
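
To see how often this happens, the assertion exits can be counted in the log. This is a minimal sketch, assuming grep and a syslog file in the same format as the lines quoted above. The function name `count_assertion_exits` is made up for the example, and the log path is whatever file your syslog writes to (commonly /var/log/messages on Red Hat, but that is an assumption here).

```shell
#!/bin/sh
# Hypothetical helper: count how many times named exited on an assertion
# failure, based on the log line format shown in the excerpt above.
count_assertion_exits() {
    # $1: path to a syslog-style file
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'named\[[0-9]*]: exiting (due to assertion failure)' "$1"
}
```

For example, `count_assertion_exits /var/log/messages` prints the number of crashes recorded in that file, which helps judge whether a later change made any difference.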


This is what happens when we restart it (I think) after it dies

Sep 27 15:36:08 linux01 named: named shutdown failed
Sep 27 15:36:08 linux01 named[25206]: starting BIND 9.2.0 -u named
Sep 27 15:36:08 linux01 named[25206]: using 1 CPU
Sep 27 15:36:08 linux01 named[25209]: loading configuration from '/etc/named.conf'
Sep 27 15:36:08 linux01 named[25209]: no IPv6 interfaces found


I would appreciate ideas on how to permanently solve this problem, because it is a huge nuisance to us and can reduce productivity a lot.
It would also be of huge benefit if someone could explain what named/BIND does in more detail (i.e. how it works) so I can have a better understanding of it and its problems, and why it might be dying etc. That is, the more you can improve my understanding, the better! May be worth extra points if I find it very useful.  :)

...
Apart from those lines from messages above, there were quite a few saying 'lame server' etc. but I've been assured they are harmless, at least in regards to our current problem.

Here is another excerpt. (I have replaced the domains & stuff with ****)

Sep 27 14:59:27 linux01 named[24776]: zone *****1.com.au/IN: loading master file *****1.com.au: file not found
Sep 27 14:59:27 linux01 named[24776]: *****2.com.au:1: no TTL specified; using SOA MINTTL instead
Sep 27 14:59:27 linux01 named[24776]: zone *****2.com.au/IN: loaded serial 2001091501



Help please.  :|

Cheers,
Glauron
Question by:Glauron
16 Comments
 
LVL 40

Expert Comment

by:jlevie
ID: 12175679
"zone *****1.com.au/IN: loading master file *****1.com.au: file not found" and "*****2.com.au:1: no TTL specified; using SOA MINTTL instead" are certain clues that your named config files aren't right for a 9.2 copy of BIND. Without seeing all of the files involved I can't say whether "message.c:809: REQUIRE(*rdataset == ((void *)0)) failed" is a result of a configuration error or not.

You can post your named.conf file and all of the zone files here (unmodified) or send them as attachments to jlevie@experts-exchange.com and I'll look them over.
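
As a rough first check for the "no TTL specified" warning quoted above: BIND 9 normally logs it when a zone file has no $TTL directive ahead of its first record. This is a minimal sketch, assuming grep is available. The function names `zone_has_ttl` and `report_missing_ttl` are made up for the example, and the check only looks for a $TTL line anywhere in the file, not for its position.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the zone file contains a $TTL directive.
zone_has_ttl() {
    # $1: path to a zone file; match lines such as "$TTL 86400" or "$TTL 1d"
    grep -Eiq '^[[:space:]]*\$TTL[[:space:]]+[0-9]' "$1"
}

# Hypothetical helper: list every given zone file that lacks a $TTL line.
report_missing_ttl() {
    for zone_file in "$@"; do
        if ! zone_has_ttl "$zone_file"; then
            echo "no \$TTL directive: $zone_file"
        fi
    done
}
```

Run it over the zone files named in named.conf, for example `report_missing_ttl /path/to/zones/*`, where the path is a placeholder for wherever your zone files live. The "file not found" message is a separate issue: it means named.conf points at a zone file that isn't where named expects it.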
 
LVL 1

Author Comment

by:Glauron
ID: 12175756
emailed  =)
 
LVL 40

Expert Comment

by:jlevie
ID: 12175958
Got it, reply on the way.
 
LVL 5

Expert Comment

by:napoleon41
ID: 12183812
Hey!  I understand the need for security and everything, but some of us are trying to learn here.  LOL

Could you at least post the part that ends up being the problem?  It's nice to sit in on discussions that I'm not very knowledgeable about yet and glean a bit.  ;-)
 
LVL 40

Expert Comment

by:jlevie
ID: 12184040
I'll describe what the errors are.
 
LVL 1

Author Comment

by:Glauron
ID: 12185434
Yep. No problem with that. =)

I'll trust jlevie with whatever he deems appropriate to display here. I know he won't go too far with this info.

Feel free to post whatever you think might help, j.
 
LVL 1

Author Comment

by:Glauron
ID: 12750251
Yo - Jlevie.

=)

How'd you go with those zone files? Any luck?
It did it again (named dying) and has now started working properly again.
 

Expert Comment

by:dsimco
ID: 12992512
I am having this same problem. What was your solution?
 
LVL 1

Author Comment

by:Glauron
ID: 12999059
Hey, glad to know I'm not the only one!  =)
But not glad to know that someone else is having this problem apart from that. =P

No solution yet,

       Jlevie, did you have any ideas?

--------

I got around the problem temporarily by setting up a script to restart the named instance every 15 mins or so. Not perfect by any means, but when it worked, it saved restarting it manually.
Big pain though.

Command I use to restart is:
service named restart

Nobody I have spoken to has any idea why this is happening. My guess is it has to be a bad zone that somehow becomes fatal to named at certain times, maybe because of changes on the other server or something. Who knows! I don't, that's for sure.

Well, good luck;  and let me know if you get any ideas as well =)

- Glauron
 

Expert Comment

by:dsimco
ID: 12999180
Heya Glauron: I have started a new thread addressing this issue and have gotten some good info. You should check it out.
http://www.experts-exchange.com/Networking/Linux_Networking/Q_21267519.html#12999162 

I believe we are on the right track.
 
LVL 1

Author Comment

by:Glauron
ID: 12999297
Yeah! lol - after answering that just now, I found jlevie's profile & followed his answers, and came across that question! I thought, wow! That looks almost exactly like my question!

Then I noticed the date as today (actually yesterday from Oz =P )  & saw it was you!
=D

I tried Wesly's advice & the install went without any hiccups.

So now I play the waiting game =)

Well done, we might solve it after all!
 
LVL 1

Author Comment

by:Glauron
ID: 12999337
Anyone else following this post, please also view the question dsimco posted above, found at:

http://www.experts-exchange.com/Networking/Linux_Networking/Q_21267519.html

Has some very useful info pertaining to this issue.
Basically, it seems to be a problem with BIND, which needs to be updated; the OS also needs updating for security purposes.  ...

Glau
 
LVL 38

Accepted Solution

by:
wesly_chen earned 2000 total points
ID: 12999368
Hi,

   Red Hat has discontinued support for Red Hat 7.3, but you can still download the latest patches from:
http://download.fedoralegacy.org/redhat/7.3/updates/i386/

   Besides, you can use apt-get to automate the update process:
As root:
wget http://ftp.freshrpms.net/pub/freshrpms/redhat/7.3/apt/apt-0.5.5cnc5-fr0.rh73.2.i386.rpm
rpm -ivh apt-0.5.5cnc5-fr0.rh73.2.i386.rpm
apt-get dist-upgrade

   By the way, upgrading the kernel doesn't mean upgrading the OS to RH 9 or Fedora. The latest kernel for RH 7.3 is:
http://download.fedoralegacy.org/redhat/7.3/updates/i386/kernel-2.4.20-37.7.legacy.i686.rpm
Note that after a kernel upgrade you need to reboot to load the new kernel.

Regards,

Wesly