Solved

Named (BIND) 'spontaneously' dying

Posted on 2004-09-28
Last Modified: 2012-06-21
Our named instance keeps dying, at seemingly random times.
We had this problem a while ago (about 2 months back), so after some advice from user groups etc. I made some changes to the zone configs.

That didn't work and it still died randomly, but then, about a day after those changes were made, it stayed alive.

Two months later, it's happening again.

THIS time tho, I can see something in messages that may be of help. That doesn't mean it wasn't there the first time. I'm very new to a LOT of things in Linux, so it's VERY possible I overlooked it before. (The user groups were the ones that informed me of the EXISTENCE of messages, for example!)

We have been getting around the problem by restarting named whenever it goes down (service named restart), and this time 'round I wrote a Perl script to restart it every 30 mins through a cron job. (It dies anywhere from every 10 mins to every couple of hours.)
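
A blind restart on a timer also interrupts a healthy named, and can still leave DNS down for most of the interval. A minimal sketch of a check-then-restart alternative, assuming `pidof` is available on the box; the names `CHECK_CMD`, `RESTART_CMD`, and `check_named` are illustrative and are not taken from the original Perl script:

```shell
#!/bin/sh
# Restart named only when it is not running.
# CHECK_CMD, RESTART_CMD and check_named are hypothetical names for this sketch.
CHECK_CMD="${CHECK_CMD:-pidof named}"
RESTART_CMD="${RESTART_CMD:-service named restart}"

check_named() {
    # CHECK_CMD exits 0 when a named process exists.
    if $CHECK_CMD >/dev/null 2>&1; then
        echo "named is running"
    else
        echo "named is down, restarting"
        $RESTART_CMD
    fi
}
```

Saved to a file with a final `check_named` line, this could be run from cron every few minutes, e.g. `*/5 * * * * /usr/local/sbin/check_named.sh` (the path is hypothetical). It only masks the crash; it does not address the cause.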

So -

We are running
- Red Hat 7.3 (we can't upgrade Red Hat, sorry)
- BIND 9.2.0

This is the part of messages that seems to show 'why' it's dying, but I can't decipher it ;)

Sep 27 15:14:53 linux01 named[24776]: message.c:809: REQUIRE(*rdataset == ((void *)0)) failed
Sep 27 15:14:53 linux01 named[24776]: exiting (due to assertion failure)
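
The two lines above say that named hit an internal consistency check (a REQUIRE in message.c) and shut itself down on purpose, rather than being killed from outside. To see how often this happens and whether it is always the same check, the log can be searched for both messages. A minimal sketch, assuming the syslog file is /var/log/messages and the line format shown above; the function names are made up for this example:

```shell
# Hypothetical helpers; the default log path is an assumption.

# Count how many times named exited on an assertion failure.
count_assertion_exits() {
    logfile="${1:-/var/log/messages}"
    grep -c 'named\[[0-9]*]: exiting (due to assertion failure)' "$logfile"
}

# Print the REQUIRE(...) lines, which name the source file and line of the check.
show_assertions() {
    logfile="${1:-/var/log/messages}"
    grep 'named\[[0-9]*]: .*REQUIRE(.*) failed' "$logfile"
}
```

If `show_assertions` always reports the same file and line, that points to one repeatable trigger rather than several unrelated faults.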


This is what happens when we restart it (I think) after it dies

Sep 27 15:36:08 linux01 named: named shutdown failed
Sep 27 15:36:08 linux01 named[25206]: starting BIND 9.2.0 -u named
Sep 27 15:36:08 linux01 named[25206]: using 1 CPU
Sep 27 15:36:08 linux01 named[25209]: loading configuration from '/etc/named.conf'
Sep 27 15:36:08 linux01 named[25209]: no IPv6 interfaces found


I would appreciate ideas on how to permanently solve this problem, because it is a huge nuisance to us and can reduce productivity a lot.
It would also be of huge benefit if someone could explain what named/BIND does in more detail (i.e. how it works) so I can have a better understanding of it and its problems, and why it might be dying etc. That is, the more you can improve my understanding the better! May be worth extra points if I find it very useful.  :)

...
Apart from those lines from messages above, there were quite a few saying 'lame server' etc. but I've been assured they are harmless, at least in regards to our current problem.

Here is another excerpt. (I have replaced the domains & stuff with ****)

Sep 27 14:59:27 linux01 named[24776]: zone *****1.com.au/IN: loading master file *****1.com.au: file not found
Sep 27 14:59:27 linux01 named[24776]: *****2.com.au:1: no TTL specified; using SOA MINTTL instead
Sep 27 14:59:27 linux01 named[24776]: zone *****2.com.au/IN: loaded serial 2001091501



Help please.  :|

Cheers,
Glauron
Question by:Glauron
16 Comments

LVL 40

Expert Comment

by:jlevie
"zone *****1.com.au/IN: loading master file *****1.com.au: file not found" and "*****2.com.au:1: no TTL specified; using SOA MINTTL instead" are certain clues that your named config files aren't right for a 9.2 copy of BIND. Without seeing all of the files involved I can't say whether "message.c:809: REQUIRE(*rdataset == ((void *)0)) failed" is a result of a configuration error or not.

You can post your named.conf file and all of the zone files here (unmodified) or send them as attachments to jlevie@experts-exchange.com and I'll look them over.
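
For anyone following along: "file not found" means the file named in that zone's entry in named.conf could not be opened, and "no TTL specified" means the zone file has no $TTL directive, so BIND 9 falls back to the SOA minimum value. A minimal sketch for listing zone files that lack a $TTL line, assuming the zone files live in /var/named (adjust to the directory set in named.conf); the function names are made up for this example:

```shell
# Hypothetical helpers; /var/named is an assumed default zone directory.

# Succeeds when the file has a $TTL directive at the start of a line.
zone_has_ttl() {
    grep -qi '^\$TTL[[:space:]]' "$1"
}

# Print every regular file in the zone directory that has no $TTL directive.
list_zones_missing_ttl() {
    zonedir="${1:-/var/named}"
    for f in "$zonedir"/*; do
        [ -f "$f" ] || continue
        zone_has_ttl "$f" || echo "$f"
    done
}
```

Files that are not ordinary zone files (a root hints file, for example) may show up in the list as well. BIND 9 also provides named-checkconf and named-checkzone, which report this kind of problem before a reload, if they are installed on the system.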

LVL 1

Author Comment

by:Glauron
emailed  =)

LVL 40

Expert Comment

by:jlevie
Got it, reply on the way.

LVL 5

Expert Comment

by:napoleon41
Hey!  I understand the need for security and everything, but some of us are trying to learn here.  LOL

Could you at least post the part that ends up being the problem?  It's nice to sit in on discussions that I'm not very knowledgeable about yet and glean a bit.  ;-)

LVL 40

Expert Comment

by:jlevie
I'll describe what the errors are.

LVL 1

Author Comment

by:Glauron
Yep. No problem with that. =)

I'll trust jlevie with whatever he deems appropriate to display here. I know he won't go too far with this info.

Feel free to post whatever you think might help j.

LVL 1

Author Comment

by:Glauron
Yo - Jlevie.

=)

How'd u go with those zone files? - any luck?
It did it again (named dying), and has now started working properly again.

Expert Comment

by:dsimco
I am having this same problem. What was your solution?

LVL 1

Author Comment

by:Glauron
Hey, glad to know I'm not the only one!  =)
But not glad to know that someone else is having this problem apart from that. =P

No solution yet,

       Jlevie, did you have any ideas?

--------

I got around the problem temporarily by setting a script to restart the named instance every 15 mins or so. Not at all perfect by any means, but when it worked, it saved restarting it manually.
Big pain tho.

Command I use to restart is:
service named restart

Everyone I have spoken to has no idea why this is happening. My guess is it has to be a bad zone that somehow becomes fatal to named at certain times, maybe because of changes on the other server or something. Who knows! I don't, that's for sure.

Well, good luck;  and let me know if you get any ideas as well =)

- Glauron

Expert Comment

by:dsimco
Heya Glauron: I have started a new thread addressing this issue and have gotten some good info. You should check it out.
http://www.experts-exchange.com/Networking/Linux_Networking/Q_21267519.html#12999162

I believe we are on the right track.

LVL 1

Author Comment

by:Glauron
Yeah! lol - after answering that, I found jlevie's profile & followed his answers, and came across that question! I thought, wow! That looks almost exactly like my question!

Then I noticed the date as today (actually yesterday from Oz =P )  & saw it was you!
=D

I tried Wesly's advice & the install went without any hiccups.

So now I play the waiting game =)

Well done, we might solve it after all!

LVL 1

Author Comment

by:Glauron
Anyone else following this post, please also view the question dsimco posted above, found at:

http://www.experts-exchange.com/Networking/Linux_Networking/Q_21267519.html

Has some very useful info pertaining to this issue.
Basically, it seems the problem is in BIND itself, which needs to be updated; the OS also needs updating for security purposes.  ...

Glau

LVL 38

Accepted Solution

by:wesly_chen earned 500 total points
Hi,

   Red Hat has discontinued support for Red Hat 7.3, but you can still download the latest patches from:
http://download.fedoralegacy.org/redhat/7.3/updates/i386/

   You can also use apt-get to automate the update process:
As root:
wget http://ftp.freshrpms.net/pub/freshrpms/redhat/7.3/apt/apt-0.5.5cnc5-fr0.rh73.2.i386.rpm
rpm -ivh apt-0.5.5cnc5-fr0.rh73.2.i386.rpm
apt-get dist-upgrade

   By the way, upgrading the kernel doesn't mean upgrading the OS to RH 9 or Fedora. The latest kernel for RH 7.3 is:
http://download.fedoralegacy.org/redhat/7.3/updates/i386/kernel-2.4.20-37.7.legacy.i686.rpm
Note that a kernel upgrade requires a reboot to load the new kernel.

Regards,

Wesly
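
After an update like the one above, it can be worth confirming that the running named is actually the new build, since named logs its version each time it starts. A minimal sketch, assuming the "starting BIND" log format shown in the question and /var/log/messages as the log file; `last_started_bind_version` is a made-up name for this example:

```shell
# Hypothetical helper; the default log path is an assumption.

# Print the version from the most recent "starting BIND" line in the log.
last_started_bind_version() {
    logfile="${1:-/var/log/messages}"
    grep 'named\[[0-9]*]: starting BIND ' "$logfile" | tail -n 1 |
        sed 's/.*starting BIND \([^ ]*\).*/\1/'
}
```

If this still prints 9.2.0 after the packages were updated, named has probably not been restarted since the install; `rpm -q bind` shows which package version is installed on disk.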