Need to stop a specific Nagios alert on Nagios 3.2.0 on RHEL 5.5

cbecker001 asked:
A previous admin set up Nagios to provide a lot of alerting before he left.  It's all good and helpful.  However, one server neglected to respond to an email at a specific hour--04:01--and for eight days now Nagios keeps sending out Critical warnings that it hasn't heard back from that email.  It's heard from every email before and after that time, but it hasn't heard from that specific email.  So, we have no problem, we just need to tell Nagios to stop notifying us about that one stale alert but continue checking for future alerts.  Unfortunately, I don't know enough about Nagios to tell it that.  

We are running version 3.2.0 of Nagios on RHEL 5.5.

Top Expert 2009

Commented:
You can do one of two things.

From the Nagios web interface, just disable notifications for that host.

Or go to

/usr/local/nagios/etc/objects

find the file that defines that host, edit it, and delete the part that generates those notifications (the "define service" section for that check).
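
If it is not obvious which object file defines the host, something like the following may help locate it. This is only a sketch: find_host_cfg is a hypothetical helper name, the directory is the one mentioned above, and it only matches definitions where host_name holds a single host rather than a comma-separated list.

```shell
#!/bin/sh
# Sketch only: list the files under a directory that contain a
# "host_name <host>" line. find_host_cfg is a hypothetical helper.
find_host_cfg() {
    dir=$1; host=$2
    # -r: recurse, -l: print matching file names only, -E: extended regex
    grep -rlE "^[[:space:]]*host_name[[:space:]]+${host}([[:space:]]|\$)" "$dir"
}

# Usage (placeholder host name):
#   find_host_cfg /usr/local/nagios/etc/objects localhost
```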

Author

Commented:
I want to continue monitoring that host, though.  I just don't want to hear about that single email event.  Will your suggestion continue the monitoring of that host?

And, also, how do I bring up the nagios web interface?
DevOps Engineer
Commented:
Acknowledge the service problem with a non-sticky acknowledgement.
"Sticky" acknowledgements - You can now designate host and service acknowledgements as being "sticky" or not. Sticky acknowledgements suppress notifications until a host or service fully recovers to an UP or OK state. Non-sticky acknowledgements only suppress notifications until a host or service changes state.
Acknowledging-in-Nagios.pdf
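
For anyone who prefers the command line to the web form, the same non-sticky acknowledgement can probably be submitted through the Nagios external command file. A minimal sketch, assuming external commands are enabled and the command pipe is at the default source-install path /usr/local/nagios/var/rw/nagios.cmd; ack_line is a hypothetical helper, and the host, service, and author names in the usage comment are placeholders.

```shell
#!/bin/sh
# Sketch only: build an ACKNOWLEDGE_SVC_PROBLEM external command line.
# Field order: host;service;sticky;notify;persistent;author;comment
# sticky=2 means a sticky acknowledgement; any other value (1 here) is non-sticky.
# notify=0 sends no "acknowledged" notification; persistent=1 keeps the comment.
ack_line() {
    host=$1; service=$2; author=$3; comment=$4
    now=$(date +%s)
    printf '[%s] ACKNOWLEDGE_SVC_PROBLEM;%s;%s;1;0;1;%s;%s\n' \
        "$now" "$host" "$service" "$author" "$comment"
}

# Assumed default location of the command pipe; adjust to your install.
CMD_FILE=${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}

# Usage (placeholder names), run as a user allowed to write to the pipe:
#   ack_line mailhost "Email Check" cbecker001 "stale 04:01 alert" > "$CMD_FILE"
```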
Top Expert 2009

Commented:
Yes, it will.

Go to /usr/local/nagios/etc/objects

Find the file for the host, for example: vpsserver.cfg

Edit the file.

Find the service definition whose command performs that check.

Just comment out that definition.

You will see something like this:


define service{
  use                  local-service   ; Inherit default values from a template
  host_name            localhost
  service_description  check_memory
  check_command        check_memory
  }


Just change it to look like this:


#define service{
#  use                  local-service   ; Inherit default values from a template
#  host_name            localhost
#  service_description  check_memory
#  check_command        check_memory
#  }
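
After editing an object file, it is probably worth validating the configuration before reloading, since a half-commented definition will stop Nagios from starting. A rough sketch, assuming the default source-install paths and an init script installed as "nagios" (both are assumptions; verify_and_reload is a hypothetical helper):

```shell
#!/bin/sh
# Sketch only: validate the Nagios configuration, then reload if it passes.
# Paths are the assumed source-install defaults; override via the environment.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

verify_and_reload() {
    if [ ! -x "$NAGIOS_BIN" ]; then
        echo "nagios binary not found at $NAGIOS_BIN" >&2
        return 2
    fi
    # -v runs the pre-flight check and exits non-zero on configuration errors
    "$NAGIOS_BIN" -v "$NAGIOS_CFG" || return 1
    # Assumes the init script is installed as "nagios"
    service nagios reload
}

# Usage: verify_and_reload
```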



Author

Commented:
kosarajudeepak:

Okay, I'm at a web page for the host in question and I selected "Acknowledge".  This opened a page with the following along the left-hand side:

Host Name:      
Service:      
Sticky Acknowledgement:      
Send Notification:      
Persistent Comment:      
Author (Your Name):      
Comment:

And you are saying I should UNCHECK the "Sticky Acknowledgement" and then click "Commit"?

Is that correct?
Top Expert 2009

Commented:
@kosarajudeepak

Sorry, I just wanted to know.

Isn't that almost the same as clicking on the host and choosing

Disable active checks of this host
or
Disable notifications for all services on this host

But if I know that this service will never come back up, then why not just delete its definition from the .cfg file?

Otherwise it will always show up as a BAD host in red!
It will look awful!


Deepak Kosaraju, DevOps Engineer
Commented:
@fosiul01
As the user said, he wants to disable alerts only until the service recovers. I think you missed this line:
" we just need to tell Nagios to stop notifying us about that one stale alert but continue checking for future alerts "
Since he wants future alerts for the same service and only wants to stop the alerts for its present status, it's better to submit a non-sticky acknowledgement.

@cbecker001
Yes, uncheck it, check the "Persistent Comment" field, add a note in the Comment field explaining why you are acknowledging, and click Commit.

Author

Commented:
fosiul01, I don't think I have explained myself well.

The warning that comes up is for a nagios check that I WANT to happen.  I WANT this check to continue into the future.  However, the check discovered ONE failed reply eight days ago and notified me of it and I want to stop it from continuing to notify me.  That is all.  So, once this old notification stops, the checking that caused the notification will still be going on to notify me in case something happens in the future.

Author

Commented:
Okay, kosarajudeepak, I did that.  I'll wait to see if I get any more alerts.
Deepak Kosaraju, DevOps Engineer

Commented:
It won't send any alerts until the state changes from stale, which would probably be the UNKNOWN state. If it changes to CRITICAL, WARNING, or OK, it will start sending alerts again, because there has been a state change and we submitted a non-sticky acknowledgement. Good luck.
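
To confirm the acknowledgement is in place without waiting for the next email, the status file can be inspected. The sketch below assumes the Nagios 3.x status file layout (servicestatus blocks with key=value lines) and the default path /usr/local/nagios/var/status.dat; svc_ack_status is a hypothetical helper and the host and service names are placeholders.

```shell
#!/bin/sh
# Sketch only: print current_state and the acknowledged flag for one service
# by scanning the "servicestatus { ... }" blocks of a Nagios status file.
svc_ack_status() {
    file=$1; host=$2; service=$3
    awk -v h="$host" -v s="$service" '
        /^servicestatus[ \t]*\{/ { in_blk = 1; hn = ""; sd = ""; st = ""; ack = ""; next }
        in_blk && /^[ \t]*\}/ {
            # End of block: report only if it is the service we asked about
            if (hn == h && sd == s) print "state=" st " acknowledged=" ack
            in_blk = 0; next
        }
        in_blk {
            line = $0
            sub(/^[ \t]+/, "", line)          # drop leading indentation
            eq = index(line, "=")
            if (eq == 0) next
            key = substr(line, 1, eq - 1); val = substr(line, eq + 1)
            if (key == "host_name") hn = val
            else if (key == "service_description") sd = val
            else if (key == "current_state") st = val
            else if (key == "problem_has_been_acknowledged") ack = val
        }
    ' "$file"
}

# Usage (placeholder names):
#   svc_ack_status /usr/local/nagios/var/status.dat mailhost "Email Check"
```

In the output, state is the numeric service state (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and acknowledged is 1 while the acknowledgement is active.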
Top Expert 2009

Commented:
Hmm, OK.

Then what is wrong with disabling notifications, as I said? Or with "Disable active checks of this host"?

Anyway, whichever way you do it, if the job is done, that's the main thing.



Deepak Kosaraju, DevOps Engineer

Commented:
@fosiul01
I mean that disabling notifications or active checks completely disables future notifications and checks until you come back and enable them. Acknowledging only suppresses notifications for the current status; once the state changes or the service recovers, Nagios automatically starts sending notifications again.
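
To make the contrast concrete, here is roughly what the disable route looks like through the external command file. It is a sketch under the same assumptions as before (external commands enabled, default command pipe path); svc_cmd is a hypothetical helper and the names in the usage comments are placeholders. Note that, unlike an acknowledgement, nothing undoes the first command until someone sends the second.

```shell
#!/bin/sh
# Sketch only: format a service-level external command that takes
# just a host name and a service description.
svc_cmd() {
    name=$1; host=$2; service=$3
    printf '[%s] %s;%s;%s\n' "$(date +%s)" "$name" "$host" "$service"
}

# Assumed default location of the command pipe; adjust to your install.
CMD_FILE=${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}

# Usage (placeholder names):
#   svc_cmd DISABLE_SVC_NOTIFICATIONS mailhost "Email Check" > "$CMD_FILE"
#   ...and later, someone has to remember to run:
#   svc_cmd ENABLE_SVC_NOTIFICATIONS mailhost "Email Check" > "$CMD_FILE"
```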

No hard feelings; it's a matter of how we each understood the question. Take it easy. We are here just to share our knowledge. Have a good weekend, my dear friend.
Top Expert 2009

Commented:
LOL!!
No, no, you got me wrong. I never take anything the wrong way on EE!
I was just trying to understand what the main difference between acknowledging and disabling is, that's it.


I will have a look at that PDF later on.

Have a good weekend.

Author

Commented:
Thank you both.  kosarajudeepak did understand my problem correctly and his solution solved it properly.  However, I REALLY appreciate the prompt response from both of you.  I'm going to close the question now.

Thanks again!