Solved

How to make Nagios hit a URL whenever a service goes down

Posted on 2011-02-27
Last Modified: 2012-05-11
Hi Folks,

I have Nagios monitoring in our environment. Currently we are monitoring some HTTP URLs.
Whenever a URL returns a blank page, Nagios sends out an alert. To resolve the issue, we go to another URL and hit Enter.
We want to automate this: whenever Nagios sends an alert, we want Nagios to invoke that particular URL.
Please help me with how this can be implemented. I am only a little familiar with Nagios. Please point me to any resources that are available.

Thank you,
Joe
Question by:jayatallen
13 Comments
 
LVL 5

Expert Comment

by:rysic
ID: 34992864
What are you using to monitor the HTTP URLs? Probably some Nagios script, so you have to edit that script or write your own (one that contains the HTTP checking script), which first checks HTTP and, if the result is wrong, goes to some URL (maybe via links) and returns the result to Nagios.
 

Author Comment

by:jayatallen
ID: 34993422
Thanks rysic for your reply.
I have pasted the service definition and check command below.

services.cfg has the following entry for check_command:

check_command                   check_http_web!clroutweb1.svr.clearone.net!"http://www.company.com/us/home"!"clar lake"!20!10258

Then I checked my check_command file and found the following entry in it:
define command{
                command_name check_http_web
                command_line $USER1$/check_http -H $ARG1$ -u $ARG2$ -R $ARG3$ -t $ARG4$ -p $ARG5$
}

It seems to be using the check_http plugin.

Could you please help me implement my request in this scenario?
 
LVL 5

Expert Comment

by:rysic
ID: 34999962
You must write your own script that wraps the old one. For example, in the Nagios config:

define command{
                command_name check_http_web_extra
                command_line $USER1$/check_http_extra -H $ARG1$ -u $ARG2$ -R $ARG3$ -t $ARG4$ -p $ARG5$
}

check_command                   check_http_web_extra!clroutweb1.svr.clearone.net!"http://

and write your script/plugin check_http_extra (in your favorite programming language), which runs the old plugin check_http with the parameters -H $ARG1$ -u $ARG2$ -R $ARG3$ -t $ARG4$ -p $ARG5$. If the result is OK, it returns the result; if the result is ERROR, it opens the URL and then returns the result ERROR.

You can open the page via links. Something like:
links google.pl &

and after some sleep:

killall links
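
The wrapper idea above could be sketched roughly as follows. This is only an illustration, not a tested plugin: the check_http path, the RECOVERY_URL value, and the function name run_check_with_recovery are assumptions made for the example, and wget is used instead of links so that nothing has to be killed afterwards.

```shell
#!/bin/bash
# Hypothetical wrapper plugin: run the original check, hit a recovery URL
# when the check fails, and hand the original result back to Nagios.

# Assumed defaults; adjust both to your installation.
CHECK_CMD="${CHECK_CMD:-/usr/local/nagios/libexec/check_http}"
RECOVERY_URL="${RECOVERY_URL:-http://example.com/recover}"

run_check_with_recovery() {
    local output status=0
    # Run the real plugin with the arguments Nagios passed in.
    output="$("$CHECK_CMD" "$@")" || status=$?
    if [ "$status" -ne 0 ]; then
        # Non-OK result: request the recovery URL and discard the response.
        wget -q -T 10 -O /dev/null "$RECOVERY_URL" || true
    fi
    # Pass the plugin's own output and exit code through unchanged.
    printf '%s\n' "$output"
    return "$status"
}

# In a real plugin file, finish with:
#   run_check_with_recovery "$@"; exit $?
```

Because the wrapper returns the exit code of the original check, Nagios still sees the failure and alerts as before; the recovery request is only a side effect.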

 
LVL 5

Expert Comment

by:rysic
ID: 34999999
Here is the check_http man page: http://nagiosplugins.org/man/check_http
 

Author Comment

by:jayatallen
ID: 35002738
Hi rysic,
thank you for your reply.

I have heard about event handlers and tried one, but it's not working.
What I have done so far:
First, I added the line below to services.cfg for the service we are monitoring:
event_handler                   clear_cache

Second, I defined the command in checkcommands.cfg:
define command{
        command_name    clear_cache
        command_line    /apps/nagios/clr/libexec/clear_cache  $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
}

I also have a simple script that should run wget against the URL mentioned above when the state is OK.

Script looks like:
#!/bin/bash
case "$1" in
OK)
echo -n "Clearing cache "
        wget  http://165.180.178.24:89654/uview?call=dg.imt.IMT
        ;;
WARNING)
        ;;
UNKNOWN)
        ;;
CRITICAL)

esac

exit 0

I restarted Nagios.
To test it, I rescheduled the service check in Nagios, but it does not invoke wget.
I expect that if the service is OK, it should invoke wget.

Is there any way to figure out how I can get this working?
I can execute wget from the command line and it works fine.
How can I get the Nagios event handler working?

Please help.


 
LVL 5

Expert Comment

by:rysic
ID: 35011790
I have never used an event handler, but it is a better idea...

Yours looks similar to this: http://nagios.sourceforge.net/docs/3_0/eventhandlers.html
and looks good... I'll do some tests at work tomorrow.
 
LVL 5

Expert Comment

by:rysic
ID: 35011803
Does Nagios have sufficient rights to run wget in the script directory?
 

Author Comment

by:jayatallen
ID: 35012411
Yes, I have followed the above link.
I think Nagios has enough rights to execute wget.
I logged in as the nagios user, and at the command line I was able to execute
wget url

I also made sure that wget is working by looking at the access.log file.
I'm not sure why the event handler is not able to invoke it.
If it works, it would be really simple for me to implement.

 

Author Comment

by:jayatallen
ID: 35012422
I have also tested wget by running the script itself:

$./clear_cache

It worked fine, but the event handler is unable to invoke it.
 
LVL 5

Expert Comment

by:group0
ID: 35012972
Does the event handler script have the appropriate executable permission set?  If the file is owned by the nagios user you need "chmod u+x /apps/nagios/clr/libexec/clear_cache", if file is owned by a different user you'll need "chmod o+x /apps/nagios/clr/libexec/clear_cache".

Also, is wget within the nagios user's path?  Log in as the nagios user (or "su - nagios" if already logged in as root) and type "which wget".  If you don't get a result, wget is not in the path and you'll need to add its directory, or better yet, specify the full pathname in the script itself.

Try running the script directly as the nagios user:
/apps/nagios/clr/libexec/clear_cache OK

If you get the "Clearing cache" message, you should be good to go.
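
The checks above can be bundled into one small helper to run as the nagios user. This is a sketch only: the function name check_handler_prereqs and its optional second argument are invented for illustration and are not part of Nagios.

```shell
#!/bin/bash
# Hypothetical pre-flight check: is the handler script executable for the
# current user, and is the required tool (wget by default) on the PATH?
check_handler_prereqs() {
    local script="$1" tool="${2:-wget}" problems=0
    if [ ! -x "$script" ]; then
        echo "not executable for $(id -un): $script"
        problems=1
    fi
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool not found in PATH: $PATH"
        problems=1
    fi
    # 0 means both checks passed, 1 means at least one failed.
    return "$problems"
}

# Example (run after "su - nagios"):
#   check_handler_prereqs /apps/nagios/clr/libexec/clear_cache
```

If the helper prints nothing and returns 0, permissions and PATH are unlikely to be the cause, and the next step is running the handler by hand as shown above.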
 

Author Comment

by:jayatallen
ID: 35014037
It seems I got it working with the current settings. I will verify tomorrow.
Actually, to check my script, I was only executing:
$./apps/nagios/clr/libexec/clear_cache OK

This was creating a new file under the /apps/nagios/clr/libexec/ folder.
So, to check the event handler, I was looking for new files under /apps/nagios/clr/libexec/.
I was under the impression that if the event handler executed the clear_cache script, it would create a new file for every state change.
I don't know why, but when the clear_cache script is invoked by the event handler, it does not create any file there.

I only followed the link provided above to get it working.
 
LVL 5

Accepted Solution

by:
group0 earned 250 total points
ID: 35024796
A file is created each time you run the script because wget saves the HTML response from the URL you request.  The file is written to the current working directory, regardless of where the script that calls wget is located.  When Nagios executes the event handler, the file is probably being written to the nagios user's home directory or /tmp.  The easiest way to verify that the script is being executed properly is to add syslog logging to it.  You also want to trigger the cache flush when the service goes into a bad state, not when it comes out of critical.  Here's an updated clear_cache script:

#!/bin/bash
case "$1" in
OK)
        ;;
WARNING)
        ;;
UNKNOWN)
        ;;
CRITICAL)
	case "$2" in
	SOFT)
		# wait until the third check before clearing the cache in case the critical state is a fluke
		case "$3" in
		3)
			logger "Nagios event handler triggered (SOFT 3/3) - clearing cache"
			wget -q "http://165.180.178.24:89654/uview?call=dg.imt.IMT" -O /dev/null
			;;
		esac
		;;
	HARD)
		# the wget in the soft 3/3 state above didn't solve the issue, try one more time
		logger "Nagios event handler triggered (HARD) - clearing cache"
		wget -q "http://165.180.178.24:89654/uview?call=dg.imt.IMT" -O /dev/null
		;;
	esac
	;;
esac

exit 0



This will write the line "Nagios event handler triggered...." to syslog each time the handler reaches one of the wget branches (CRITICAL SOFT on attempt 3, or CRITICAL HARD), whether it is executed on the command line or by Nagios itself.  The output will probably end up in /var/log/messages, although this depends on how your syslog is set up.

Also note the -q and -O options: -q suppresses the download time/throughput output (since you're running from a script), and -O /dev/null directs the downloaded response to /dev/null (throwing it away, since you're only interested in triggering the cache flush).

Replace your existing clear_cache script with this one, and test it on the command line (preferably as the nagios user) with "clear_cache CRITICAL HARD" to make sure you get the expected log message.  Then wait for the next state change.  If no log message is recorded, Nagios is unable to execute the event handler script, most likely due to permissions (in that case, see my previous comment).
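
To see which state changes would trigger the flush without actually requesting the URL, the decision logic of the script can be pulled into a function and exercised with sample arguments. This is a sketch under assumptions: should_clear_cache is a hypothetical name, and its three arguments stand in for the $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$ values Nagios passes to the handler.

```shell
#!/bin/bash
# Hypothetical helper mirroring the handler's branches:
# returns 0 when the cache flush would be triggered, 1 otherwise.
should_clear_cache() {
    local state="$1" state_type="$2" attempt="$3"
    [ "$state" = "CRITICAL" ] || return 1
    case "$state_type" in
        SOFT) [ "$attempt" = "3" ] ;;   # only on the third soft check
        HARD) return 0 ;;               # always on a hard critical state
        *)    return 1 ;;
    esac
}

# Print the decision for a few sample transitions.
for args in "OK HARD 1" "CRITICAL SOFT 1" "CRITICAL SOFT 3" "CRITICAL HARD 3"; do
    # $args is left unquoted on purpose so it splits into three arguments.
    if should_clear_cache $args; then
        echo "$args -> clear cache"
    else
        echo "$args -> no action"
    fi
done
```

Keeping the decision separate from the wget call makes it possible to check the branching on the command line before waiting for a real state change.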

HTH
 

Author Comment

by:jayatallen
ID: 35063062
Thanks a lot, guys, for your help.