bkreynolds48
asked on
nagios - error when trying to check disk_backup
I don't know if backup is a reserved word in Nagios or what the problem is.
All the other disk_whatever checks work fine, but the one disk (file system mount) named backup errors out in Nagios every time. Is there a way to make this work?
Nagios is on a Linux server; the backup disk is on a SUN T2000 running 64-bit Solaris.
What is the error?
Give us the command definition, whether the plugin you are using is the default or a custom one, and the error message.
ASKER
This is in the minimal.cfg file.....
define service{
use generic-service; Name of service template to use
host_name cnsdev
service_description DISK_BACKUP
is_volatile 0
check_period 24x7
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups dbadmins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_nrpe!check_backup!100.0,20%!500.0,60%
}
THIS is the ERROR........
DISK_BACKUP CRITICAL 03-20-2009 13:01:30 0d 0h 0m 15s 1/4 NRPE: Command 'check_backup' not defined
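The "not defined" message comes from the NRPE daemon on the monitored host, so a quick first step is to look for an active command[check_backup] line in that host's nrpe.cfg. A minimal sketch, assuming a POSIX shell; the function name nrpe_has_command is made up for illustration and the path to nrpe.cfg depends on the installation:

```shell
# Return 0 if the given nrpe.cfg has an active command[<name>]= line,
# non-zero otherwise. Lines starting with '#' are comments and do not count.
# Example call on the monitored host (path is installation-specific):
#   nrpe_has_command /usr/local/etc/nrpe.cfg check_backup && echo defined
nrpe_has_command() {
    cfg_file=$1
    cmd_name=$2
    grep -q "^[[:space:]]*command\[${cmd_name}\][[:space:]]*=" "$cfg_file"
}
```

If this reports nothing for check_backup but succeeds for check_oradata1, the difference between the working and failing services is on the remote host, not in the Nagios server's service definition.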
ASKER CERTIFIED SOLUTION
Sorry, change the nrpe entries as below.
nrpe entries
command[check_disk]=/usr/lib/nagios/plugins/check_disk -p $ARG1$ -w $ARG2$ -c $ARG3$
ASKER
I have all the other disks defined the same way and that works fine in nagios........
/dev/dsk/c0t2d0s0 34G 3.7G 30G 11% /oradata1
/dev/dsk/c0t3d0s0 34G 4.9G 28G 15% /oradata5
/dev/dsk/c0t2d0s1 34G 3.9G 29G 12% /oradata2
/dev/dsk/c0t3d0s1 34G 1.2G 32G 4% /oradata6
/dev/dsk/c0t3d0s3 34G 1.2G 32G 4% /oradata7
/dev/dsk/c0t2d0s4 34G 16G 18G 48% /oradata4
/dev/dsk/c0t3d0s4 34G 5.1G 28G 16% /oradata8
/dev/dsk/c0t2d0s3 34G 14G 19G 43% /oradata3
/dev/md/dsk/d30 21G 4.0G 17G 19% /export/home
/dev/dsk/c2t5d0s6 295G 62G 230G 22% /backup
Fine. If that works the way it is, then you have to define the check_backup command in the nrpe.cfg file. Can you copy a working service definition and the corresponding entries in commands.cfg for that same service? I mean, which check_command is it referring to?
Regardless of the parameters, it looks like you don't have the check_backup command defined in NRPE on your remote host (cnsdev). To be sure, post your check_nrpe command definition from your Nagios host, and check nrpe.cfg on your remote host; it should contain a line similar to this:
command[check_backup]=/some/command/here -with -some -options
You are trying to monitor disk usage, right? Then how come the CRITICAL threshold is higher than the WARNING one? For disk monitoring, the values are the free space left, in % and MB.
ASKER
This is in minimal.cfg
This is the same command that is used for the all the other disks.
# Command used to check disk space usage on local partitions
define command{
command_name check_local_disk
command_line $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
}
Here is one of the other disks that do work........
define service{
use generic-service; Name of service template to use
host_name cnsdev
service_description DISK_ORADATA1
is_volatile 0
check_period 24x7
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups dbadmins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_nrpe!check_oradata1!100.0,20%!500.0,60%
}
SO why does one work and one doesn't?
ASKER
kosarajudeepak:
You are trying to monitor disk usage, right? Then how come the CRITICAL threshold is higher than the WARNING one? For disk monitoring, the values are the free space left, in % and MB.
I am not a Nagios expert; this was just given to me to do, so I copied and pasted from other disks that were already set up. What should it be?
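To illustrate the point about thresholds: check_disk's -w and -c refer to free space, and the state gets worse as free space shrinks, so the warning value should be the larger of the two (for example -w 20% -c 10%). A rough sketch of that ordering, using a hypothetical helper disk_state that takes whole-number percentages only:

```shell
# Classify a free-space percentage against warning/critical thresholds.
# The check goes CRITICAL below crit and WARNING below warn, so a sensible
# pair has warn > crit.
# Usage: disk_state <free_pct> <warn_pct> <crit_pct>
disk_state() {
    free=$1
    warn=$2
    crit=$3
    if [ "$free" -lt "$crit" ]; then
        echo CRITICAL
    elif [ "$free" -lt "$warn" ]; then
        echo WARNING
    else
        echo OK
    fi
}
```

With /backup at 22% used (78% free) in the df listing earlier in the thread, `disk_state 78 20 10` prints OK.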
I'd like to look at the command_name check_nrpe definition inside commands.cfg.
ASKER
commands.cfg
# 'check_local_disk' command definition
define command{
command_name check_local_disk
command_line $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
}
I could not find check_nrpe in this file
The service definition and the command definition you copied in your last post do not correspond to each other, so we cannot help until you copy the check_nrpe command definition from the commands.cfg file.
You should have something like
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$ $ARG3$
}
ASKER
$: cd /usr/nagios/etc/objects
$: grep nrpe commands.cfg
There isn't any check_nrpe in the commands.cfg file
I think the setup is a complete mess. Please do as below, copying the following into your files:
# services.cfg
define service{
use generic-service; Name of service template to use
host_name cnsdev
service_description DISK_ORADATA1
is_volatile 0
check_period 24x7
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups dbadmins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_nrpe!check_disk!20%!10%!/backup
}
# commands.cfg
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$ $ARG3$
}
#nrpe.cfg
command[check_disk]=/usr/lib/nagios/plugins/check_disk -w $ARG2$ -c $ARG3$ -p $ARG4$
The above can be used to monitor any filesystem: just replace /backup with the filesystem you want to monitor, with a separate object definition for each service. With these values, a filesystem that drops to 20% free space raises a WARNING, and one that drops to 10% free space is CRITICAL.
You can even define individual percentages for each service, tuned to your requirements; commands.cfg and nrpe.cfg don't need to change for each service.
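It may help to see how the pieces of a check_command value turn into macros. On the Nagios server, the value is split on '!': the first piece is the command name and the rest become $ARG1$, $ARG2$ and so on in the matching command_line. A small sketch, assuming a POSIX shell; show_args is an illustrative name, not a Nagios tool:

```shell
# Split a check_command value on '!' and print which server-side macro
# each piece becomes.
# Usage: show_args 'check_nrpe!check_disk!20%!10%!/backup'
show_args() {
    old_ifs=$IFS
    IFS='!'
    set -f                 # disable globbing while the value is split
    set -- $1              # pieces become the positional parameters
    set +f
    IFS=$old_ifs
    printf 'command_name=%s\n' "$1"
    shift
    n=1
    for arg in "$@"; do
        printf '$ARG%s$=%s\n' "$n" "$arg"
        n=$((n + 1))
    done
}
```

For the service above this gives check_disk as $ARG1$, 20% as $ARG2$, 10% as $ARG3$ and /backup as $ARG4$ on the server side. As far as I know, the NRPE daemon numbers whatever follows -a afresh, starting at $ARG1$, so it is worth checking that the macros in the nrpe.cfg command line match the order in which the values arrive on the remote host.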
Sorry, I missed ARG4 in the command_line; the ARG4 we are passing is the filesystem to be monitored.
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$ $ARG3$ $ARG4$
ASKER
this is in the nrpe.cfg file
# The following examples use hardcoded command arguments...
command[check_users]=/usr/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/nagios/libexec/check_procs -w 150 -c 200
Please follow the suggestions above; it looks like a complete mess, with missing command and NRPE definitions.
ASKER
Do i just do a reload after making those changes?
Yes. Whenever you change both Nagios and NRPE, a reload of both is necessary. When you change only NRPE and not Nagios, reload only NRPE, and vice versa.
As I wrote in my second comment - you don't have check_backup command defined in your nrpe.cfg file.
Add this to your nrpe.cfg:
command[check_backup]=/usr/nagios/libexec/check_disk -w 60% -c 20%
Then, correct your minimal.cfg file, by changing this:
check_nrpe!check_backup!100.0,20%!500.0,60%
to this:
check_nrpe!check_backup
Oklit: but where are you passing the filesystem details, like /backup as the path for the -p option? Also, the user doesn't have the check_nrpe command defined in commands.cfg.
I don't :) I missed one thing in the check_backup line in nrpe.cfg.
It should be:
command[check_backup]=/usr/nagios/libexec/check_disk -w 60% -c 20% -p /backup
Thanks for pointing this out.
The user is also missing the check_nrpe command definition (the one "check_nrpe!check_backup" relies on) in the commands.cfg file, so I reconstructed the complete steps for him in my comments above, since a lot of the specs are missing.
Well, I assume that he has the check_nrpe command defined, since he wrote that he is using similar commands to monitor other disks. So he has to have check_nrpe defined. Only check_backup (on the remote host) is missing.
Also make sure you restart NRPE on the remote host; otherwise the changes won't be recognised.
I thought the same, that he should have check_nrpe, but when I asked the user to paste the command definition, the following was the user's comment:
bkreynolds48:
$: cd /usr/nagios/etc/objects
$: grep nrpe commands.cfg
There isn't any check_nrpe in the commands.cfg file
He might be checking the wrong folder, one that is not even referenced in nagios.cfg.
ASKER
kosarajudeepak,
I could not find a services.cfg file
If you have Nagios 3, you won't have a services.cfg file.
Do one thing: can you post the output of the following search?
grep -E "cfg_(dir|file)" /etc/nagios/nagios.cfg
ASKER
$ pwd
/usr/nagios/etc
$ ls -al *.cfg
-rw-rw-r-- 1 nagios nagios 10500 2008-03-17 09:53 cgi.cfg
-rw-r--r-- 1 root root 57021 2009-03-22 10:06 minimal.cfg
-rw-rw-r-- 1 nagios nagios 42302 2008-03-17 12:08 nagios.cfg
-rw-r--r-- 1 nagios nagios 7238 2009-03-20 13:56 nrpe.cfg
-rw-rw---- 1 nagios nagios 1326 2008-03-17 08:42 resource.cfg
$ grep -E "cfg_(dir|file)" /usr/nagios/etc/nagios.cfg
#cfg_file=/usr/nagios/etc/objects/commands.cfg
#cfg_file=/usr/nagios/etc/objects/contacts.cfg
#cfg_file=/usr/nagios/etc/objects/timeperiods.cfg
#cfg_file=/usr/nagios/etc/objects/templates.cfg
cfg_file=/usr/nagios/etc/minimal.cfg
#cfg_file=/usr/nagios/etc/objects/localhost.cfg
#cfg_file=/usr/nagios/etc/objects/windows.cfg
#cfg_file=/usr/nagios/etc/objects/switch.cfg
#cfg_file=/usr/nagios/etc/objects/printer.cfg
# extension) in a particular directory by using the cfg_dir
#cfg_dir=/usr/nagios/etc/servers
#cfg_dir=/usr/nagios/etc/printers
#cfg_dir=/usr/nagios/etc/switches
#cfg_dir=/usr/nagios/etc/routers
# object configuration files (see the cfg_file and cfg_dir options above).
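In the output above, every objects/*.cfg line starts with '#', so only minimal.cfg is actually loaded; that would explain why grepping objects/commands.cfg says nothing about what Nagios uses. A small sketch that filters the listing down to the active entries (active_object_configs is just an illustrative name):

```shell
# Print only the cfg_file= / cfg_dir= lines that are not commented out.
# Example call on the Nagios server:
#   active_object_configs /usr/nagios/etc/nagios.cfg
# The files it prints are the ones worth searching, for example:
#   grep -n check_nrpe /usr/nagios/etc/minimal.cfg
active_object_configs() {
    grep -E '^[[:space:]]*cfg_(file|dir)=' "$1"
}
```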
So, except for the nrpe.cfg entries, enter everything else from my post above (the services.cfg and commands.cfg parts) into the minimal.cfg file. Add the nrpe.cfg entries on the client side, then reload NRPE on the client and Nagios on the server.
ASKER
How do I reload nagios on the client side ? (SUN Solaris 10 )?
ps -ef |grep nrpe
nagios 409 1 0 Oct 03 ? 35:19 /usr/local/nagios/bin/nrpe -c /usr/local/etc/nrpe.cfg -d
ASKER
add them on the client side ???
All the nagios files on the client side are executables -- How do I add them there?
I figured out how to restart nagios on SUN - svcadm
You need not reload Nagios on the client side; you only have to reload Nagios when you change Nagios's own configuration files. But here is a little tip:
#which nagios
#/usr/local/nagios/bin/nagios -sv /usr/nagios/etc/nagios.cfg < to verify and test the nagios cfg settings >
#/etc/init.d/nagios reload
add them on the client side ???
All the nagios files on the client side are executables -- How do I add them there?
I mean
ps -ef |grep nrpe
nagios 409 1 0 Oct 03 ? 35:19 /usr/local/nagios/bin/nrpe -c /usr/local/etc/nrpe.cfg -d
Add the nrpe.cfg entries from my post above to /usr/local/etc/nrpe.cfg on the client machine you are trying to monitor, and restart NRPE; NRPE doesn't have a reload.
So use the following process to kill and restart it.
#pkill nrpe
#pgrep nrpe < should not get any results >
#/usr/local/nagios/bin/nrpe -c /usr/local/etc/nrpe.cfg -d
ASKER
$ /usr/bin/nagios -sv /usr/nagios/etc/nagios.cfg
Nagios 3.0
Copyright (c) 1999-2008 Ethan Galstad (http://www.nagios.org)
Last Modified: 03-13-2008
License: GPL
Reading configuration data...
Timing information on object configuration processing is listed
below. You can use this information to see if precaching your
object configuration would be useful.
Object Config Source: Config files (uncached)
OBJECT CONFIG PROCESSING TIMES (* = Potential for precache savings with -u option)
--------------------------
Read: 0.004566 sec
Resolve: 0.000066 sec *
Recomb Contactgroups: 0.000028 sec *
Recomb Hostgroups: 0.000017 sec *
Dup Services: 0.000155 sec *
Recomb Servicegroups: 0.000012 sec *
Duplicate: 0.000032 sec *
Inherit: 0.000023 sec *
Recomb Contacts: 0.000001 sec *
Sort: 0.000001 sec *
Register: 0.000408 sec
Free: 0.000042 sec
============
TOTAL: 0.005355 sec * = 0.000339 sec (6.33%) estimated savings
Running pre-flight check on configuration data...
Checking services...
Checked 55 services.
Checking hosts...
Checked 10 hosts.
Checking host groups...
Checked 3 host groups.
Checking service groups...
Checked 1 service groups.
Checking contacts...
Checked 5 contacts.
Checking contact groups...
Checked 5 contact groups.
Checking service escalations...
Checked 0 service escalations.
Checking service dependencies...
Checked 0 service dependencies.
Checking host escalations...
Checked 0 host escalations.
Checking host dependencies...
Checked 7 host dependencies.
Checking commands...
Checked 12 commands.
Checking time periods...
Checked 2 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 2
Total Errors: 0
Timing information on configuration verification is listed below.
CONFIG VERIFICATION TIMES (* = Potential for speedup with -x option)
--------------------------
Object Relationships: 0.000592 sec
Circular Paths: 0.000016 sec *
Misc: 0.000241 sec
============
TOTAL: 0.000849 sec * = 0.000016 sec (1.9%) estimated savings
Things look okay - No serious problems were detected during the pre-flight check
ASKER
Add the nrpe.cfg entries from my post above to /usr/local/etc/nrpe.cfg on the client machine you are trying to monitor, and restart NRPE; NRPE doesn't have a reload.
So use the following process to kill and restart it.
On the client side no nagios.cfg file exists, only executable files............
bin/ libexec/ nagios-nrpe*
The nagios-nrpe file has.........
#!/bin/sh
# DESC: SMF method definitions/wrapper for NRPE.
# VERSION: $id$
#
# Distributed under the BSD License.
#
# Copyright (c) 2007 DigiTar
# All Rights Reserved
#
# Redistribution and use in source and binary forms, with or without modification,
# are permitted provided that the following conditions are met:
#
# * Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
# * Neither the name of DigiTar nor the names of its contributors may be
# used to endorse or promote products derived from this software without
# specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
# SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
# TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
# BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
# IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.
#
# /usr/local/nagios/bin/nrpe -c /usr/local/etc/nrpe.cfg --daemon
. /lib/svc/share/smf_include.sh
PREFIX=/usr/local
ETC=${PREFIX}/etc
BIN=${PREFIX}/nagios/bin
PID_FILE=/var/run/nrpe.pid
nrpe_daemon="$BIN/nrpe -c $ETC/nrpe.cfg -d"
[ -f "$BIN/nrpe" ] || exit $SMF_EXIT_ERR_FATAL
case "$1" in
start)
$nrpe_daemon
if test "$?" = "0"
then
exit $SMF_EXIT_OK
elif test "$?" = "2"
then
echo "$ETC/nrpe.cfg does not exist. NRPE cannot be started without a configuration file."
exit $SMF_EXIT_ERR_FATAL
else
echo "An unexpected error occured while trying to start NRPE."
exit $SMF_EXIT_ERR_FATAL
fi
;;
refresh)
kill -HUP `cat $PID_FILE` # Send a SIGHUP to the NRPE process
if test "$?" = "0"
then
echo "NRPE configuration succesfully reloaded."
exit $SMF_EXIT_OK
else
echo "An unexpected error occured while instructing NRPE to reload its configuration. Is the NRPE process running?"
exit $SMF_EXIT_ERR_FATAL
fi
;;
esac
exit $SMF_EXIT_OK
# pwd
/usr/local/nagios/bin
# ls
nrpe* nrpe.old*
root@# file nrpe
nrpe: ELF 32-bit MSB executable SPARC Version 1, dynamically linked, not stripped
#pwd
/usr/local/nagios/libexec
# ls
check_apt* check_file_age* check_mailq* check_pgsql* check_time*
check_breeze* check_flexlm* check_mrtg* check_ping* check_udp@
check_by_ssh* check_ftp@ check_mrtgtraf* check_pop@ check_ups*
check_clamd@ check_http* check_nagios* check_procs* check_users*
check_cluster* check_icmp* check_nntp@ check_real* check_wave*
check_dhcp* check_ifoperstatus* check_nrpe* check_rpc* negate*
check_dig* check_ifstatus* check_nt* check_sensors* urlize*
check_disk* check_imap@ check_ntp* check_smtp* utils.pm*
check_disk_smb* check_ircd* check_nwstat* check_ssh* utils.sh*
check_dns* check_load* check_oracle* check_swap*
check_dummy* check_log* check_overcr* check_tcp*
I put the entry you have above into the minimal.cfg file on the Nagios server and did a reload, but it still does not recognize check_backup as a valid command.
I must have missed something somewhere.
ASKER
in minimal.cfg on nagios server side.....
define service{
use generic-service; Name of service template to use
host_name cnsdev
service_description DISK_BACKUP
is_volatile 0
check_period 24x7
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups dbadmins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_nrpe!check_backup
}
ASKER
Current Status: CRITICAL (for 2d 18h 19m 18s)
Status Information: NRPE: Command 'check_backup' not defined
Good! Now try reloading Nagios on the server side and NRPE on the client side, and run a check from the server on the command line as below.
(The path to the Nagios plugins depends on your installation; here I have all my plugins under /usr/lib/nagios/plugins. If you are not sure what your plugin directory is, check the resource.cfg file at /usr/nagios/etc/resource.cfg.)
#/usr/lib/nagios/plugins/check_nrpe -H <client server ip> -c check_disk -a 20% 10% /backup
Why are you calling check_nrpe!check_backup again? This is not what I posted above to be copied to services.cfg (minimal.cfg). Here is the clear picture for you:
##### SERVER SIDE ###
# copy and paste this to minimal.cfg #
## Added from Experts Exchange ##
define service{
name Added from Experts Exchange
use generic-service; Name of service template to use
host_name cnsdev
service_description DISK_ORADATA1
is_volatile 0
check_period 24x7
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups dbadmins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_nrpe!check_disk!20%!10%!/backup
}
# copy and paste below lines to minimal.cfg #
## Added from Experts Exchange ##
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$ $ARG3$
}
Reload Nagios on the Server
#### CLIENT SIDE ####
Restart nrpe daemon on Client side.
# copy and paste below lines to nrpe.cfg #
## Added from Experts Exchange ##
command[check_disk]=/usr/lib/nagios/plugins/check_disk -w $ARG2$ -c $ARG3$ -p $ARG4$
ASKER
/usr/nagios/libexec/check_nrpe -H (IPaddress) -c check_disk -a 20% 10% /backu>
CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
This is what I got from running your command - does that mean it worked?
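"Received 0 bytes from daemon" means the check did not come back with a result. One possible cause, which nothing in this thread confirms: to my knowledge NRPE only accepts arguments passed with -a when it was built with command-argument support and nrpe.cfg sets dont_blame_nrpe=1; otherwise the daemon drops the request. A sketch for reading that setting on the monitored host (nrpe_args_enabled is an illustrative name, and treating an absent directive as 0 is my assumption about the default):

```shell
# Print the dont_blame_nrpe value from an nrpe.cfg, taking the last
# uncommented setting in the file; prints 0 if the directive is absent.
# Example call on the monitored host:
#   nrpe_args_enabled /usr/local/etc/nrpe.cfg
nrpe_args_enabled() {
    value=$(sed -n 's/^[[:space:]]*dont_blame_nrpe[[:space:]]*=[[:space:]]*\([01]\).*/\1/p' "$1" | tail -n 1)
    echo "${value:-0}"
}
```

If this prints 0, a command with hard-coded options in nrpe.cfg (like the check_backup line suggested earlier) avoids the need for -a altogether. The NRPE log on the remote host should say why a request was rejected.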
Restart nrpe on client side after adding the nrpe entries not before.
/usr/nagios/libexec/check_nrpe -H (IPaddress) -c check_disk -a 20% 10% /backu> (what is this > at the end?)
Add the nrpe.cfg entries posted above to /usr/local/etc/nrpe.cfg and then restart nrpe daemon on client side using
#pkill nrpe
#pgrep nrpe < should not get any results >
#/usr/local/nagios/bin/nrpe -c /usr/local/etc/nrpe.cfg -d
and then check using following command from server.
#/usr/nagios/libexec/check_nrpe -H (IPaddress) -c check_disk -a 20% 10% /backup
ASKER
I killed it on the client side - it always restarts itself when killed - did a kill -9
Does pkill do something different?
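Since the nagios-nrpe file pasted earlier is an SMF method script, the daemon is most likely managed by Solaris SMF, which would explain why it comes straight back after a kill. In that case it is usually cleaner to ask SMF to restart or refresh the service than to kill the process. A sketch for finding the service name; find_nrpe_fmri is a made-up helper and the FMRI on your host will differ:

```shell
# Read `svcs -a` style lines (STATE STIME FMRI) on stdin and print the
# FMRI of any service whose FMRI contains "nrpe".
find_nrpe_fmri() {
    awk '$3 ~ /nrpe/ { print $3 }'
}
# On the Solaris client, as root (shown as comments, not executed here):
#   fmri=$(svcs -a | find_nrpe_fmri)
#   svcadm restart "$fmri"    # full restart after editing nrpe.cfg
#   svcadm refresh "$fmri"    # runs the script's refresh method (SIGHUP)
```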
ASKER
I did the pkill with the same results and reran your command from the nagios server with the same results...........
CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
I ran it on /var and got the same results - which does not show an error on disk var in the nagios browser page
So I don't understand why disk backup can't be found.
ASKER
/usr/nagios/libexec/check_nrpe -H (IPaddress) -c check_disk -a 20% 10% /backu> (what is this > at the end?)
SORRY - the line wrapped when I did a copy/paste and I forgot to take that out
/usr/nagios/libexec/check_nrpe -H (IPaddress) -c check_disk -a 20% 10% /backup
CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
ASKER
I just added in disk checks for oradata 4 through 8 and those work fine.
DISK_BACKUP CRITICAL 03-23-2009 09:20:33 0d 0h 3m 12s 4/4 NRPE: Command 'check_backup' not defined
DISK_ORADATA1 OK 03-23-2009 09:18:38 0d 8h 42m 7s 1/4 DISK OK - free space: /oradata1 30376 MB (89% inode=99%):
DISK_ORADATA2 OK 03-23-2009 09:19:26 0d 8h 46m 19s 1/4 DISK OK - free space: /oradata2 30088 MB (88% inode=99%):
DISK_ORADATA3 OK 03-23-2009 09:20:14 0d 8h 45m 31s 1/4 DISK OK - free space: /oradata3 19571 MB (57% inode=99%):
DISK_ORADATA4 OK 03-23-2009 09:20:36 0d 0h 9m 27s 1/4 DISK OK - free space: /oradata4 18063 MB (52% inode=99%):
DISK_ORADATA5 OK 03-23-2009 09:17:07 0d 0h 8m 38s 1/4 DISK OK - free space: /oradata5 29088 MB (85% inode=99%):
DISK_ORADATA6 OK 03-23-2009 09:17:56 0d 0h 7m 49s 1/4 DISK OK - free space: /oradata6 32889 MB (96% inode=99%):
DISK_ORADATA7 OK 03-23-2009 09:18:45 0d 0h 7m 0s 1/4 DISK OK - free space: /oradata7 32889 MB (96% inode=99%):
DISK_ORADATA8 OK 03-23-2009 09:19:33 0d 0h 6m 12s 1/4 DISK OK - free space: /oradata8 28869 MB (84% inode=99%):