• Status: Solved

CentOS: monitor services

Hi All,

I'm running a CentOS server and want to ensure the following services are running all the time.

mysqld
named
postfix

What is the best way to monitor them, restart them/reboot the server if they stop running and email me?
Asked by: detox1978
 
sweetfa2Commented:
Nagios.

Otherwise, you can have a cron job that runs every minute to check their status and send the email that way.
 
farzanjCommented:
You can write a simple script that checks the services every minute, or you can use a monitoring tool that does everything for you but is harder to set up. If you have just one server, a small script makes sense. If you have many servers, consider a monitoring tool like Nagios or something much simpler like Xymon:

http://www.xymon.com/xymon/help/about.html
 
sweetfa2Commented:
#!/bin/bash
#
#  This Nagios plugin was created to check the status of a service
#

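PROGNAME=$(basename "$0")
PROGPATH=$(echo "$0" | sed -e 's,[\\/][^\\/][^\\/]*$,,')
REVISION="1.0.0"

# utils.sh ships with the Nagios plugins and defines the STATE_* exit codes
. "$PROGPATH/utils.sh"

usage()
{
        echo "Usage: ${PROGNAME} service"
        exit $STATE_UNKNOWN
}

# Exactly one argument, the service name, is required
if [ $# -ne 1 ]
then
        usage
fi
service=$1

# Take the last line of the status output and keep what follows its final non-word character
status=$(sudo -u root /sbin/service "$service" status 2>>/tmp/errors | sed -n '$p' | sed 's/^.*\W//')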

case $status in
        running)
                echo "OK : Service is running"
                exit $STATE_OK
                ;;
        unused)
                echo "WARNING : Service is unused"
                exit $STATE_WARNING
                ;;
        dead)
                echo "CRITICAL : Service is dead"
                exit $STATE_CRITICAL
                ;;
        *)
                echo "Unknown: Service is $status"
                exit $STATE_UNKNOWN
                ;;
esac

 
sweetfa2Commented:
The script above works in Nagios.  It is simple enough to modify it to work straight out of cron.
 
Kerem ERSOYPresidentCommented:
Nagios or a similar program will be your answer. Nagios checks services through plugins, and it has plugins for MySQL, Postfix and named. It then updates a web server so that you can monitor the service status over the web. It can also notify you when a service stops and again when the service has been restored.
 
detox1978Author Commented:
I'd prefer not to install additional software.

Can someone help me write a cron script that runs the following commands;

service mysqld status
service named status
service postfix status

searching for the word "running", and if it's not found, restarting the service and sending me an email?
 
Kerem ERSOYPresidentCommented:
Hi,

This script can be run from a cron task. It will not print anything if all services are running, but it will send the name of any stopped service in an email to the owner of the cron task.

#!/bin/bash
for i in  "mysqld" "named" "postfix"; do
  /sbin/service $i status | grep "stop"
done


Cheers,
K.
 
farzanjCommented:
Try this script:

#!/bin/bash

SERVICES="mysqld named postfix"
EMAIL_ADDR="admin@mydomain.com second@yahoo.com"
FAILED_SERVICES=""

for service in $SERVICES
do
   if (( $(netstat -npl | grep -c $service) == 0 ))
   then
        FAILED_SERVICES=$FAILED_SERVICES" $services"
    fi
done

if (( $#FAILED_SERVICES !=0 ))
then
     echo "Services failed: $FAILED_SERVICES" | mail -s "Services NOT running" $EMAIL_ADDR
fi

 
farzanjCommented:
Sorry, there's a typo in line 11, it should be
   FAILED_SERVICES=$FAILED_SERVICES" $service"
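
A typo like this is easy to miss because bash silently expands an unset variable to an empty string, so the list gains only a space instead of the service name. A minimal sketch of the effect, and of how `set -u` would have flagged it (the variable name `list` is illustrative and not taken from the scripts in this thread):

```shell
#!/bin/bash
# Reproduce the typo in a child shell: $services is unset, so only a space is appended.
typo_result=$(bash -c 'unset services; service=named; list=""; list=$list" $services"; printf "[%s]" "$list"')
echo "with typo: $typo_result"

# With `set -u`, the same reference is an error and the child shell exits non-zero.
strict_status=0
bash -c 'unset services; set -u; service=named; : "$services"' 2>/dev/null || strict_status=$?
echo "set -u exit status: $strict_status"
```

Adding `set -u` near the top of a monitoring script is one way to turn this kind of mistake into a visible error rather than an empty email.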
 
detox1978Author Commented:
That is what I'm looking for, farzanj.

What do I save it as, and how do I get it to restart the services automatically?
 
farzanjCommented:
Here is the modified version:

Save it as /root/monitor.sh

 
#!/bin/bash

SERVICES="mysqld named postfix"
EMAIL_ADDR="admin@mydomain.com second@yahoo.com"
FAILED_SERVICES=""

#Finding services that are not running
for service in $SERVICES
do
   if (( $(netstat -npl | grep -c $service) == 0 ))
   then
        FAILED_SERVICES=$FAILED_SERVICES" $service"
        #Attempt to restart the service
        service $service restart
    fi
done

#Emailing about failure
if (( $#FAILED_SERVICES !=0 ))
then
     echo "Services failed: $FAILED_SERVICES" | mail -s "Services NOT running" $EMAIL_ADDR
fi


Then enable it in root's crontab:

crontab -e
 
*/2 * * * * /root/monitor.sh


You can keep the script in whatever location you consider reasonable.


Do you need any further modifications?
 
Kerem ERSOYPresidentCommented:
How about this?

#!/bin/bash
RECIPIENTS="rec1@domain.com rec2@domain.com"

for i in  "mysqld" "named" "postfix"; do
  if /sbin/service $i status 2>/dev/null | grep "stop" > /dev/null
  then
     echo -n $i stopped attempting restart:
     if /sbin/service $i start 2>&1 > /dev/null
     then
        echo $i restarted
     else
        echo $i could not restart!!
     fi
  fi
done > /tmp/mail

( echo "Problems"; cat /tmp/mail ) | mail -s "Service problems detected !!!" $RECIPIENTS

 
detox1978Author Commented:
For some reason it keeps restarting postfix and also returns an error;


Shutting down postfix:                                     [  OK  ]
Starting postfix:                                          [  OK  ]
/root/monitorservices.sh: line 19: ((: 0FAILED_SERVICES: value too great for base (error token is "0FAILED_SERVICES")
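
The error token in the last line hints at the cause. In bash, `$#` is the number of positional parameters, so `$#FAILED_SERVICES` expands to that count followed by the literal text `FAILED_SERVICES`, and the leading 0 makes the arithmetic test try to read the result as an octal number. The length of the variable's value is written `${#FAILED_SERVICES}`. A small sketch, assuming the script is started with no arguments, as it is from cron:

```shell
#!/bin/bash
set --                       # no positional parameters, as when cron runs the script
FAILED_SERVICES=" postfix"

wrong="$#FAILED_SERVICES"    # argument count followed by literal text
right="${#FAILED_SERVICES}"  # length of the variable's value

echo "wrong: $wrong"
echo "right: $right"
```

With the sample value above, the first line prints `0FAILED_SERVICES` and the second prints `8`.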
 
farzanjCommented:
Ok.

In line 19, please replace (( )) with [[ ]]

So you should have

if [[ $#FAILED_SERVICES !=0 ]]

Second, please show me the output of this command:

netstat -antpl | grep 25
 
detox1978Author Commented:
same issue;

#netstat -antpl | grep 25
tcp        0      0 0.0.0.0:25                  0.0.0.0:*                   LISTEN      29728/master


KeremE, yours sends an email every time.  I need it to only email when a service is not running.
 
detox1978Author Commented:
I think it restarts the postfix service because it is listed as master in the netstat output.
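
That matches the netstat output posted above: the PID/Program name column shows `master`, which is Postfix's master daemon, so a grep for `postfix` finds nothing and the script concludes the service is down. A quick sketch using that same line as sample input:

```shell
#!/bin/bash
# Sample line copied from the netstat output earlier in the thread.
sample='tcp        0      0 0.0.0.0:25                  0.0.0.0:*                   LISTEN      29728/master'

# grep -c prints the number of matching lines; "|| true" ignores its non-zero exit on zero matches.
postfix_hits=$(printf '%s\n' "$sample" | grep -c postfix || true)
master_hits=$(printf '%s\n' "$sample" | grep -c master || true)

echo "lines matching postfix: $postfix_hits"
echo "lines matching master:  $master_hits"
```

Matching on socket listings ties the check to process names, which is why asking the init script for the service status is the more reliable test here.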
 
farzanjCommented:
Sorry about that. I didn't realize that postfix doesn't show up under its own name. Try this one. Sorry for the inconvenience.
#!/bin/bash

SERVICES="mysqld named postfix"
EMAIL_ADDR="admin@mydomain.com second@yahoo.com"
FAILED_SERVICES=""

#Finding services that are not running
for service in $SERVICES
do
   if (( $(service $service status | grep -c running) == 0 ))
   then
        FAILED_SERVICES=$FAILED_SERVICES" $service"
        #Attempt to restart the service
        service $service restart
    fi
done

#Emailing about failure
if [[ ${#FAILED_SERVICES} !=0 ]]
then
     echo "Services failed: $FAILED_SERVICES" | mail -s "Services NOT running" $EMAIL_ADDR
fi

 
detox1978Author Commented:
Thanks, that has fixed the postfix restarting. But it still errors on line 19.


/root/monitorservices.sh: line 19: conditional binary operator expected
/root/monitorservices.sh: line 19: syntax error near `!=0'
/root/monitorservices.sh: line 19: `if [[ ${#FAILED_SERVICES} !=0 ]]'
 
detox1978Author Commented:
I've moved where it sends the email and it works great. Thanks. :-)
#!/bin/bash

SERVICES="mysqld named postfix"
EMAIL_ADDR="admin@mydomain.com second@yahoo.com"
FAILED_SERVICES=""

#Finding services that are not running
for service in $SERVICES
do
   if (( $(service $service status | grep -c running) == 0 ))
   then
        FAILED_SERVICES=$FAILED_SERVICES" $service"
        #Attempt to restart the service
        service $service restart
        echo "Services failed: $FAILED_SERVICES" | mail -s "Services NOT running" $EMAIL_ADDR
    fi
done

 
farzanjCommented:
Welcome :)  Glad it worked for you.
 
farzanjCommented:
Sorry again. The reason it was failing is that it should have been
if [[ ${#FAILED_SERVICES} != 0 ]]

instead of
if [[ ${#FAILED_SERVICES} !=0 ]]

Yes, it needs a space before the 0.

You can use either version of the code. Anything else?
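
Inside `[[ ]]` each operand and operator must be a separate word, which is why `!=0` is rejected. Note also that `!=` there compares strings. A sketch of two equivalent ways to ask whether anything failed, using `-ne` for a numeric comparison and `-n` for a non-empty string test (the function names are illustrative):

```shell
#!/bin/bash
# any_failed LIST: print "yes" if LIST is non-empty, "no" otherwise
any_failed() {
    local list=$1
    if [[ ${#list} -ne 0 ]]; then   # numeric test on the string length
        echo yes
    else
        echo no
    fi
}

# is_nonempty LIST: the same question, asked with the -n string test
is_nonempty() {
    if [[ -n $1 ]]; then echo yes; else echo no; fi
}

any_failed ""
any_failed " named"
is_nonempty ""
is_nonempty " named"
```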
 
detox1978Author Commented:
When I run it manually, it works perfectly.

But when I run it via the cron job, it passes the service names incorrectly and I get emails like this:

Services failed:  mysqld named postfix


even though the services are running.  I guess this is because it is looking for a service called "mysqld named postfix" rather than parsing them one at a time?
 
detox1978Author Commented:
It is sending three emails with these bodies:

1# Services failed:  named mysqld postfix
2# Services failed:  named mysqld
3# Services failed:  named

I'm guessing it's not splitting the service names correctly.
 
farzanjCommented:
Here are the problems.

First, email: The reason I had a separate section for emails is that I wanted to finalize the status of all the services and then send one consolidated email.

Second:
>  I guess this is because it is looking for a service called "mysqld named postfix" rather than parsing them one at a time?
No. If it runs without cron, it should also run under cron, EXCEPT perhaps for PATH and permission issues. So let me try one more time. See how it goes.
 
#!/bin/bash
PATH=$PATH:/sbin
SERVICES="mysqld named postfix"
EMAIL_ADDR="admin@mydomain.com second@yahoo.com"
FAILED_SERVICES=""

#Finding services that are not running
for service in $SERVICES
do
   if (( $(service $service status | grep -c running) == 0 ))
   then
        #Attempt to restart the service
        service $service restart
        if [[ $? != 0 ]]
        then
            FAILED_SERVICES=$FAILED_SERVICES" $service"
        fi
    fi
done

#Emailing about failure
if [[ ${#FAILED_SERVICES} != 0 ]]
then
     echo "Services failed: $FAILED_SERVICES" | mail -s "Services NOT running" $EMAIL_ADDR
fi


Did you cron it as the root user? Please add it to root's crontab as follows:
 
*/3 * * * * /root/monitorservices.sh > /root/error.txt 2>&1


If it doesn't work out, I want to see the contents of the error file.
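
The `PATH=$PATH:/sbin` line in the script matters because cron starts jobs with a short default PATH, which on many systems does not include /sbin, where the `service` command lives on CentOS. When `service` is not found, the status check prints nothing, `grep -c running` counts 0, and every service looks failed. A sketch for checking whether a command resolves under a given PATH (the helper name `in_path` and the example PATH values are assumptions for illustration):

```shell
#!/bin/bash
# in_path NAME PATHLIST: print "found" if NAME resolves as a command when PATH is PATHLIST
in_path() {
    if ( PATH=$2; command -v "$1" >/dev/null 2>&1 ); then
        echo found
    else
        echo missing
    fi
}

in_path service "/usr/bin:/bin"          # a typical minimal cron PATH (assumption)
in_path service "/usr/bin:/bin:/sbin"    # the same PATH with /sbin appended
```

Calling commands by absolute path, such as /sbin/service, avoids the dependency on PATH altogether.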
 
Kerem ERSOYPresidentCommented:
Why would you work this hard to correct a non-working script? I've already sent you a simpler script that works 100%.
 
Kerem ERSOYPresidentCommented:
If you don't care for a working script, what are you trying to accomplish?
 
detox1978Author Commented:
You didn't reply to this:

"KeremE, yours sends an email every time.  I need it to only email when a service is not running."
 
Kerem ERSOYPresidentCommented:
Here's the code, which sends an email only when there's a service failure:

#!/bin/bash
RECIPIENTS="kerem@sibernet.com.tr"

for i in  "mysqld" "named" "postfix"; do
  if /sbin/service $i status 2>/dev/null | grep "stop" > /dev/null  
  then      
     echo -n $i stopped attempting restart:
     if /sbin/service $i start 2>&1 > /dev/null
     then 
        echo $i restarted
     else
        echo $i could not restart!!
     fi
  fi   
done > /tmp/mail

if [ -s /tmp/mail ]
then 
( echo "Problems:"; cat /tmp/mail ) | mail -s "Service problems detected !!!" $RECIPIENTS
fi

 
Kerem ERSOYPresidentCommented:
A shorter version would be:

#!/bin/bash
RECIPIENTS="rec1@example.com rec2@example.com"

for i in  "mysqld" "postfix"; do        
  if /sbin/service $i status 2>/dev/null | grep "stop" > /dev/null  
  then      
     echo -n $i "stopped attempting restart: "
     if /sbin/service $i start 2>&1 > /dev/null
     then 
        echo $i restarted
     else
        echo $i failed to restart!!
     fi
  fi   
done > /tmp/mail

test  -s /tmp/mail && ( echo "Problems:"; cat /tmp/mail ) | mail -s "Service problems detected !!!" $RECIPIENTS

 
Kerem ERSOYPresidentCommented:
We can go further and eliminate the grep:

#!/bin/bash
RECIPIENTS="rec1@example.com rec2@example.com"

for i in  "mysqld" "named" "postfix"; do
  if ! /sbin/service $i status 2>&1 >/dev/null  
  then      
     echo -n $i " service stopped attempting restart: "
     if /sbin/service $i start 2>&1 >/dev/null 
     then 
        echo $i restarted
     else
        echo $i failed to restart!!
     fi
  fi   
done  > /tmp/mail 

test -s /tmp/mail && ( echo "Problems:"; /bin/cat /tmp/mail ) | /bin/mail -s "Service problems detected !!!" $RECIPIENTS

 
detox1978Author Commented:
That works. However, the named service always returns the following message:

rndc: no server specified and no default

so it will always email me. Is there a way around this?
 
farzanjCommented:
Is it my code or KeremE's?

Did you try out my last code in cron?

Did you try KeremE's code in cron?
 
detox1978Author Commented:
That was for KeremE.

Yours errored again with the same issue.
 
farzanjCommented:
Do you have output of the error file?  Did it error out in cron or without cron?
 
Kerem ERSOYPresidentCommented:
> That works. However, the named service always returns the following message:

> rndc: no server specified and no default

> so it will always email me. Is there a way around this?

It shouldn't. Are you sure that you did not omit " 2>&1 >/dev/null " after each service command? Can you recopy the last version and retry? Those redirections are there to contain this rndc error.
 
detox1978Author Commented:
KeremE, I recopied it and still get an email every time it's run. The content of the email is "Problems: rndc: no server specified and no default".

farzanj, I'll rerun it and check for the error file.
 
Kerem ERSOYPresidentCommented:
In fact, the rndc error is written to stderr and the command output to stdout. This is why I redirect both to /dev/null and use only the exit status instead of depending on the text of the output. So this should not happen. Please try my latest version.
 
detox1978Author Commented:
That code sends an email every time it is run.
 
Kerem ERSOYPresidentCommented:
Are you running the file as root? If you ran it from the command line as root the first time and then started it from cron as another user, that user might not be able to overwrite the file. Please try removing the file (/tmp/mail) manually before the cron job runs.
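
A fixed file name such as /tmp/mail can also be left over from an earlier run. One way around both problems, sketched here with `mktemp` (which is not part of the scripts in this thread), is to write the report to a fresh temporary file and use the same `-s` test to decide whether there is anything to send. The helper name `should_send` is illustrative:

```shell
#!/bin/bash
# should_send FILE: print "send" if FILE exists and is non-empty, else "skip"
should_send() {
    if [ -s "$1" ]; then echo send; else echo skip; fi
}

report=$(mktemp)                 # unique, private file for this run
empty_state=$(should_send "$report")

echo "named failed to restart!!" >> "$report"
filled_state=$(should_send "$report")

rm -f "$report"
echo "empty report: $empty_state, with content: $filled_state"
```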

 
detox1978Author Commented:
KeremE, yes, I am running it as root. I've not tested it as a cron job because it emails every time.

farzanj, it now isn't sending emails.
 
Kerem ERSOYPresidentCommented:
OK, but are you sure your named server can be started from the command line? It seems that the named configuration is missing the rndc file, which should be at /etc/rndc.conf.
 
Kerem ERSOYPresidentCommented:
I mean your named.conf is missing the rndc info. Please check your named.conf. By default it should be referencing an rndc key file (/etc/rndc.key) that does not exist, but it seems that you've reconfigured your named.conf.

Will you replace the

#!/bin/bash

in the first line with

#!/bin/bash -x

and please post the output here.
 
detox1978Author Commented:
+ RECIPIENTS=detox1978@yahoo.co.uk
+ for i in '"mysqld"' '"named"' '"postfix"'
+ /sbin/service mysqld status
+ for i in '"mysqld"' '"named"' '"postfix"'
+ /sbin/service named status
+ for i in '"mysqld"' '"named"' '"postfix"'
+ /sbin/service postfix status
+ test -s /tmp/mail
+ /bin/mail -s 'Service problems detected !!!' detox1978@yahoo.co.uk
+ echo Problems:
+ /bin/cat /tmp/mail
 
Kerem ERSOYPresidentCommented:
OK, I've got it: there was a problem with the redirection. I had redirected stderr before redirecting stdout, so the order was wrong. Please use this code instead:

#!/bin/bash
RECIPIENTS="rec1@example.com rec2@example.com"

for i in  "mysqld" "named" "postfix"; do
  if ! /sbin/service $i status >/dev/null 2>&1 
  then      
     echo -n $i " service stopped attempting restart: "
     if /sbin/service $i start >/dev/null 2>&1 
     then 
        echo $i restarted
     else
        echo $i failed to restart!!
     fi
  fi   
done  > /tmp/mail 

test -s /tmp/mail && ( echo "Problems:"; /bin/cat /tmp/mail ) | /bin/mail -s "Service problems detected !!!" $RECIPIENTS
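
The fix comes down to the order in which bash applies redirections, which is left to right. With `2>&1 >/dev/null`, stderr is first pointed at wherever stdout currently goes (in the loop above, /tmp/mail) and only then is stdout sent to /dev/null, so error text such as the rndc message still lands in the mail. A sketch with a stand-in function (`noisy` is hypothetical) that writes to both streams:

```shell
#!/bin/bash
# noisy: stand-in for a command that writes to both stdout and stderr
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Old order: stderr is duplicated onto the capture first, then stdout is discarded.
leaked=$(noisy 2>&1 >/dev/null)

# Fixed order: stdout goes to /dev/null first, then stderr follows it there.
silent=$(noisy >/dev/null 2>&1)

echo "old order captured:   '$leaked'"
echo "fixed order captured: '$silent'"
```

The old order captures `to stderr`, while the fixed order captures nothing, which is what keeps /tmp/mail empty when all services are running.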

 
Kerem ERSOYPresidentCommented:
But you've still got the rndc error. Please fix it using the advice here:

http://serverfault.com/questions/231749/dns-on-redhat-rdnc-no-server-specified-and-no-default

 
detox1978Author Commented:
The service monitor now works.


I tried to fix the rndc error, but it now says "rndc: decode base64 secret: bad base64 encoding"
 
detox1978Author Commented:
It's OK, I sorted it. I'd made a typo.
 
Kerem ERSOYPresidentCommented:
:)

You know this is kind of off-topic but:

http://forum.parallels.com/showthread.php?t=87083

If you need further assistance, I strongly suggest that you close this question and start another thread.

Cheers,
K.
 
Question has a verified solution.
