Cron job failure monitoring

Hi,

We would like to monitor all system and application cron jobs running on our HP-UX servers. We have hundreds of servers, and it would be difficult to modify every cron job to report its own failures.

We would like a script that monitors the jobs and sends an email alert to admin@company.com if a cron job fails. The cron job logs are redirected to /var/adm/cron/logs.
oo_tatang Asked:

omarfarid Commented:
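You could run the following as a cron job, say every minute:

n=`/usr/bin/grep 'error message' /var/adm/cron/logs | /usr/bin/wc -l`
if [ $n -gt 0 ]
then
      # Mail the matching lines as the message body; mailx reads the body from stdin
      /usr/bin/grep 'error message' /var/adm/cron/logs | /usr/bin/mailx -s "errors in logs file" username@domain.com
fi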
Gerwin Jansen, EE MVE, Topic Advisor Commented:
Do you want to be notified when cron jobs do not run properly (for example, when they crash), or do you want errors reported as omarfarid suggests?
oo_tatang Author Commented:
Hi, I want to be notified when cron jobs are not running, including the name of the cron job and the server it came from.
omarfarid Commented:
The solution provided earlier will send you email if there are errors.

Now, how do you decide that cron jobs are not running?

To schedule a cron job, do the following:

- put the commands provided to you into a file using an editor like vi, or by simply running:

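# The quoted 'EOF' stops the shell from expanding the backticks and $n while the file is written
cat > /path/to/mydir/myscript <<'EOF'
n=`/usr/bin/grep 'error message' /var/adm/cron/logs | /usr/bin/wc -l`
if [ $n -gt 0 ]
then
      /usr/bin/grep 'error message' /var/adm/cron/logs | /usr/bin/mailx -s "errors in logs file" username@domain.com
fi
EOF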

- make the script executable with command

chmod +x /path/to/mydir/myscript

- add the script to crontab by running

crontab -l > mycron
echo '* * * * * /path/to/mydir/myscript' >> mycron
crontab mycron

This will run the script every minute for you.

If you want a different schedule, please see the link below:

http://www.pantz.org/software/cron/croninfo.html
oo_tatang Author Commented:
Hi, the output in /var/adm/cron/log usually shows errors like the following:

CMD: /usr1/oracle10/admin/scripts/trace.sh 2>&1 > /dev/null
>  oracle10 8133 c Tue Nov 20 22:40:00 EST 2012
<  oracle10 8131 c Tue Nov 20 22:40:00 EST 2012 rc=126
<  oracle10 8133 c Tue Nov 20 22:40:00 EST 2012 rc=127

The rc=126 and rc=127 values are non-zero return codes, which means the cron job failed.
How can I check for this in the script, and how will I know that the error came from /usr1/oracle10/admin/scripts/trace.sh? I want the alert to tell me that this cron command failed to run.
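
One way to approach this, as a minimal sketch only: scan the log with awk, remember the command from each CMD: line, attach it to the PID on the following ">" start record, and report every "<" end record whose rc= value is non-zero. The function names report_cron_failures and mail_cron_failures are hypothetical, and the sketch assumes the log layout matches the excerpt above (a CMD: line directly before its ">" record, with the user in the second field and the PID in the third). It rescans the whole log on every run, so repeated alerts for old failures would need to be handled separately.

```shell
#!/bin/sh
# Sketch: list failed cron jobs (rc != 0) from a cron log in the layout shown above.
# report_cron_failures LOGFILE  (hypothetical helper name)
report_cron_failures() {
    awk '
        # CMD: line: keep the command text for the start record that follows
        /CMD:/ { sub(/^.*CMD:[ \t]*/, ""); last_cmd = $0; next }
        # ">" start record: user is field 2, PID is field 3
        $1 == ">" { cmd[$3] = last_cmd; next }
        # "<" end record that carries a return code in its last field
        $1 == "<" && $NF ~ /^rc=/ {
            rc = substr($NF, 4)
            if (rc + 0 != 0) {
                c = ($3 in cmd) ? cmd[$3] : "(command not found in log)"
                printf "user=%s pid=%s rc=%s cmd=%s\n", $2, $3, rc, c
            }
        }
    ' "$1"
}

# mail_cron_failures LOGFILE  (hypothetical helper name)
# Mails the report only when at least one failure was found; the subject names the server.
mail_cron_failures() {
    failures=`report_cron_failures "$1"`
    if [ -n "$failures" ]
    then
        echo "$failures" | /usr/bin/mailx -s "cron failures on `hostname`" admin@company.com
    fi
}
```

A monitoring script scheduled from cron could then end with a single call such as mail_cron_failures /var/adm/cron/log. An end record whose PID has no matching start record in the log, like PID 8131 in the excerpt, is still reported, but without a command name.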
omarfarid Commented:
Do you run other scripts as the same user, and do they also produce errors?

You can monitor the scripts running by:

- let the monitoring script create a file in a specific directory
- let your other script, which runs via cron, delete that file at the end of the job
- when the monitoring job runs again, it checks whether the file exists. If it is still there, the job failed to run, since the file was not removed. If there is no file, it was removed by the job, so the monitoring script creates the file again and assumes the job ran successfully.

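if [ -f /path/to/mydir/myjobfile ]
then
      # The job did not remove the file since the last check, so send an alert
      /usr/bin/mailx -s "job failed to run" username@domain.com < /dev/null
else
      # Create the file again for the job to remove on its next run
      /usr/bin/touch /path/to/mydir/myjobfile
fi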

The above is simple logic to check if cron job failed to run.

Please note that timing is important here: the monitored cron job must run at least as often as the monitoring cron job, otherwise the file is still there at the next check and a false alert is sent.
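
The job side of this scheme could be sketched as a small wrapper, shown here only as an illustration. The name run_job_and_clear_flag is hypothetical, and the wrapper goes slightly beyond the description above by removing the file only when the command exits with status 0, so that a job that starts but fails still triggers the alert.

```shell
#!/bin/sh
# Sketch: run a job and remove the monitoring flag file only if the job succeeds.
# run_job_and_clear_flag FLAGFILE COMMAND [ARGS...]  (hypothetical helper name)
run_job_and_clear_flag() {
    flag=$1
    shift
    if "$@"
    then
        # Success: remove the file so the next monitoring check stays quiet
        rm -f "$flag"
    else
        rc=$?
        # Failure: leave the file in place so the monitoring job sends its alert
        echo "job failed with rc=$rc: $*" >&2
        return "$rc"
    fi
}
```

A crontab entry would then call a script that runs, for example, run_job_and_clear_flag /path/to/mydir/myjobfile followed by the real job command.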
gheist Commented:
Cron on HP-UX actually sends mail to root on every system and logs to syslog when something runs or fails.

You need a centralized scheduler to detect that a system was down and that a cron job did not run during that time. There is also anacron to run missed jobs after the system comes back up.
oo_tatang Author Commented:
We are running 10 cron jobs under 5 different users (root, oracle, adm, etc.) on one server. The cron logs for all of these jobs go to /var/adm/cron/log.
I want to be alerted when a job fails, and to know which cron script failed, so that the support team can fix the script or whatever the issue is.
gheist Commented:
/etc/syslog.conf
cron.crit |mail yourself@gmail.com
oo_tatang Author Commented:
no reason