Solved

oracle alert log script to check for errors

Posted on 2014-04-07
7
3,679 Views
Last Modified: 2014-04-07
I have several databases on several machines
unix - aix - oracle 10 & 11
I need a script to run in cron that will check each alert log for oracle errors and email me if there are any errors, what they are and what database/system they come from
0
Comment
Question by:bkreynolds48
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
  • 2
  • 2
  • +1
7 Comments
 
LVL 16

Expert Comment

by:Wasim Akram Shaik
ID: 39983527
Check burleson site, this has the script which you are looking for


http://www.dba-oracle.com/t_linux_shell_script_check_alert_log.htm
0
 
LVL 23

Expert Comment

by:David
ID: 39983593
bk, I read into your requirements that you want one script, on one host, to query multiple systems.  As Burleson mentions, you need to be looking at Oracle Grid Control if that's accurate.  One overall job scheduling / system monitoring tool, with agents on each managed host.
0
 
LVL 22

Expert Comment

by:Steve Wales
ID: 39983600
Alternatively, you could set up your alert log as an external table and query it from each instance.  Excellent example of that here:

http://www.adp-gmbh.ch/ora/admin/scripts/read_alert_log.html

You could then schedule a job to query that table and send you output periodically if anything new appears that you aren't specifically excluding, and if you get new messages that you decide are OK, you can go back to the scripting and exclude them as well.
0
SharePoint Admin?

Enable Your Employees To Focus On The Core With Intuitive Onscreen Guidance That is With You At The Moment of Need.

 
LVL 1

Author Comment

by:bkreynolds48
ID: 39983766
wasimibm

I looked at his script - I have one pretty much the same
so my only option for multiple databases - lets just say on one system - is to duplicate the script for each of the databases I have?  there does not appear to be any loop to grap the database names from the oratab file

Steve Wales - I can't make that kind of change on production systems.
0
 
LVL 22

Accepted Solution

by:
Steve Wales earned 500 total points
ID: 39983800
Here's a script I wrote ages ago that would do that.

#!/usr/bin/ksh
#set -x
#
# Description:
# This script examines Oracle alert logs for each instance specified in a
# server's /etc/oratab file and reports on errors.  This script can run
# at whatever frequency the DBA desires, setup in crontab.  In conjunction
# with this script, if you have a script that ages alert logs, at the time 
# of the alert log aging, the temporary file that keeps track of where you
# where in the file needs to have it's ling counter reset to 0.
#
# Usage:
# oracle_alert_scanner.sh
# 
# Revision History
# 02-24-2005 
# ............ Initial Coding
# 01-10-2006 
# ............ Added ^M at the end of each line to prevent line wrap

# Define a function for processing any errors that have been reported
# so that it doesn't clutter up the errors a little further down the 
# script

process_errors() 
{

# Define useful variables

ERRORS=/tmp/oracle_alert_scanner.tmp
OUTFILE=/tmp/oracle_alert_email
FILE=my_alert_$ORACLE_SID.tracker
LAST=`cat $FILE`

# Create an empty output file to email to DBA's after processing is 
# done if there are any recent errors.

> $OUTFILE

# Start processing the errorfile.

cat $ERRORS | while read ERR
do 
  ERRLIN=`echo $ERR | cut -d: -f1`

  if [ $ERRLIN -gt $LAST ]
  then
    echo $ERR >> $OUTFILE
    echo ^M >> $OUTFILE   #This is a Control-M character
  fi
done

# If $OUTFILE actually ended up having anything written to it, we need
# to email it to the DBA's

if [ -s "$OUTFILE" ]
then

# Create an empty temporary file

  > /tmp/$$.1
  HOSTNAME=`hostname`
  echo "Database $ORACLE_SID on Host $HOSTNAME has reported the " >> /tmp/$$.1
  echo "following errors, please review ASAP" >> /tmp/$$.1
  echo " " >> /tmp/$$.1

  cp $OUTFILE /tmp/$$.2

  cat /tmp/$$.1 /tmp/$$.2 > $OUTFILE

  cat $OUTFILE | mailx -s "Errors detected in Alert Log for $ORACLE_SID" $MAILLIST
fi

# Cleanup temp files before exit.

rm $OUTFILE > /dev/null
rm /tmp/$$.1 > /dev/null 2>&1
rm /tmp/$$.2 > /dev/null 2>&1

}


MAILLIST="dba@mycompany.com"

# Process /etc/oratab for each instance that is defined to autostart.  

cat /etc/oratab | while read LINE
do
  case $LINE in
  \#*) # A comment line in oratab
    ;;
  *)  
     if [ "$LINE" ]         # If this is not a blank line....
     then 
       if [ "`echo $LINE | awk -F: '{print $3}' -`" = "Y" ]
       then
	 ORACLE_SID=`echo $LINE | awk -F: '{print $1}' -`
	 if [ "$ORACLE_SID" = "*" ]
	 then
	   ORACLE_SID=""
         fi
         export ORACLE_SID
	 ORACLE_HOME=`echo $LINE | awk -F: '{print $2}' -`
         export ORACLE_HOME
         PATH=$PATH:$ORACLE_HOME/bin ; export PATH

         cd /oracle/admin/$ORACLE_SID/bdump

# In each oracle admin area, in the alert_log directory, we'll create
# a tracker file to keep track of the last line number processed.
# This is also reset as a part of the oracle_cleanup.sh script that 
# runs once a week.  If it doesn't exist, create it.

	 FILE=my_alert_$ORACLE_SID.tracker
         LOG=alert_$ORACLE_SID.log

# If there is an alert log present (test databases may not have one...)

	 if [ -s "$LOG" ]
	 then

           if [ -f "$FILE" ]
           then 
             LAST=`cat $FILE`
           else
             LAST=0
             echo '0' > $FILE
           fi

# At this point, $LAST contains the number of the last line processed
# last time in the alert log for this SID.
# Now we need to find out how many lines are in the file at the moment

           CURRENT=`cat $LOG | wc -l`

# if CURRENT > LAST, then we have more data in the alert log than the
# last time we checked, so we'll continue processing, otherwise we 
# are done with this iteration of the loop.
	 
           if [ "$CURRENT" -gt "$LAST" ]
           then
             errcnt=`cat $LOG | grep -v ORA-000060 | egrep -c 'ORA-|crash|Error|abort|cannot allocate|not complete|waiting|All online logs needed archiving|Shutting down|ALTER DATABASE CLOSE|alter database close|Starting|FULL|kccrsz|not  valid|terminate|Failed|error|Archival stopped'` 

             if [ errcnt -gt 0 ]
             then
               cat $LOG | grep -v ORA-000060 | egrep -n 'ORA-|crash|Error|abort|cannot allocate|not complete|waiting|All online logs needed archiving|Shutting down|ALTER DATABASE CLOSE|alter database close|Starting|FULL|kccrsz|not  valid|terminate|Failed|error|Archival stopped' > /tmp/oracle_alert_scanner.tmp
               process_errors
               echo $CURRENT > $FILE
             fi
           fi
         fi
       fi
     fi
    ;;
  esac
done

Open in new window


This was originally written against Oracle 9 databases.

Lots of things you could do to tweak it, but it's a possible start.   For 11g, for example, you'd need a different location for your alert logs.

It keeps track of the last line processed in a little temp file and then only checks from there to end of file each time it runs.

I had another script that aged / rotated the alert log each week and as a part of that it rewrote the tracker file with 0 in it.
0
 
LVL 16

Expert Comment

by:Wasim Akram Shaik
ID: 39983821
you could refer to one of sites which has a scenario where in there are multiple databases hosted on a single system.. I hope this would suffice your requirement to monitor multiple databases by querying /etc/oratab. he had done it so via some file

for SID in `cat $ORACLE_HOME/sidlist`

check this link for further details
https://communities.bmc.com/docs/DOC-9942#alertlog
0
 
LVL 1

Author Closing Comment

by:bkreynolds48
ID: 39983828
Steve,

thanks I will give it a try
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Attention: This article will no longer be maintained. If you have any questions, please feel free to mail me. jgh@FreeBSD.org Please see http://www.freebsd.org/doc/en_US.ISO8859-1/articles/freebsd-update-server/ for the updated article. It is avail…
I remember the day when someone asked me to create a user for an application developement. The user should be able to create views and materialized views and, so, I used the following syntax: (CODE) This way, I guessed, I would ensure that use…
This video shows how to recover a database from a user managed backup
This video shows how to copy an entire tablespace from one database to another database using Transportable Tablespace functionality.

710 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question